Enter Duplicate Check Rule

Explanation

When you enter a new business lead record, the same record may already exist in the database. In such cases the new record can be treated as a duplicate. The duplicate check is a process that runs automatically in the background and identifies similar records that exist in the database. Use this activity to add rules for the duplicate search. Some rules are delivered by default.

A rule can be configured as either a single hit or a combined hit.

Single Hit: Only one column in the rule needs to be satisfied to mark a record as a duplicate.

Combined Hit: All the columns in the rule need to be satisfied to mark a record as a duplicate.
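
The difference between the two hit types can be sketched as follows. This is only an illustration: the function name is_duplicate and the column names are hypothetical, and the per-column results are assumed to be already available as true/false values.

```python
def is_duplicate(column_results, combined_hit):
    """Decide whether a record counts as a duplicate under one rule.

    column_results maps each column in the rule to True when that
    column's comparison was satisfied (hypothetical input shape).
    """
    results = list(column_results.values())
    if not results:
        # Assumption: a rule without columns never marks a duplicate.
        return False
    if combined_hit:
        # Combined hit: every column in the rule must be satisfied.
        return all(results)
    # Single hit: one satisfied column is enough.
    return any(results)


outcome = {"name": True, "phone": False}  # hypothetical columns
print(is_duplicate(outcome, combined_hit=False))
print(is_duplicate(outcome, combined_hit=True))
```

With only the name column satisfied, the record is a duplicate under a single hit rule but not under a combined hit rule.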

Algorithms:

Different algorithms can be defined for different columns in a rule when searching for duplicates. The algorithms are described below.

Exact:
Compares two values and returns either 0 or 1.

Exact for all characters:

Converts all characters to lower case, compares the two values and returns either 0 or 1.

Exact for numbers:

Removes all non-digit characters (for example, from a telephone number), compares the two values and returns either 0 or 1.
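
A minimal sketch of the three Exact variants is shown below. The function names are hypothetical, and it is assumed that 1 means the values match and 0 means they do not; the actual comparison is performed inside the application.

```python
import re


def exact(a, b):
    # Assumption: 1 signals a match, 0 signals no match.
    return 1 if a == b else 0


def exact_all_characters(a, b):
    # Lower-case both values before comparing.
    return exact(a.lower(), b.lower())


def exact_numbers(a, b):
    # Drop everything that is not a digit 0-9 before comparing.
    def digits(s):
        return re.sub(r"[^0-9]", "", s)

    return exact(digits(a), digits(b))


print(exact("Michael", "michael"))
print(exact_all_characters("Michael", "michael"))
print(exact_numbers("555-1234", "(555) 1234"))
```

The plain comparison treats "Michael" and "michael" as different, while the other two calls treat their inputs as equal after normalization.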

Distance:
Uses the Oracle function UTL_MATCH.EDIT_DISTANCE. This compares two values and returns the distance between them. The distance quantifies how dissimilar two strings are by counting the minimum number of operations required to transform one string into the other.

An operation can be an insertion, deletion or substitution.

Examples:

“Michael” vs “Michae” results in 1 (an “l” has to be inserted at the end)

“Michael” vs “Michaell” results in 1 (an “l” has to be deleted at the end)

“Michael” vs “Nichael” results in 1 (the “M” at the beginning has to be substituted with an “N”)

Distance for all characters:

Keeps the strings as they are (they are not converted to lower case before comparison) and returns the distance between the two values.

Distance for numbers:

Removes all non-digit characters and returns the distance between the two values.
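
The distance calculation can be illustrated with a standard Levenshtein implementation. This is a sketch for understanding the numbers in the examples above, not the code behind UTL_MATCH.EDIT_DISTANCE; the function names are hypothetical.

```python
import re


def edit_distance(a, b):
    """Minimum number of insertions, deletions and substitutions
    needed to turn string a into string b (Levenshtein distance)."""
    prev = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ca in enumerate(a, 1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def distance_all_characters(a, b):
    # Strings are compared as they are, so case differences count.
    return edit_distance(a, b)


def distance_numbers(a, b):
    # Only the digits 0-9 take part in the comparison.
    def digits(s):
        return re.sub(r"[^0-9]", "", s)

    return edit_distance(digits(a), digits(b))


print(distance_all_characters("Michael", "Michae"))
print(distance_all_characters("Michael", "michael"))
print(distance_numbers("555-1234", "555 1235"))
```

Each call prints 1: one missing letter, one case difference (because no lower-casing is applied), and one differing digit once the separators are removed.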

Fuzzy:
Uses the Oracle function UTL_MATCH.JARO_WINKLER_SIMILARITY. This measures the agreement between two strings and returns a score between 0 (no match) and 100 (perfect match).

Fuzzy for all characters:

Converts all characters to lower case, compares the two values and returns a number between 0 and 100.

Fuzzy for numbers:

Removes all non-digit characters, compares the two values and returns a number between 0 and 100.
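
To give a feel for how a 0 to 100 score arises, the sketch below implements the textbook Jaro-Winkler similarity. It is not Oracle's implementation: the prefix weight of 0.1, the maximum prefix length of 4, the rounding to the nearest integer and all function names are assumptions, so individual scores may differ slightly from UTL_MATCH.JARO_WINKLER_SIMILARITY.

```python
import re


def jaro(s1, s2):
    """Textbook Jaro similarity, between 0.0 and 1.0."""
    if s1 == s2:
        return 1.0
    if not s1 or not s2:
        return 0.0
    window = max(max(len(s1), len(s2)) // 2 - 1, 0)
    used1, used2 = [False] * len(s1), [False] * len(s2)
    matches = 0
    for i, ch in enumerate(s1):
        # Look for an unused equal character within the match window.
        for j in range(max(0, i - window), min(len(s2), i + window + 1)):
            if not used2[j] and s2[j] == ch:
                used1[i] = used2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = out_of_order = 0
    for i in range(len(s1)):
        if used1[i]:
            while not used2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order // 2  # transpositions
    return (matches / len(s1) + matches / len(s2)
            + (matches - t) / matches) / 3


def fuzzy(s1, s2):
    """Jaro-Winkler score scaled to 0..100 (rounding is an assumption)."""
    j = jaro(s1, s2)
    prefix = 0
    for c1, c2 in zip(s1[:4], s2[:4]):  # assumed maximum prefix of 4
        if c1 != c2:
            break
        prefix += 1
    return int(round(100 * (j + prefix * 0.1 * (1 - j))))


def fuzzy_all_characters(a, b):
    return fuzzy(a.lower(), b.lower())


def fuzzy_numbers(a, b):
    return fuzzy(re.sub(r"[^0-9]", "", a), re.sub(r"[^0-9]", "", b))


print(fuzzy_all_characters("Martha", "MARHTA"))
print(fuzzy_numbers("555-1234", "(555) 1234"))
```

Values that share a long common beginning score higher than values that differ early, which is why this kind of comparison suits names with small typing errors.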

Prerequisites

There are no prerequisites for this activity.

System Effects

N/A