When you enter a new lead or supplier request record, the same record may already exist in the database. In that case the new record can be treated as a duplicate. The duplicate check is a process that runs automatically in the background and identifies similar records that exist in the database. Use this activity to add rules for the duplicate search. Some rules are delivered by default.
A rule can be configured either as a single hit or as a combined hit.
Algorithms:
Different types of algorithms can be defined for different columns in a rule to search for duplicates. The available algorithms are described below:
Exact:
Compares two values and returns either 0 or 1.
Distance:
Uses the Oracle function UTL_MATCH.EDIT_DISTANCE. This compares two values and returns the distance between them. The distance quantifies how dissimilar two strings are by counting the minimum number of operations required to transform one string into the other. An operation can be an insertion, a deletion or a substitution.
Examples:
“Michael” vs “Michae” results in 1 (insert an “l” at the end)
“Michael” vs “Michaell” results in 1 (delete an “l” at the end)
“Michael” vs “Nichael” results in 1 (substitute the “M” at the beginning with an “N”)
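The counting described above can be illustrated with a minimal Python sketch of the classic edit-distance (Levenshtein) calculation. It is only an illustration of the idea, not the Oracle implementation behind UTL_MATCH.EDIT_DISTANCE; the function name edit_distance is hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Minimum number of insertions, deletions and substitutions
    needed to turn string a into string b (illustrative sketch only)."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into an empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute ca with cb (free if equal)
            ))
        prev = curr
    return prev[-1]


# The three examples from the text each need exactly one operation.
print(edit_distance("Michael", "Michae"))
print(edit_distance("Michael", "Michaell"))
print(edit_distance("Michael", "Nichael"))
```

A lower value means the two strings are more alike, and 0 means they are identical.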
Fuzzy:
Uses the Oracle function UTL_MATCH.JARO_WINKLER_SIMILARITY. This calculates a measure of agreement between two strings and returns a score between 0 (no match) and 100 (perfect match).
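To give a feel for how such a score is derived, the following Python sketch implements the textbook Jaro-Winkler measure scaled to 0-100. It is an assumption-based illustration: the function names, the prefix weight of 0.1, the prefix limit of 4 characters and the rounding are the commonly published defaults, and Oracle's UTL_MATCH.JARO_WINKLER_SIMILARITY may differ in details such as rounding.

```python
def jaro(s1: str, s2: str) -> float:
    """Textbook Jaro similarity in the range 0.0 to 1.0 (sketch)."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters match only if they are equal and not too far apart.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order // 2
    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3


def jaro_winkler_similarity(s1: str, s2: str, prefix_weight: float = 0.1) -> int:
    """Jaro-Winkler score scaled to 0 (no match) .. 100 (perfect match)."""
    base = jaro(s1, s2)
    # Boost the score for a shared prefix of up to four characters.
    prefix = 0
    for c1, c2 in zip(s1[:4], s2[:4]):
        if c1 != c2:
            break
        prefix += 1
    return round((base + prefix * prefix_weight * (1 - base)) * 100)


print(jaro_winkler_similarity("Michael", "Michael"))
print(jaro_winkler_similarity("Michael", "Michae"))
print(jaro_winkler_similarity("abc", "xyz"))
```

Unlike the distance algorithm, a higher value here means the strings are more alike, and strings that share their first characters score higher than strings that differ at the start.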
There are no prerequisites for this activity.
N/A