Match Strategies for the Most Common Attributes
The following sections explain the match strategy for the most commonly used attributes.
Person Related Attributes
First Name
You can apply several tactics to the First Name attribute.
Exact Matching of First Name (and/or Last Name)
- Recommended comparison operator: Exact
- Recommended comparator class: BasicStringComparator
- Recommended token generator class: ExactMatchToken
Fuzzy Matching on First Name
There are at least two tactics to consider for fuzzy matching on the First Name attribute. First, if your objective is to find and match first names or last names that contain misspellings, you can use the DamerauLevenshteinDistance or the DynamicDamerauLevenshteinDistance comparator, coupled with the FuzzyTextAndNumberMatchToken. Second, you can use the DoubleMetaphoneComparator with the DoubleMetaphoneMatchToken, which takes a phonetic approach and automatically accounts for common misspellings. Lastly, if you want to use either of these tactics and also match across common synonyms, add the Name Dictionary Cleanser to your rule.
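To make the misspelling tactic concrete, the following Python sketch computes a restricted Damerau-Levenshtein distance (also known as optimal string alignment), in which an adjacent-character swap counts as a single edit. It is an illustration of the general technique only; the function name osa_distance is hypothetical, and the platform's DamerauLevenshteinDistance comparator may score and threshold differently.

```python
def osa_distance(a: str, b: str) -> int:
    """Restricted Damerau-Levenshtein (optimal string alignment) distance.

    Allowed edits: insertion, deletion, substitution, and transposition
    of two adjacent characters. Illustrative only; not the platform's code.
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all i characters of a
    for j in range(cols):
        d[0][j] = j  # insert all j characters of b

    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Adjacent transposition, for example "nh" versus "hn".
            if (
                i > 1
                and j > 1
                and a[i - 1] == b[j - 2]
                and a[i - 2] == b[j - 1]
            ):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[rows - 1][cols - 1]
```

With this sketch, a typing slip such as "jonh" for "john" is one edit away, which is why an edit-distance comparator can treat the two values as a likely match.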
Full Name
This is a good tactic if you want to avoid matching on first and last names independently. If the full name is not directly available, you can use the profile-level cleanser to form the Full Name as a concatenation of the First and Last names.
Because the combined name naturally introduces more variation into the match process, use the Fuzzy comparison operator and choose the fuzzy comparator and token classes that suit your data.
Phone Number
The following table lists the recommended comparator and token generator classes:
Recommended | Class |
---|---|
Recommended comparator class | PhoneNumberComparator |
Recommended token generator class | PhoneNumberMatchToken |
U.S. Social Security Number and Other Similar Identifier Numbers
The following table lists the recommended comparator and token generator classes:
Recommended | Class |
---|---|
Recommended comparator class | BasicStringComparator or DynamicDamerauLevenshteinDistance |
Recommended token generator class | ExactMatchToken |
Best practice guidance | Use a regular expression to remove special characters from IDs before comparison and tokenization. |
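As a minimal sketch of the regex guidance above, the following Python snippet removes every character that is not a letter or digit from an identifier. The pattern and the function name normalize_id are assumptions made for illustration; how the expression is attached to a cleanser or match rule depends on your platform configuration and is not shown here.

```python
import re

# Assumption: "special characters" means anything other than ASCII
# letters and digits (hyphens, spaces, dots, and so on).
_SPECIAL_CHARS = re.compile(r"[^0-9A-Za-z]")


def normalize_id(raw: str) -> str:
    """Return the identifier with all special characters removed."""
    return _SPECIAL_CHARS.sub("", raw)
```

Under these assumptions, "123-45-6789" and "123 45 6789" both reduce to "123456789", so an exact comparator and ExactMatchToken see the same value.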
Gender
The following table lists the recommended comparison operator, comparator class, and token generator class:
Recommended | Class |
---|---|
Recommended comparison operator | Exact |
Recommended comparator class | BasicStringComparator |
Recommended token generator class | (none, use ignoreInToken to suppress this) |
Best practice guidance | If the attribute is not well populated in your data, consider using ExactOrNull, which allows one or both gender attributes to be <null>. |
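The behavior described for ExactOrNull can be pictured with the short Python sketch below. It models only what the guidance above states (a comparison passes when one or both values are null, otherwise the values must be equal); the function name exact_or_null is hypothetical, and details such as case handling or empty strings on the platform are not covered.

```python
from typing import Optional


def exact_or_null(left: Optional[str], right: Optional[str]) -> bool:
    """Pass when either value is missing; otherwise require equality."""
    if left is None or right is None:
        return True
    return left == right
```

Compared with a plain exact comparison, this keeps sparsely populated attributes from blocking a match that the other attributes in the rule support.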
Suffix
The following table lists the recommended comparison operator, comparator class, and token generator class:
Recommended | Class |
---|---|
Recommended comparison operator | Exact |
Recommended comparator class | (using ignoreInToken to suppress this is recommended) |
Recommended token generator class | ExactMatchToken |
Best practice guidance | If the attribute is not well populated in your data, consider using ExactOrNull, which allows one or both suffix attributes to be <null>. Be sure to clean and standardize values such as Jr, Jr., and Junior to a common value such as jr. |
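The standardization step in the guidance above might look like the following Python sketch. The mapping table and the function name standardize_suffix are illustrative assumptions; only the Jr, Jr., and Junior variants come from the guidance, and the senior entries are added as a hypothetical extension.

```python
from typing import Optional

# Variants mapped to one canonical value. Only the "jr" family is taken
# from the guidance; the "sr" entries are an assumed extension.
SUFFIX_MAP = {
    "jr": "jr",
    "junior": "jr",
    "sr": "sr",
    "senior": "sr",
}


def standardize_suffix(raw: Optional[str]) -> Optional[str]:
    """Trim, drop trailing periods, lowercase, then map known variants."""
    if raw is None:
        return None
    key = raw.strip().rstrip(".").lower()
    if not key:
        return None  # treat blank input as missing
    return SUFFIX_MAP.get(key, key)
```

Running the cleanup before matching means an Exact comparison on Suffix compares canonical values rather than spelling variants.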
Organization Related Attributes
Organization Name
The following table lists the recommended comparison operator, comparator class, and token generator class:
Recommended | Class |
---|---|
Recommended comparison operator | Fuzzy |
Recommended comparator class | OrganizationNamesComparator or DamerauLevenshteinDistance |
Recommended token generator class | OrganizationNameMatchToken |
Tax ID
Use the same strategy as for the U.S. Social Security Number (SSN) and other similar identifiers.
Other Attributes
Address
The following table lists the recommended comparison operator, comparator class and token generator class:
Recommended | Class |
---|---|
Recommended comparison operator | Fuzzy |
Recommended comparator class | AddressLineComparator |
Recommended token generator class | AddressLineMatchToken |
The following table lists the recommended comparison operator, comparator class and token generator class:
Recommended | Class |
---|---|
Recommended comparison operator | Fuzzy or Exact |
Recommended comparator class | BasicStringComparator or DamerauLevenshteinDistance |
Recommended token generator class | ExactNumberMatchToken |
Using Reference Attributes in a Match Rule
Avoid using reference attributes in match rules as much as possible. Heavy use of reference attributes in match rules increases performance overhead for the platform. Whenever possible, denormalize the attributes within an entity instead, which is better for performance.