Match Strategies for the Most Common Attributes

The following sections explain the match strategies for the most commonly used attributes.

First Name

Several tactics can be used to match on the First Name attribute.

Exact Matching on First Name (and/or Last Name)

  • Recommended Comparison Operator: Exact
  • Recommended Comparator class: BasicStringComparator
  • Recommended Token Generator class: ExactMatchToken

Fuzzy Matching on First Name

For fuzzy matching on the First Name attribute, consider the following tactics:

  • To find and match first names or last names that contain misspellings, use the DamerauLevenshteinDistance or the DynamicDamerauLevenshteinDistance comparator, coupled with the FuzzyTextAndNumberMatchToken.
  • Alternatively, use the DoubleMetaphoneComparator and DoubleMetaphoneMatchToken, which take a phonetic approach and account for common misspellings automatically.
  • To apply either of the previous tactics and also match across common synonyms, add the Name Dictionary Cleanser to your rule.
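To illustrate why an edit-distance comparator tolerates misspellings, here is a minimal Python sketch of the optimal string alignment variant of the Damerau-Levenshtein distance. It is not the platform's DamerauLevenshteinDistance implementation; the function names `osa_distance` and `names_match` and the `max_edits` threshold are assumptions made for illustration only.

```python
def osa_distance(a: str, b: str) -> int:
    """Optimal string alignment distance (restricted Damerau-Levenshtein).

    Counts insertions, deletions, substitutions, and transpositions of
    two adjacent characters needed to turn `a` into `b`.
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all characters of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all characters of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Adjacent transposition, e.g. "nh" <-> "hn"
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[rows - 1][cols - 1]


def names_match(a: str, b: str, max_edits: int = 1) -> bool:
    """Hypothetical helper: treat names as matching within `max_edits` edits."""
    return osa_distance(a.strip().lower(), b.strip().lower()) <= max_edits
```

With this sketch, a transposition such as "Jonh" versus "John" counts as a single edit, so the two values match at a threshold of one, while "Robert" and "Bob" do not. Synonyms of that kind are the case the Name Dictionary Cleanser is meant to cover.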

Full Name

Matching on Full Name is a good tactic if you want to avoid matching on first and last names independently. If the full name is not available to you directly, you can use the profile-level cleanser to form the Full Name as a concatenation of the First and Last names.

Because the combined name naturally introduces more variation into the match process than either name alone, you should use the Fuzzy comparison operator and choose fuzzy comparator and token classes as you see fit.
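The concatenation step can be sketched in Python as follows. This only illustrates the idea of deriving a Full Name from First and Last names; it is not the profile-level cleanser itself, and the function name `build_full_name` is hypothetical.

```python
from typing import Optional


def build_full_name(first: Optional[str], last: Optional[str]) -> Optional[str]:
    """Join the non-empty name parts with a single space.

    Returns None when neither part carries a value, so that an empty
    Full Name is not produced for comparison.
    """
    parts = [p.strip() for p in (first, last) if p and p.strip()]
    return " ".join(parts) if parts else None
```

Trimming each part before joining keeps stray whitespace from adding avoidable variation to the combined value.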

Phone Number

The following table lists the recommended comparator and token generator classes:

Table 1. Recommendation
Recommendation | Recommended Class
Recommended comparator class | PhoneNumberComparator
Recommended token generator class | PhoneNumberMatchToken

U.S. Social Security Number and Other Similar Identifier Numbers

The following table lists the recommended comparator and token generator classes:

Table 2. Recommendation
Recommendation | Recommended Class
Recommended comparator class | BasicStringComparator or DynamicDamerauLevenshteinDistance
Recommended token generator class | ExactMatchToken
Best practice guidance | Use a regular expression to remove special characters from IDs before comparison and tokenization.
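As a rough illustration of the regex guidance above, the following Python sketch strips every character that is not an ASCII letter or digit from an identifier. The function name `clean_identifier` is hypothetical, and uppercasing the result is an added assumption that goes beyond the guidance; how the pattern is configured in the platform is not shown here.

```python
import re

# Matches any character that is not an ASCII letter or digit.
_NON_ALNUM = re.compile(r"[^0-9A-Za-z]")


def clean_identifier(raw: str) -> str:
    """Remove separators such as dashes, dots, and spaces from an ID.

    Uppercasing is an assumption made here so that alphanumeric IDs
    compare equal regardless of letter case.
    """
    return _NON_ALNUM.sub("", raw).upper()
```

After cleaning, values such as "123-45-6789" and "123 45 6789" reduce to the same string, so an exact comparator and an exact-match token treat them as identical.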

Gender

The following table lists the recommended comparison operator, comparator class, and token generator class:

Table 3. Recommendation
Recommendation | Recommended Class
Recommended comparison operator | Exact
Recommended comparator class | BasicStringComparator
Recommended token generator class | None (use ignoreInToken to suppress token generation)
Best practice guidance | If the attribute is not well populated in your data, consider using ExactOrNull, which allows one or both gender attributes to be <null>.
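The difference between the two operators can be sketched in Python as below. The semantics are an assumption based only on the description above (ExactOrNull tolerates one or both values being null); the functions `exact` and `exact_or_null` are illustrative and are not the platform's operators. In particular, how Exact treats two null values is assumed here, not documented.

```python
from typing import Optional


def exact(a: Optional[str], b: Optional[str]) -> bool:
    """Assumed Exact semantics: both values present and equal."""
    return a is not None and b is not None and a == b


def exact_or_null(a: Optional[str], b: Optional[str]) -> bool:
    """Assumed ExactOrNull semantics: a missing value on either side
    does not block the match; two present values must be equal."""
    if a is None or b is None:
        return True
    return a == b
```

Under these assumptions, a profile with no gender value can still match a profile whose gender is populated, whereas two populated but different values never match.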

Suffix

The following table lists the recommended comparison operator, comparator class, and token generator class:

Table 4. Recommendation
Recommendation | Recommended Class
Recommended comparison operator | Exact
Recommended comparator class | (Using ignoreInToken to suppress this is recommended)
Recommended token generator class | ExactMatchToken
Best practice guidance | If the attribute is not well populated in your data, consider using ExactOrNull, which allows one or both suffix attributes to be <null>. Be sure to clean and standardize values such as Jr, Jr., and Junior to a common value such as jr.
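The standardization of suffix values can be sketched in Python as follows. The mapping table and the function name `standardize_suffix` are hypothetical; only the Jr, Jr., and Junior variants come from the guidance above, and the Sr and Senior entries are added purely as an example of extending the table.

```python
from typing import Optional

# Hypothetical synonym table; extend it with the variants found in your data.
SUFFIX_SYNONYMS = {
    "jr": "jr",
    "junior": "jr",
    "sr": "sr",      # assumption, not from the guidance
    "senior": "sr",  # assumption, not from the guidance
}


def standardize_suffix(raw: Optional[str]) -> Optional[str]:
    """Map suffix variants to one common lowercase value."""
    if raw is None:
        return None
    key = raw.strip().lower().rstrip(".")  # "Jr." -> "jr"
    if not key:
        return None
    # Unknown suffixes pass through in their normalized form.
    return SUFFIX_SYNONYMS.get(key, key)
```

Once the values are standardized this way, an exact comparison sees "Jr", "Jr.", and "Junior" as the same suffix.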

Organization Name

The following table lists the recommended comparison operator, comparator class, and token generator class:

Table 5. Recommendation
Recommendation | Recommended Class
Recommended comparison operator | Fuzzy
Recommended comparator class | OrganizationNamesComparator or DamerauLevenshteinDistance
Recommended token generator class | OrganizationNameMatchToken

Tax ID

Use the same strategy as for the U.S. Social Security Number (SSN) and other similar identifiers.

Other Attributes

Address

The following table lists the recommended comparison operator, comparator class and token generator class:

Table 6. Recommendation
Recommendation | Recommended Class
Recommended comparison operator | Fuzzy
Recommended comparator class | AddressLineComparator
Recommended token generator class | AddressLineMatchToken

Email

The following table lists the recommended comparison operator, comparator class and token generator class:

Table 7. Recommendation
Recommendation | Recommended Class
Recommended comparison operator | Fuzzy or Exact
Recommended comparator class | BasicStringComparator or DamerauLevenshteinDistance
Recommended token generator class | ExactNumberMatchToken

Using Reference Attributes in a Match Rule

Avoid using reference attributes in match rules as much as possible. Significant use of reference attributes in match rules increases performance overhead for the platform. Whenever possible, denormalize attributes within an entity, which is better for performance.