Suppress token generation using ignoreInToken
Learn about using the ignoreInToken element to suppress token generation for attributes that don't improve match candidate quality or system performance.
What is ignoreInToken?
The ignoreInToken element is used to suppress token generation for selected attributes in a match rule. You use it when you determine that tokens for those attributes would not contribute meaningful value when identifying match candidates — and might actually degrade performance.
Technically, ignoreInToken is optional. However, in practice, it is widely used and should be considered a best practice in most match rule designs.
Why use ignoreInToken?
If you don't explicitly assign a token class to an attribute, Reltio will assign one by default. This ensures token generation — but the system's default choice may produce excessive or low-quality candidates.
To avoid this:
- If you want tokens generated → explicitly assign a token class.
- If you don't want tokens generated → list the attribute in ignoreInToken.
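As a rough sketch of those two choices side by side, the fragment below assigns an explicit token class to one attribute and suppresses tokens for another. It reuses the ExactMatchToken class name and the fragment layout from the JSON example later on this page. The Email attribute is hypothetical, the surrounding rule structure is trimmed, and the layout should be checked against your own tenant configuration before use.

```json
{
  "exact": [
    "configuration/entityTypes/Contact/attributes/Email",
    "configuration/entityTypes/Contact/attributes/LastName"
  ],
  "matchTokenClasses": {
    "mapping": [
      {
        "attribute": "configuration/entityTypes/Contact/attributes/Email",
        "class": "com.reltio.match.token.ExactMatchToken"
      }
    ]
  },
  "ignoreInToken": [
    "configuration/entityTypes/Contact/attributes/LastName"
  ]
}
```

Under these assumptions, Email is the attribute that produces tokens for candidate retrieval, while LastName is still compared but generates no tokens.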
When should you use ignoreInToken?
Use ignoreInToken in several common scenarios:
- When using the notEquals operator
If you want to compare records that do not have a specific value, it's counterproductive to generate tokens for that value. Tokens aim to pull in similar profiles, not exclude them. In this case, suppressing tokens avoids inefficiency and irrelevant comparisons.
- When token cardinality is too high
Consider a simplified example:
- You have 10 million consumer profiles from 6 data sources.
- Attributes include: Full Name, Phone, Address, and SSN.
- There might be 10,000 people named "John Smith".
If your comparison strategy requires exact SSN or exact Phone, those attributes are highly unique — tokenizing them helps you retrieve only a few likely candidates.
If you instead tokenize Full Name, you might get 10,000 profiles, but only 6 will pass comparison. That wastes compute, increases false positives, and may hit token limits.
Key insight: Name is useful for comparison, but inefficient for tokenization in large populations. In this case, add it to ignoreInToken.
"rule": {
  "ignoreInToken": [
    "configuration/entityTypes/HCP/attributes/FullName"
  ]
}
- When using the DistinctWordsComparator
Normally, if you use the DistinctWordsComparator, you might also use DistinctWordsMatchToken for tokenization. However, that token class often generates a very large number of tokens, which can clutter the system and harm performance.
Best practice is to use ignoreInToken with DistinctWordsComparator unless you've tuned the match token settings carefully and validated their effectiveness.
Important: Don't automatically exclude attributes from tokenization. Base your decision on:
- Match rule structure and comparator logic
- Token variability and collision analysis
- Careful review of token class behavior (if using DistinctWordsMatchToken)
Removing high-signal attributes from tokenization without validation may result in missed matches or overly broad candidate pools.
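The DistinctWordsComparator scenario described above might look roughly like the fragment below. This is a sketch under stated assumptions, not a verified configuration: the fully qualified comparator class name is inferred by analogy with the com.reltio.match.comparator package used elsewhere on this page, the Organization Name attribute is hypothetical, and the operand that references the attribute is omitted.

```json
{
  "comparatorClasses": {
    "mapping": [
      {
        "attribute": "configuration/entityTypes/Organization/attributes/Name",
        "class": "com.reltio.match.comparator.DistinctWordsComparator"
      }
    ]
  },
  "ignoreInToken": [
    "configuration/entityTypes/Organization/attributes/Name"
  ]
}
```

The intent is that the attribute is still compared word by word, but no DistinctWordsMatchToken tokens are generated for it.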
JSON example: ignoreInToken configuration
The following example shows ignoreInToken used in a match rule:
{
  "exact": [
    "configuration/entityTypes/Contact/attributes/LastName"
  ],
  "comparatorClasses": {
    "mapping": [
      {
        "attribute": "configuration/entityTypes/Contact/attributes/LastName",
        "class": "com.reltio.match.comparator.BasicStringComparator"
      }
    ]
  },
  "matchTokenClasses": {
    "mapping": [
      {
        "attribute": "configuration/entityTypes/Contact/attributes/LastName",
        "class": "com.reltio.match.token.ExactMatchToken"
      }
    ]
  },
  "ignoreInToken": [
    "configuration/entityTypes/Contact/attributes/LastName"
  ]
}
In this rule:
- The LastName attribute is used for comparison.
- It is excluded from token generation using ignoreInToken.