Suppress token generation using ignoreInToken

Learn about using the ignoreInToken element to suppress token generation for attributes that don't improve match candidate quality or system performance.

What is ignoreInToken?

The ignoreInToken element is used to suppress token generation for selected attributes in a match rule. You use it when you determine that tokens for those attributes would not contribute meaningful value when identifying match candidates — and might actually degrade performance.

Technically, ignoreInToken is optional. However, in practice, it is widely used and should be considered a best practice in most match rule designs.

Why use ignoreInToken?

If you don't explicitly assign a token class to an attribute, Reltio will assign one by default. This ensures token generation — but the system's default choice may produce excessive or low-quality candidates.

To avoid this:

  • If you want tokens generated → explicitly assign a token class.
  • If you don't want tokens generated → list the attribute in ignoreInToken.
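One way to apply this rule of thumb is to audit a match rule and flag every attribute whose token handling was never decided on purpose. The sketch below is a minimal illustration, not a Reltio utility: `classify_token_handling` is a hypothetical helper, the rule is treated as a plain Python dictionary shaped like the JSON on this page, and the SSN and Phone attribute URIs are assumed for the example.

```python
def classify_token_handling(rule, attributes):
    """Group attributes by how the rule treats them for token generation."""
    mappings = rule.get("matchTokenClasses", {}).get("mapping", [])
    mapped = {m["attribute"] for m in mappings}
    ignored = set(rule.get("ignoreInToken", []))

    report = {"explicit": [], "ignored": [], "default": [], "conflict": []}
    for attr in attributes:
        if attr in mapped and attr in ignored:
            report["conflict"].append(attr)   # both set: intent is ambiguous
        elif attr in mapped:
            report["explicit"].append(attr)   # token class chosen on purpose
        elif attr in ignored:
            report["ignored"].append(attr)    # tokens suppressed on purpose
        else:
            report["default"].append(attr)    # left to the system default
    return report


# Hypothetical attribute URIs, assumed for this sketch.
BASE = "configuration/entityTypes/HCP/attributes/"
rule = {
    "matchTokenClasses": {
        "mapping": [
            {"attribute": BASE + "SSN",
             "class": "com.reltio.match.token.ExactMatchToken"}
        ]
    },
    "ignoreInToken": [BASE + "FullName"],
}

report = classify_token_handling(
    rule, [BASE + "FullName", BASE + "SSN", BASE + "Phone"]
)
for bucket, attrs in report.items():
    print(bucket, attrs)
```

Anything reported under `default` (Phone in this sketch) is an attribute for which the system would pick a token class on its own, so it is a candidate for an explicit decision either way.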

When should you use ignoreInToken?

You should consider ignoreInToken in several common scenarios:

When using the notEquals operator

If a rule uses notEquals to require that two records differ on an attribute, generating tokens for that attribute is counterproductive. Tokens pull in profiles that share a value; they don't exclude them. Suppressing tokens for that attribute avoids inefficiency and irrelevant comparisons.
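To see why, consider a deliberately simplified model in which a token is just the exact attribute value and candidates are the profiles that share a token. This is an assumption made for illustration, not a description of how Reltio builds tokens; the `Suffix` attribute and the `candidate_pairs` helper are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def candidate_pairs(profiles, attribute):
    """Pair up profiles that share an exact-value token on `attribute`."""
    buckets = defaultdict(list)
    for profile in profiles:
        buckets[profile[attribute]].append(profile)
    pairs = []
    for bucket in buckets.values():
        pairs.extend(combinations(bucket, 2))
    return pairs


profiles = [
    {"id": 1, "Suffix": "Jr"},
    {"id": 2, "Suffix": "Jr"},
    {"id": 3, "Suffix": "Sr"},
    {"id": 4, "Suffix": "Sr"},
]

pairs = candidate_pairs(profiles, "Suffix")
# A notEquals-style check: keep only pairs whose values differ.
passing = [(a["id"], b["id"]) for a, b in pairs if a["Suffix"] != b["Suffix"]]
print(len(pairs), passing)  # prints: 2 []
```

Every pair retrieved through the shared token has equal values by construction, so none of them can satisfy a not-equal condition. In this model the tokens do work without producing a single useful candidate.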

When token cardinality is too high

Consider a simplified example:

  • You have 10 million consumer profiles from 6 data sources.
  • Attributes include: Full Name, Phone, Address, and SSN.
  • There might be 10,000 people named "John Smith".

If your comparison strategy requires exact SSN or exact Phone, those attributes are highly unique — tokenizing them helps you retrieve only a few likely candidates.

If you instead tokenize Full Name, a single "John Smith" token might retrieve 10,000 candidate profiles, of which only about 6 will pass comparison. That wastes compute, increases the risk of false positives, and may hit token limits.
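The arithmetic behind this example is worth spelling out. The figures below are the illustrative ones from the text (10,000 profiles sharing a name, 6 that pass comparison). The all-pairs count assumes every profile under the same token is compared with every other one, which is a simplification and not a statement about how comparisons are actually scheduled.

```python
from math import comb

name_candidates = 10_000   # profiles retrieved by a "John Smith" name token
passing = 6                # candidates that pass comparison

# From the point of view of one incoming record:
wasted = name_candidates - passing
hit_rate = passing / name_candidates
print(wasted)              # 9994
print(f"{hit_rate:.2%}")   # 0.06%

# Simplifying assumption: all profiles under one token are compared pairwise.
print(comb(name_candidates, 2))  # 49995000
print(comb(passing, 2))          # 15
```

Under these assumptions, fewer than one in a thousand retrieved candidates is useful, which is why a highly unique attribute such as SSN or Phone is the better source of tokens.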

Key insight

Name is useful for comparison, but inefficient for tokenization in large populations. In this case, add it to ignoreInToken:

"rule": {
  "ignoreInToken": [
    "configuration/entityTypes/HCP/attributes/FullName"
  ]
}

When using the DistinctWordsComparator

Normally, if you use the DistinctWordsComparator, you might also use DistinctWordsMatchToken for tokenization. However, that token class often generates a very large number of tokens, which can clutter the system and harm performance.

Best practice is to use ignoreInToken with DistinctWordsComparator unless you've tuned the match token settings carefully and validated their effectiveness.

Important:

Don't automatically exclude attributes from tokenization. Base your decision on:

  • Match rule structure and comparator logic
  • Token variability and collision analysis
  • Careful review of token class behavior (if using DistinctWordsMatchToken)

Removing high-signal attributes from tokenization without validation may result in missed matches or overly broad candidate pools.
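A basic collision analysis can be run on a sample of attribute values before you decide. The sketch below counts how many values land under each token for two stand-in tokenizers: one token per whole value, and one token per word. Both tokenizers are assumptions made for illustration; the per-word version is not the actual behavior of DistinctWordsMatchToken, and the sample names are invented.

```python
from collections import Counter


def bucket_sizes(values, tokenize):
    """Count how many values produce each token."""
    counts = Counter()
    for value in values:
        # Use a set so one value counts at most once per token.
        counts.update(set(tokenize(value)))
    return counts


def exact_tokens(value):
    """Stand-in tokenizer: one token for the whole value."""
    return [value.lower()]


def word_tokens(value):
    """Stand-in tokenizer: one token per word."""
    return value.lower().split()


names = ["John Smith", "John Smith", "Jane Smith", "John Doe", "Maria Garcia"]

exact = bucket_sizes(names, exact_tokens)
words = bucket_sizes(names, word_tokens)

print(sum(exact.values()), max(exact.values()))  # prints: 5 2
print(sum(words.values()), max(words.values()))  # prints: 10 3
```

Even on five sample values, per-word tokens double the number of token entries and enlarge the biggest bucket. Running the same counts on a representative sample of your own data gives a rough sense of whether an attribute is a good tokenization candidate.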

JSON example: ignoreInToken configuration

Here's a complete example showing ignoreInToken used in a match rule:

{
  "exact": [
    "configuration/entityTypes/Contact/attributes/LastName"
  ],
  "comparatorClasses": {
    "mapping": [
      {
        "attribute": "configuration/entityTypes/Contact/attributes/LastName",
        "class": "com.reltio.match.comparator.BasicStringComparator"
      }
    ]
  },
  "matchTokenClasses": {
    "mapping": [
      {
        "attribute": "configuration/entityTypes/Contact/attributes/LastName",
        "class": "com.reltio.match.token.ExactMatchToken"
      }
    ]
  },
  "ignoreInToken": [
    "configuration/entityTypes/Contact/attributes/LastName"
  ]
}

In this rule:

  • The LastName attribute is used for comparison (an exact match using BasicStringComparator).
  • LastName is excluded from token generation because it is listed in ignoreInToken.