Additional topics on matching
Learn more about matching.
For a complete understanding of matching at Reltio, familiarize yourself with some additional topics on matching. The following sections give information on those topics.
Tenant-level match strategy
At the tenant level, you can change the match strategy using the Match Strategy parameter. This is different from the incremental strategy that guides the design of your match rules. The tenant-level Match Strategy parameter controls the operational behavior of the match engine as it uses your match rules. The default setting is INCREMENTAL_V2. The possible values are:
NONE
- Matching is disabled. Match and DTSS-related tasks won't be done on any of the configured entity types.
INCREMENTAL
- Any update to an entity causes the entity to be rematched immediately. Merge-on-the-fly is enabled by default at the tenant level.
INCREMENTAL_V2
- Same behavior as INCREMENTAL, except that Merge-on-the-fly is disabled by default at the tenant level.
ON_REQUEST
- This matching strategy prepares the tenant for external matching using the following APIs:
  - External Match Task API. For more information, see topic External match API.
  - POST _matches API. For more information, see topic Search For Potential Matches for Entity Specified in JSON.
  - POST _scoredMatches API. For more information, see topic Search for potential matches for entity specified in JSON with scoring.
When the matching strategy is set to ON_REQUEST, the responses of the GET _matches and GET _transitiveMatches APIs are affected. These APIs usually return the existing potential matches, but under this strategy the responses may be empty or may contain outdated information. The Rebuild Match Table task rebuilds the relevant match tables but won't create, update, or delete potential matches or perform automatic merges based on the existing or relevance-based match rules.
To change the tenant-level Match Strategy parameter, file a support ticket to support@reltio.com with the required change. For example, ask to set the strategy property of matchingConfiguration to ON_REQUEST.
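The merge-on-the-fly defaults described above can be summarized in a small lookup. This is only an illustrative sketch built from the descriptions on this page, not Reltio code: the names MERGE_ON_THE_FLY_DEFAULT and merge_on_the_fly_default are hypothetical, and None marks the strategies for which this page states no merge-on-the-fly default.

```python
from typing import Optional

# Default tenant-level strategy, as stated on this page.
DEFAULT_STRATEGY = "INCREMENTAL_V2"

# Hypothetical summary table: strategy -> merge-on-the-fly default.
# None means this page does not state a default for that strategy.
MERGE_ON_THE_FLY_DEFAULT = {
    "NONE": None,
    "INCREMENTAL": True,
    "INCREMENTAL_V2": False,
    "ON_REQUEST": None,
}


def merge_on_the_fly_default(strategy: str = DEFAULT_STRATEGY) -> Optional[bool]:
    """Return the documented merge-on-the-fly default for a strategy."""
    if strategy not in MERGE_ON_THE_FLY_DEFAULT:
        raise ValueError(f"Unknown match strategy: {strategy!r}")
    return MERGE_ON_THE_FLY_DEFAULT[strategy]
```

A lookup like this can help when reviewing a tenant configuration before filing the support ticket, because it rejects misspelled strategy names early.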
Ignoring diacritical marks
stripDiacritics
- When the stripDiacritics parameter is enabled, diacritical marks in words are ignored. This function provides improved match results with international data sets where diacritical marks on characters are common. Examples of diacritical marks include apostrophe, cedilla, tilde, circumflex, and macron. For example, the string "Praça Dr João Mendes São Paulo" is transformed to "Praca Dr Joao Mendes Sao Paulo".
This parameter is disabled by default. To enable it, contact support.
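To make the transformation concrete, the following minimal sketch reproduces the example above with Python's standard unicodedata module. It is an approximation for illustration only and is not how the platform implements stripDiacritics; the function name strip_diacritics is hypothetical.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Remove combining marks such as the cedilla and tilde from text."""
    # NFD splits each accented character into a base character
    # followed by one or more combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Keep only the characters that are not combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_diacritics("Praça Dr João Mendes São Paulo"))
# prints: Praca Dr Joao Mendes Sao Paulo
```

This sketch only handles marks that Unicode represents as combining characters. Standalone characters such as apostrophes pass through unchanged.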
Matching on non-Latin characters
Comparators perform the matching, and token generators help find candidate profiles. For matching on non-Latin character sets, it is important to know which comparator classes and token generator classes support these sets. For details on each one, see the table of comparators and the table of token generator classes.
Including a source name in a match rule
Sometimes, you may need to restrict pairs of matched profiles to those that come from the same source. For example, you might want only match rule #3 to evaluate records sourced from SAP against other records sourced from SAP.
Using the crosswalk information
This approach is supported directly within the match configuration framework. Add an Exact Comparison Operator, coupled with the Equals Helper Operator, as in the following example:
{
  "uri": "configuration/entityTypes/Individual/matchGroups/Rule3",
  "label": "Rule3",
  "type": "suspect",
  "rule": {
    "and": {
      "exact": [
        "configuration/sources"
      ],
      "equals": [
        {
          "uri": "configuration/sources",
          "value": "SAP"
        }
      ]
    }
  }
}
Because a merged profile can contain multiple crosswalks, this rule finds profiles that have at least one crosswalk from the SAP source. It doesn't mean that the profile has only the SAP crosswalk. A more accurate interpretation of this approach is therefore that only match rule #3 is used to evaluate records that have a contribution from the SAP source.
Using a custom attribute
In this approach, you can ensure that a successful match is performed on records that ONLY have contributions from specific sources. To do this, create an additional attribute within the entity type definition, like recordSource, and augment your integration to POST the name of the source to this attribute. When this record is first posted from the SAP source, the attribute contains SAP. As other records merge into it, it can accumulate additional values, for example, Workday and Oracle. To ensure that your rule only considers records that have contributions from a defined set of sources and no other sources, add the operators Equals and notEquals to your rule, effectively establishing an acceptlist and blocklist of sources for the record.
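The difference between the two approaches can be sketched in a few lines. This is an illustration of the semantics only, under the assumption that a record's sources are available as a plain list of names; the function names are hypothetical and this is not how the match engine evaluates rules.

```python
from typing import Iterable, Set


def has_contribution_from(sources: Iterable[str], source: str) -> bool:
    """Crosswalk-style check: at least one contribution from the source."""
    return source in set(sources)


def passes_source_lists(
    sources: Iterable[str], accept: Set[str], block: Set[str]
) -> bool:
    """Custom-attribute-style check: an accepted source is present
    and no blocked source is present."""
    values = set(sources)
    return bool(values & accept) and not (values & block)


merged_record = ["SAP", "Workday"]  # accumulated recordSource values
print(has_contribution_from(merged_record, "SAP"))                     # True
print(passes_source_lists(merged_record, {"SAP"}, {"Workday", "Oracle"}))  # False
print(passes_source_lists(["SAP"], {"SAP"}, {"Workday", "Oracle"}))        # True
```

The merged record passes the crosswalk-style check because it has an SAP contribution, but fails the acceptlist and blocklist check because it also carries a Workday value.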
Matching on the proximity of locations
In some cases, the location of the entity being matched isn't a postal address. Consider the case of matching oil wells whose location is described by a longitude and latitude. In this case, the best way to use this location information when matching the oil well profiles is to use proximity matching for the location part of the oil well. In the following sample JSON, the two locations are considered the same if they are within 0.1 miles of each other.
{
  "uri": "configuration/entityTypes/Location/matchGroups/ProximityMatch",
  "label": "Proximity match on LatLongs",
  "type": "suspect",
  "useOvOnly": "true",
  "rule": {
    "matchTokenClasses": {
      "mapping": [
        {
          "attribute": "configuration/entityTypes/Location/attributes/LatLong",
          "parameters": [
            {
              "parameter": "distance_miles",
              "value": "0.1"
            }
          ],
          "class": "com.reltio.match.token.ProximateGeoToken"
        }
      ]
    },
    "comparatorClasses": {
      "mapping": [
        {
          "attribute": "configuration/entityTypes/Location/attributes/LatLong",
          "parameters": [
            {
              "parameter": "distance_miles",
              "value": "0.1"
            }
          ],
          "class": "com.reltio.match.comparator.ProximateGeoComparator"
        }
      ]
    },
    "multi": [
      {
        "uri": "configuration/entityTypes/Location/attributes/LatLong",
        "attributes": [
          "configuration/entityTypes/Location/attributes/GeoLocation/attributes/Latitude",
          "configuration/entityTypes/Location/attributes/GeoLocation/attributes/Longitude"
        ]
      }
    ]
  }
}
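The idea behind a distance threshold such as the distance_miles value in the sample can be sketched with the haversine great-circle formula. This is an assumption made for illustration: this page does not say which distance formula ProximateGeoComparator uses, and the function names, the Earth radius constant, and the sample coordinates are all hypothetical.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an assumption of this sketch


def distance_miles(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points, using the haversine formula."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def is_proximate(a: tuple, b: tuple, threshold_miles: float = 0.1) -> bool:
    """True when two (latitude, longitude) pairs are within the threshold."""
    return distance_miles(a[0], a[1], b[0], b[1]) <= threshold_miles


well_a = (29.7604, -95.3698)
well_b = (29.7610, -95.3698)  # about 0.04 miles north of well_a
well_c = (29.7704, -95.3698)  # about 0.7 miles north of well_a
print(is_proximate(well_a, well_b))  # True
print(is_proximate(well_a, well_c))  # False
```

The sample coordinates use a negative longitude on purpose, because western-hemisphere locations are a common case for this kind of rule.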
The distance_miles parameter accepts values from .001 to 10.0.
An earlier version of ProximateGeoToken failed to build match tokens for negative longitudes. Newly created tenants use the fixed version of ProximateGeoToken. For tenants that already use ProximateGeoToken, the old match token class is retained to preserve compatibility. If you have any pending potential matches that were generated by a rule that uses GeoLocation matching, you must run the reindex task to regenerate match tokens. To migrate to the fixed version, contact Reltio Support.
Nested-attribute matching
In nested-attribute matching, nested attributes like Address and Phone are compared as whole values, not by mixing sub-attributes.
For example, consider a match rule with the condition exact (AddressType) and exact (AddressLine1), applied to the following entities:
- Entity 1: Home at "123 Main St" and Office at "987 South St".
- Entity 2: Office at "123 Main St".
Here's how the match rule works:
- Comparison 1: Home "123 Main St" (Entity 1) vs. Office "123 Main St" (Entity 2) – No match (different AddressType).
- Comparison 2: Office "987 South St" (Entity 1) vs. Office "123 Main St" (Entity 2) – No match (different AddressLine1).
So, Entity 1 and Entity 2 do not match.
However, if your match rule has logical operands (for example, and, or) that separate the sub-attributes, cross-permutation happens. This occurs because sub-attributes placed under different logical operands are considered independent. To avoid cross-permutations when you use logical operands, place them at the top of the rule's construction and combine all sub-attributes under them.
How cross-permutations happen | How to avoid cross-permutations |
---|---|
|
|
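The difference between whole-value comparison and cross-permutation can be sketched with the two entities from the example above. This is an illustration of the described behavior only, not the match engine's implementation; the function names and the dictionary layout of the nested values are assumptions.

```python
from itertools import product


def whole_value_match(addresses_a: list, addresses_b: list) -> bool:
    """Both sub-attributes must agree within the same pair of nested values."""
    return any(
        a["AddressType"] == b["AddressType"]
        and a["AddressLine1"] == b["AddressLine1"]
        for a, b in product(addresses_a, addresses_b)
    )


def cross_permutation_match(addresses_a: list, addresses_b: list) -> bool:
    """Each sub-attribute is evaluated independently, so the agreeing
    values may come from different nested values."""
    pairs = list(product(addresses_a, addresses_b))
    type_match = any(a["AddressType"] == b["AddressType"] for a, b in pairs)
    line_match = any(a["AddressLine1"] == b["AddressLine1"] for a, b in pairs)
    return type_match and line_match


entity_1 = [
    {"AddressType": "Home", "AddressLine1": "123 Main St"},
    {"AddressType": "Office", "AddressLine1": "987 South St"},
]
entity_2 = [{"AddressType": "Office", "AddressLine1": "123 Main St"}]

print(whole_value_match(entity_1, entity_2))        # False
print(cross_permutation_match(entity_1, entity_2))  # True
```

In the cross-permutation case, the AddressType agreement comes from the Office address and the AddressLine1 agreement comes from the Home address, which is why the entities match even though no single address pair agrees on both.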
Cross-attribute matching
Cross-attribute matching compares the combined values from multiple attributes of a profile against those from another profile. Although you can use it for any collection of attributes you select, it is most often useful for the First and Family names of a person’s profile, as shown in this example for profiles A and B.
Profile A:
First name = “John”; Family name = “Smith”; Address Line1 = “123 Main St”, and so on
Profile B:
First name = “Smith”; Family name = “John”; Address Line1 = “123 Main St”, and so on
To set up cross-attribute matching for these attributes:
- Create a virtual attribute, like MultiGroup1, defined as the combination of First and Family Name.
- Define a token class for the virtual attribute that generates tokens appropriate for the type of data.
- Choose a comparator appropriate for the type of data being compared.
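The effect of comparing combined values can be sketched with profiles A and B. This is a simplified illustration, not the platform's token class or comparator logic: the attribute keys FirstName and FamilyName, the case-insensitive comparison, and the function names are all assumptions.

```python
def combined_values(profile: dict, attributes: tuple) -> list:
    """Collect the values of the grouped attributes, ignoring which
    attribute each value came from."""
    return sorted(profile[name].strip().casefold() for name in attributes)


def cross_attribute_match(
    profile_a: dict, profile_b: dict, attributes: tuple = ("FirstName", "FamilyName")
) -> bool:
    """True when both profiles carry the same combined set of values."""
    return combined_values(profile_a, attributes) == combined_values(profile_b, attributes)


profile_a = {"FirstName": "John", "FamilyName": "Smith"}
profile_b = {"FirstName": "Smith", "FamilyName": "John"}

# Attribute-by-attribute comparison misses the swapped names.
print(profile_a["FirstName"] == profile_b["FirstName"])  # False
# Comparing the combined values finds them.
print(cross_attribute_match(profile_a, profile_b))       # True
```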
Default matching values in tenant physical configuration
By default, the value for resolveLookupStrategy is LOOKUP_CODE if it isn't specified in your matching configuration. This ensures that the lookup attribute values used for calculating the survivorship values are filtered correctly. The other values for resolveLookupStrategy are LOOKUP_VALUE and NONE. If the values are LOOKUP_CODE or LOOKUP_VALUE, then entities aren't merged; if the value is NONE, then entities aren't merged.
{
  "type": "PLATFORM",
  "tenantId": "johndoe",
  "tenantName": "johndoe",
  "customerName": "Reltio",
  ...
  "matchingConfiguration": {
    "strategy": "INCREMENTAL_V2",
    "resolveLookupStrategy": "LOOKUP_CODE",
    "generateMatchTokensMapping": false,
    "generateTokensForExactOrAllNull": false,
    "generateSuspectByNegativeRules": true,
    "stripDiacritics": false,
    "hashTokens": true,
    "tokenCollisionsLimit": 300
  }
  ...
}
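When reviewing a tenant configuration, it can help to see which values apply when a property is omitted. The following sketch fills in only the two defaults that this page states explicitly (strategy and resolveLookupStrategy). The DOCUMENTED_DEFAULTS table and the function name are hypothetical, and other properties are left out because this page does not state their defaults.

```python
# Defaults stated on this page; other properties are intentionally omitted.
DOCUMENTED_DEFAULTS = {
    "strategy": "INCREMENTAL_V2",
    "resolveLookupStrategy": "LOOKUP_CODE",
}


def effective_matching_configuration(tenant_config: dict) -> dict:
    """Merge explicitly configured values over the documented defaults."""
    configured = tenant_config.get("matchingConfiguration", {})
    return {**DOCUMENTED_DEFAULTS, **configured}


tenant = {"tenantId": "johndoe", "matchingConfiguration": {"stripDiacritics": True}}
print(effective_matching_configuration(tenant))
```

Explicit values always win over the defaults, so a tenant that sets resolveLookupStrategy to NONE keeps that setting.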