Additional Topics on Matching
To understand matching at Reltio completely, familiarize yourself with the additional matching topics covered in the following sections.
Tenant-Level Match Strategy
At the tenant level, a parameter sets what is called the Match Strategy of the tenant. Not to be confused with the incremental strategy that guides the design of your match rules, the tenant-level Match Strategy parameter controls the operational behavior of the match engine as it exercises your match rules. It can be set to one of the following values, each explained below. The default setting is INCREMENTAL_V2.
NONE - Matching is disabled. Match and DTSS related tasks will not occur on any of the configured entity types.
INCREMENTAL - Any update to an entity causes the entity to be rematched on the fly. Merge-on-the-fly is enabled by default at the tenant level.
INCREMENTAL_V2 - Same behavior as INCREMENTAL, except that merge-on-the-fly is disabled by default at the tenant level.
ON_REQUEST - Matching occurs only when invoked as a batch job.
To change the tenant-level Match Strategy parameter, you must file a support ticket with support@reltio.com stating the requested change. For example: "Please set matchingConfiguration strategy to ON_REQUEST."
Ignoring Diacritical Marks
stripDiacritics - When the stripDiacritics parameter is enabled, diacritical marks in words are ignored. This improves match results for international data sets in which diacritical marks on characters are common. Examples of diacritical marks include the apostrophe, cedilla, tilde, circumflex, and macron. For example, the string "Praça Dr João Mendes São Paulo" is transformed to "Praca Dr Joao Mendes Sao Paulo".
This parameter is disabled by default. Contact support if you want it enabled.
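The transformation described above can be pictured with a small Python sketch. This is only an illustration of diacritic stripping using Unicode decomposition, not Reltio's implementation; the function name strip_diacritics is hypothetical.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Illustrative only: drop combining marks after Unicode decomposition."""
    # NFD splits a character such as "ç" into "c" plus a combining cedilla.
    decomposed = unicodedata.normalize("NFD", text)
    # Keep every character that is not a combining mark.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_diacritics("Praça Dr João Mendes São Paulo"))
```

With the example string from this section, the sketch yields the same result as the documented transformation.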
Matching on non-Latin Characters
Comparators perform the matching, while token generators help find candidate profiles. For matching on non-Latin character sets, it is important to know which comparator classes and token generator classes support these sets. For details on each one, see the table of comparators and the table of token generator classes.
Including a Source Name in a Match Rule
Sometimes it is desirable to restrict the pairs of profiles being matched to those that come from the same source. For example: "I only want match rule #3 to compare records sourced from SAP against other records sourced from SAP." There are two ways to accomplish this.
Using the Crosswalk Information
This approach is supported directly within the match configuration framework. You can simply add an Exact Comparison Operator, coupled with the Equals Helper Operator. See the example below:
{
  "uri": "configuration/entityTypes/Individual/matchGroups/Rule3",
  "label": "Rule3",
  "type": "suspect",
  "rule": {
    "and": {
      "exact": [
        "configuration/sources"
      ],
      "equals": [
        {
          "uri": "configuration/sources",
          "value": "SAP"
        }
      ]
    }
  }
}
However, be aware that a profile that is already merged can contain multiple crosswalks, so this rule finds profiles that have at least one crosswalk from the SAP source. It does not mean that the profile has only the SAP crosswalk. A more accurate interpretation of this approach is therefore: "I only want match rule #3 to evaluate records that have a contribution from the SAP source."
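The difference between "has a contribution from SAP" and "comes only from SAP" can be pictured with a small Python sketch. It is illustrative only and does not call any Reltio API; the function names and the representation of a profile as a list of crosswalk source names are assumptions made for the example.

```python
def has_contribution_from(crosswalk_sources, source):
    """True if at least one crosswalk comes from the given source."""
    return source in crosswalk_sources


def comes_only_from(crosswalk_sources, source):
    """True only if every crosswalk comes from the given source."""
    return len(crosswalk_sources) > 0 and set(crosswalk_sources) == {source}


merged_profile = ["SAP", "Workday"]  # hypothetical merged profile
print(has_contribution_from(merged_profile, "SAP"))  # the crosswalk rule's view
print(comes_only_from(merged_profile, "SAP"))        # the stricter reading
```

The crosswalk-based rule corresponds to the first check, which is why a merged profile with SAP and Workday crosswalks still qualifies.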
Using a Custom Attribute
In this approach, you can ensure that a successful match is performed on records that have contributions ONLY from specific sources. To do this, you create an additional attribute within the entity type definition, perhaps called recordSource, and augment your integration to POST the name of the source to this attribute. When a record is first posted from the SAP source, the attribute contains SAP. Of course, as other records merge into it, it might accumulate additional values, perhaps Workday and Oracle. Thus, if you wish to ensure that your rule considers only records that have contributions from a defined set of sources and no other sources, you could add the operators Equals and notEquals to your rule, effectively establishing a white list and black list of sources for the record.
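The white list and black list idea can be sketched in Python as follows. This is a conceptual illustration, not Reltio's evaluation logic; the function name and the assumption that recordSource is available as a simple list of values are hypothetical.

```python
def passes_source_filter(record_sources, allowed, denied):
    """Illustrative only: accept a record if it carries at least one allowed
    source value and none of the denied ones."""
    values = set(record_sources)
    return bool(values & set(allowed)) and not (values & set(denied))


# Hypothetical recordSource values accumulated through merges.
print(passes_source_filter(["SAP"], allowed=["SAP"], denied=["Workday", "Oracle"]))
print(passes_source_filter(["SAP", "Oracle"], allowed=["SAP"], denied=["Workday", "Oracle"]))
```

In this sketch a record that has picked up an Oracle value through a merge is rejected, which mirrors the intent of pairing the two operators.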
Matching on the Proximity of Locations
In some cases, the location of the entity being matched is not a postal address. Consider matching oil wells whose location is described by a longitude and latitude. In this case, the best way to use the location information when matching the oil well profiles is to apply proximity matching to the location part of the profile. See the JSON below. In this example, two locations are considered the same if they are within 0.1 miles of each other.
{
  "uri": "configuration/entityTypes/Location/matchGroups/ProximityMatch",
  "label": "Proximity match on LatLongs",
  "type": "suspect",
  "useOvOnly": "true",
  "rule": {
    "matchTokenClasses": {
      "mapping": [
        {
          "attribute": "configuration/entityTypes/Location/attributes/LatLong",
          "parameters": [
            {
              "parameter": "distance_miles",
              "value": "0.1"
            }
          ],
          "class": "com.reltio.match.token.ProximateGeoToken"
        }
      ]
    },
    "comparatorClasses": {
      "mapping": [
        {
          "attribute": "configuration/entityTypes/Location/attributes/LatLong",
          "parameters": [
            {
              "parameter": "distance_miles",
              "value": "0.1"
            }
          ],
          "class": "com.reltio.match.comparator.ProximateGeoComparator"
        }
      ]
    },
    "multi": [
      {
        "uri": "configuration/entityTypes/Location/attributes/LatLong",
        "attributes": [
          "configuration/entityTypes/Location/attributes/GeoLocation/attributes/Latitude",
          "configuration/entityTypes/Location/attributes/GeoLocation/attributes/Longitude"
        ]
      }
    ]
  }
}
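To make the idea of a distance threshold concrete, here is a minimal Python sketch that compares two latitude/longitude pairs using the haversine great-circle formula. It is an assumption for illustration only: this section does not describe how ProximateGeoComparator computes distance, and the function names and Earth radius constant are not taken from Reltio.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, an approximation


def distance_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in miles (haversine)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def within(lat1, lon1, lat2, lon2, threshold_miles):
    """True if the two points are no farther apart than the threshold."""
    return distance_miles(lat1, lon1, lat2, lon2) <= threshold_miles


# Two hypothetical well locations about 0.07 miles apart.
print(within(29.7604, -95.3698, 29.7614, -95.3698, threshold_miles=0.1))
```

The threshold argument plays the role that the distance_miles parameter plays in the rule above.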
The distance_miles value can range from .001 to 10.0. An earlier version of ProximateGeoToken failed to build match tokens for negative longitudes. Newly created tenants use the fixed version of ProximateGeoToken. For tenants that already use ProximateGeoToken, the old match token class is retained to preserve compatibility. If you have any pending potential matches that were generated by a rule that uses GeoLocation matching, you must run the reindex task to regenerate match tokens. To migrate to the fixed version, contact Reltio Support.
Cross-Attribute Matching
Cross-attribute matching compares the combined values from multiple attributes of one profile against those from another profile. Why is this important? Although you can use it for any collection of attributes you select, it is most often useful for the first and last name of a person’s profile, where the values are sometimes swapped, as shown in this example for profiles A and B.
Profile A: First name = “John”; Last name = “Smith”; Address Line1 = “123 Main St”, and so on
Profile B: First name = “Smith”; Last name = “John”; Address Line1 = “123 Main St”, and so on
To match profiles that have this characteristic, you construct a virtual attribute that is defined as the combination of the actual attributes that you believe may contain each other’s values. In this case, you would create a virtual attribute, perhaps MultiGroup1, defined as the combination of First and Last name. The next step is to define a token class for the virtual attribute that generates tokens appropriate for the type of data. As a result, the tokens generated for MultiGroup1 are the same for both profiles. Lastly, choose a comparator appropriate for the type of data being compared.
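The reason both profiles end up with the same tokens can be shown with a short Python sketch. It only illustrates the idea of an order-insensitive combination of attribute values; it is not how Reltio's token classes work internally, and the function name cross_attribute_token is hypothetical.

```python
def cross_attribute_token(*values):
    """Illustrative only: combine attribute values so their order is irrelevant."""
    # Normalize case and whitespace, drop empty values, then sort.
    parts = sorted(v.strip().lower() for v in values if v and v.strip())
    return "|".join(parts)


profile_a = {"FirstName": "John", "LastName": "Smith"}
profile_b = {"FirstName": "Smith", "LastName": "John"}

token_a = cross_attribute_token(profile_a["FirstName"], profile_a["LastName"])
token_b = cross_attribute_token(profile_b["FirstName"], profile_b["LastName"])
print(token_a == token_b)
```

Because the values are sorted before being joined, swapping first and last name does not change the resulting token, so the two profiles become candidates for comparison.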