Name Dictionary Cleanser
The Name Dictionary Cleanser provides a list of synonyms to be used during matching.
Typically used for the First name field of a person’s profile, the Name Dictionary Cleanser leverages a list of names (sometimes called a synonym dictionary) enabling the match engine to understand that a specific collection of names should be treated as semantically identical. You can configure your match rule to use the in-built dictionary provided by Reltio or use your own dictionary.
How the Name Dictionary Works
Whether you’re using Reltio’s in-built synonym list or a list you have developed on your own, the format of the source file is the same. It is an Excel file where each row defines a base name followed by all of the valid synonyms for the base.
Here is an example of a row from the in-built Reltio Name Dictionary source file. You can download the CSV version of the file.
If you have enabled the Name Dictionary and mapped it to, for example, the First Name attribute, then when the match engine tokenizes that attribute in an entity, it checks whether the name can be found in any row of the RDM Lookup type that you loaded from your source file. If it is found, the engine takes the base name from the row (‘adelaide’ in this example) and places it into the token structure for that entity. As a result, all entities that contain any of the names in that row share the common token “adelaide” and are considered match candidates of each other (assuming the other components of the resulting token phrases also support the match candidate associations).
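The lookup described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Reltio's implementation: the sample rows are hypothetical (they are not copied from the in-built dictionary), the function names `build_lookup` and `name_token` are invented for this sketch, and both the lowercasing and the fallback to the original value for unknown names are assumptions.

```python
import csv
import io

# Hypothetical sample rows: base name first, then its synonyms.
# Illustrative only; not copied from the Reltio dictionary file.
SAMPLE_ROWS = """\
adelaide,addie,adela,heidi
benedict,ben,bennie
"""


def build_lookup(csv_text):
    """Map every name in a row (base and synonyms) to that row's base name."""
    lookup = {}
    for row in csv.reader(io.StringIO(csv_text)):
        names = [cell.strip().lower() for cell in row if cell.strip()]
        if not names:
            continue
        base = names[0]
        for name in names:
            # Assumption: if a name appears in two rows, the first row wins.
            lookup.setdefault(name, base)
    return lookup


def name_token(first_name, lookup):
    """Return the base name if the value is in the dictionary, else the value itself."""
    key = first_name.strip().lower()
    return lookup.get(key, key)


lookup = build_lookup(SAMPLE_ROWS)
for name in ["Adelaide", "Heidi", "Ben", "Zelda"]:
    print(name, "->", name_token(name, lookup))
```

In this sketch, "Adelaide" and "Heidi" both resolve to the token `adelaide`, so two entities carrying those first names would share a token and could become match candidates.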
Using the Reltio Name Dictionary Cleanser
If you wish to use the Reltio Name Dictionary Cleanser, it is available in the console-based Match Rule Editor as shown below.
Alternatively, if you are crafting match rules using a JSON editor, include the following cleanseAdapter JSON within the cleanse element of your match rule configuration.
"cleanse": [
    {
        "cleanseAdapter": "com.reltio.cleanse.impl.NameDictionaryCleanser",
        "mappings": [
            {
                "attribute": "configuration/entityTypes/Individual/attributes/FirstName",
                "cleanseAttribute": "configuration/entityTypes/Individual/attributes/FirstName"
            }
        ]
    }
]
The Reltio in-built dictionary uses an internal list of synonyms specifically applicable to North American first names and is not editable. It was most recently updated in Oct 2020. You can download the CSV version of it as discussed above. If you wish to use an edited version of the list or use your own list, see Using a Custom Name Dictionary.
You can easily test the in-built dictionary by creating a simple suspect match rule that, for simplicity, includes only the FirstName attribute and uses the BasicStringComparator comparator and the ExactMatchToken class. If you are editing your tenant L3, include the cleanseAdapter as shown above and do not include ignoreInToken. Create two records in your tenant (you can do this easily in the Hub UI), one with Benedict as the first name and the other with Ben. (Note that just by creating the two records, they will be cleansed and tokenized.) If the UI shows them as potential matches, then the in-built dictionary is working for you.
Utilizing Updated Versions of the In-built Name Dictionary
The Name Dictionary function was originally created in 2012 and populated at that time with a set of synonyms. In September 2020, an updated list of synonyms was made available, and we anticipate further updates over time. To allow you to choose which synonym list to leverage, a new section called cleanseAdapterParams has been added to the definition of the NameDictionaryCleanser cleanser inside the L3 configuration, as shown below. If you are using the original synonym list from 2012 and see no reason to change, you do not need to do anything to your configuration. If you wish to use a newer version, such as the September 2020 version, then add the new parameter to your Cleanse Adapter.
"cleanseAdapterParams": {
"dictionary": "SynonymFirstNamesNA_2020-09-01",
"keepOriginalValue": "false"
}
The dictionary parameter lets you specify which version of the synonym list the match engine uses. Once you modify and save this configuration to your tenant, you will need to generate a new match table for any tenant you wish updated with the revised synonyms.
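If you generate match rule JSON from a script, the cleanse element can be assembled programmatically. The following is a minimal sketch under stated assumptions: the helper `name_dictionary_cleanse` is hypothetical, and placing cleanseAdapterParams as a sibling of cleanseAdapter and mappings is an assumption based on the description above, so verify the result against your own L3 configuration before using it.

```python
import json


def name_dictionary_cleanse(attribute_uri, dictionary=None, keep_original_value=False):
    """Build a 'cleanse' list for the Name Dictionary Cleanser (hypothetical helper)."""
    entry = {
        "cleanseAdapter": "com.reltio.cleanse.impl.NameDictionaryCleanser",
        "mappings": [
            {"attribute": attribute_uri, "cleanseAttribute": attribute_uri}
        ],
    }
    if dictionary is not None:
        # Assumption: the params block sits next to cleanseAdapter and mappings.
        entry["cleanseAdapterParams"] = {
            "dictionary": dictionary,
            # The documented fragment uses the string forms "true" / "false".
            "keepOriginalValue": "true" if keep_original_value else "false",
        }
    return [entry]


cleanse = name_dictionary_cleanse(
    "configuration/entityTypes/Individual/attributes/FirstName",
    dictionary="SynonymFirstNamesNA_2020-09-01",
)
print(json.dumps({"cleanse": cleanse}, indent=4))
```

Calling the helper without a `dictionary` argument omits cleanseAdapterParams entirely, which corresponds to leaving the configuration unchanged and keeping the default dictionary.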
If you do not specify any dictionary, then the old dictionary is set by default. In the above example, the new dictionary SynonymFirstNamesNA_2020-09-01 is set. The following table explains the new parameters:
| Parameter | Values | Description |
|---|---|---|
| dictionary | SynonymFirstNamesNA_2012-06-01 or SynonymFirstNamesNA_2020-09-01 | The string value that specifies the dictionary to be used by the cleanser. If you do not specify a dictionary, then the old dictionary SynonymFirstNamesNA_2012-06-01 is set by default. |
| keepOriginalValue | true or false | A Boolean value that specifies whether the original value must be used for tokenization and comparison, in addition to the value obtained from the dictionary. |
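The effect of keepOriginalValue can be pictured with a small Python sketch. This is an assumption-laden illustration rather than the engine's actual logic: the `cleansed_values` function and the one-row lookup are hypothetical, and which of Ben or Benedict is the base name in the real dictionary is not stated here.

```python
def cleansed_values(first_name, lookup, keep_original_value=False):
    """Return the values that would feed tokenization and comparison (sketch only)."""
    original = first_name.strip().lower()
    base = lookup.get(original, original)
    values = [base]
    # With keepOriginalValue enabled, the original value is kept in addition
    # to the value obtained from the dictionary.
    if keep_original_value and original != base:
        values.append(original)
    return values


# Hypothetical lookup: every name in a row maps to the row's base name.
lookup = {"benedict": "benedict", "ben": "benedict"}

print(cleansed_values("Ben", lookup, keep_original_value=False))
print(cleansed_values("Ben", lookup, keep_original_value=True))
```

With the flag off, only the dictionary value is returned; with it on, the original name is carried alongside it.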
You can download both the dictionaries by clicking the relevant links: