
Matching on Data in Multiple Languages

Learn about the languages Reltio Match Engine supports for matching data.

Reltio Match Engine supports matching data in multiple languages by using transliteration, not translation. Transliteration converts characters from one writing system to another (for example, from Chinese or Arabic scripts into the Latin alphabet) so the Match Engine can compare values consistently across languages.

Before Reltio can match data from non-Latin languages, it transliterates the characters into a script it recognizes. Transliteration maps written characters from one alphabet to another, but it doesn't translate the meaning or pronunciation of words. Using a standard mapping system gives data from the source language a consistent representation in the target script without changing its meaning.
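As a rough illustration of the difference between transliteration and translation, the following Python sketch maps characters through a small hand-written table. The table, the function names, and the sample values are assumptions made for this example only; they are not part of the Reltio Match Engine, which relies on a full standard mapping rather than a six-letter table.

```python
# Toy Cyrillic-to-Latin table (hypothetical, for illustration only).
# A real transliterator uses a complete standard mapping.
CYRILLIC_TO_LATIN = {
    "а": "a", "в": "v", "к": "k", "м": "m", "о": "o", "с": "s",
}


def transliterate(text: str) -> str:
    """Map each character through the table; leave unknown characters unchanged."""
    out = []
    for ch in text:
        latin = CYRILLIC_TO_LATIN.get(ch.lower(), ch)
        # Preserve the case of the source character.
        out.append(latin.capitalize() if ch.isupper() else latin)
    return "".join(out)


def exact_match(a: str, b: str) -> bool:
    """Compare two values after transliterating both into Latin characters."""
    return transliterate(a).casefold() == transliterate(b).casefold()


if __name__ == "__main__":
    print(transliterate("Москва"))          # character-by-character mapping
    print(exact_match("Москва", "Moskva"))  # same name, different script
    print(exact_match("Москва", "Moscow"))  # a translation, not a transliteration
```

In this sketch the Cyrillic value and its Latin spelling compare as equal, while the English translation of the same name does not, because only characters are mapped and meaning is never interpreted.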

For example, if source data is in a non-Latin script, such as the Chinese characters used to write Mandarin, Reltio transliterates those characters into the Latin character set, which the Match Engine uses. Configure transliteration for an attribute in an entity match group, for example a match group that matches by the transliterated name.
Note: Results differ depending on whether the transliteration is applied before or after a value is split. The TransliterateCleanser applies the transliteration before the match tokens are generated and before the comparator does the comparison. The comparison for non-Latin scripts runs the transliteration only after the words are split into parts. You can also run the Custom/DistinctWords/Comparator/MatchToken to apply a transliteration.
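
One possible reason the order matters can be sketched in Python. The example assumes a toy transliterator that emits one space-separated Latin syllable per Han character; the table, the function names, and the sample value are hypothetical and do not reflect the actual Reltio implementation.

```python
# Toy Han-to-Latin table (hypothetical). Assumes each character becomes
# one Latin syllable and that syllables are separated by spaces.
HAN_TO_LATIN = {"北": "bei", "京": "jing", "大": "da", "学": "xue"}

SAMPLE = "北京 大学"


def to_latin(text: str) -> str:
    """Replace each known character with its syllable, then normalize spaces."""
    pieces = [f" {HAN_TO_LATIN[ch]} " if ch in HAN_TO_LATIN else ch for ch in text]
    return " ".join("".join(pieces).split())


def transliterate_then_split(text: str) -> list[str]:
    """Transliterate the whole value first, then split it into words."""
    return to_latin(text).split()


def split_then_transliterate(text: str) -> list[str]:
    """Split the value into words first, then transliterate each word."""
    return [to_latin(word) for word in text.split()]


if __name__ == "__main__":
    print(transliterate_then_split(SAMPLE))
    print(split_then_transliterate(SAMPLE))
```

Under these assumptions, transliterating first yields four single-syllable parts, while splitting first yields two parts that each contain two syllables, so the values being compared are not the same in the two cases.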

How does matching on multiple languages work?

The following example illustrates how to configure the Match Engine. These settings are configured in your L3 layer. To do this, contact support.

{
    "uri": "configuration/entityTypes/HCP/matchGroups/HCPbyTransLiterator",
    "label": "HCP by transliterated Name",
    "type": "suspect",
    "rule": {
        "and": {
            "exact": [
                "configuration/entityTypes/HCP/attributes/NonLatin_Name"
            ],
            "cleanse": [
                {
                    "cleanseAdapter": "com.reltio.cleanse.impl.TransliterateCleanser",
                    "attributes": [
                        "configuration/entityTypes/HCP/attributes/NonLatin_Name"
                    ]
                }
            ]
        },
        "matchTokenClass": "com.reltio.match.token.ExactMatchToken"
    },
    "matchServiceClass": "com.reltio.businesslogic.match.providers.internal.InternalMatchService"
}

The following example illustrates how you can configure the Match Engine to transliterate the characters inside a match group by setting "transliteratorCommand": "Any-Latin" in the cleanseAdapterParams section:

{
    "uri": "configuration/entityTypes/HCP/matchGroups/HCPbyTransLiterator",
    "label": "HCP by transliterated Name",
    "type": "suspect",
    "rule": {
        "and": {
            "exact": [
                "configuration/entityTypes/HCP/attributes/NonLatin_Name"
            ],
            "cleanse": [
                {
                    "cleanseAdapter": "com.reltio.cleanse.impl.TransliterateCleanser",
                    "cleanseAdapterParams": {
                        "transliteratorCommand": "Any-Latin"
                    },
                    "mappings": [
                        {
                            "attribute": "configuration/entityTypes/HCP/attributes/NonLatin_Name",
                            "cleanseAttribute": "configuration/entityTypes/HCP/attributes/NonLatin_Name"
                        }
                    ]
                }
            ]
        },
        "matchTokenClass": "com.reltio.match.token.ExactMatchToken"
    },
    "matchServiceClass": "com.reltio.businesslogic.match.providers.internal.InternalMatchService"
}