Merging of Attributes Based on Lookup Canonical Form

Merge lookup attributes based on canonical lookup codes.

Overview

Reltio has introduced a feature that merges lookup attributes based on canonical (standardized) lookup codes instead of raw lookup values. To enable this feature, set the resolveLookupCode property to true in the tenant's physical configuration.

Consider the case of a nested attribute that holds a communication type (for example, Cell Phone) and a phone number (for example, 9999). When data comes from different sources, such attributes need to be merged (deduplicated). For example, “[CELL, 9999]” from one source and “[CELL PHONE, 9999]” from another source should be deduplicated because, conceptually, there is only one communication type and phone number.

Current Behavior

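Attributes in the business model configuration can be specified to use Reference Data Management (RDM) lookup tables to map raw values to canonical codes and values. The canonical codes and values are added without any existing data being lost. Because the raw value is retained, two attributes whose raw values differ remain distinct even when their canonical codes and values are the same.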

Original: [raw, phone#]
  • [CELL, 9999]
  • [CELL PHONE, 9999]

After RDM lookup: [raw, canonical code, canonical value, phone#]
  • [CELL, MOBILE, CELL PHONE, 9999]
  • [CELL PHONE, MOBILE, CELL PHONE, 9999]

Final result: [raw, canonical code, canonical value, phone#]
  • [CELL, MOBILE, CELL PHONE, 9999]
  • [CELL PHONE, MOBILE, CELL PHONE, 9999]
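
The current behavior can be illustrated with a minimal Python sketch. It models the RDM lookup table as an in-memory dictionary. The names RDM_LOOKUP, enrich_with_lookup, and merge_identical are hypothetical and are not part of any Reltio API. The sketch only shows why the two attributes stay separate while the raw value is kept.

```python
# Hypothetical stand-in for an RDM lookup table (not a Reltio API):
# raw value -> (canonical code, canonical value)
RDM_LOOKUP = {
    "CELL": ("MOBILE", "CELL PHONE"),
    "CELL PHONE": ("MOBILE", "CELL PHONE"),
}


def enrich_with_lookup(raw, phone):
    """Add the canonical code and value while keeping the raw value."""
    code, value = RDM_LOOKUP[raw]
    return (raw, code, value, phone)


def merge_identical(attributes):
    """Drop exact duplicates while preserving first-seen order."""
    return list(dict.fromkeys(attributes))


incoming = [("CELL", "9999"), ("CELL PHONE", "9999")]
enriched = [enrich_with_lookup(raw, phone) for raw, phone in incoming]
merged = merge_identical(enriched)

# The raw values differ, so both attributes survive the merge.
for attribute in merged:
    print(attribute)
```

Both tuples are printed because they differ in their first (raw) element, even though the canonical code and value are the same.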

New Behavior

Because the canonical codes and values are the same, the two attributes above should be merged. That is, the result should be as follows:

Original: [raw, phone#]
  • [CELL, 9999]
  • [CELL PHONE, 9999]

After RDM lookup: [canonical code, canonical value, phone#]
  • [MOBILE, CELL PHONE, 9999]
  • [MOBILE, CELL PHONE, 9999]

Final result: [canonical code, canonical value, phone#]
  • [MOBILE, CELL PHONE, 9999]

The two attributes after RDM lookup should be deduplicated, as they are identical after mapping to the canonical code and value.

To achieve the desired outcome, mapping of raw values to the canonical codes is performed when the data is loaded. Only the canonical code is stored and used in the merge process. The canonical code is used to look up the canonical value when data is retrieved.
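
The load, merge, and retrieve steps described above can be sketched in Python as follows. This is an illustration under stated assumptions, not Reltio's implementation: RAW_TO_CODE, CODE_TO_VALUE, load_attribute, merge_identical, and retrieve_attribute are hypothetical names, and the lookup tables are plain dictionaries standing in for RDM.

```python
# Hypothetical stand-ins for RDM data (not a Reltio API).
RAW_TO_CODE = {"CELL": "MOBILE", "CELL PHONE": "MOBILE"}
CODE_TO_VALUE = {"MOBILE": "CELL PHONE"}


def load_attribute(raw, phone):
    """At load time, replace the raw value with its canonical code."""
    return (RAW_TO_CODE[raw], phone)


def merge_identical(attributes):
    """Drop exact duplicates while preserving first-seen order."""
    return list(dict.fromkeys(attributes))


def retrieve_attribute(stored):
    """At retrieval time, resolve the canonical value from the stored code."""
    code, phone = stored
    return (code, CODE_TO_VALUE[code], phone)


incoming = [("CELL", "9999"), ("CELL PHONE", "9999")]

# Only the canonical code is stored, so both inputs collapse to one attribute.
stored = merge_identical([load_attribute(raw, phone) for raw, phone in incoming])
retrieved = [retrieve_attribute(attribute) for attribute in stored]

print(stored)     # [('MOBILE', '9999')]
print(retrieved)  # [('MOBILE', 'CELL PHONE', '9999')]
```

Note that the raw values no longer appear anywhere in the stored data, which is why this change is hard to roll back once applied.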

Important Considerations for Using the New Lookup Code Feature

For any new feature that changes existing system behavior, Reltio recommends that you incorporate test cases into your release plan to verify that the feature works in line with your expectations.

Read the following key points before enabling merging of attributes based on canonical lookup codes, because rolling back merged data can be difficult.

  • Merging attributes based on lookup code changes the behavior of earlier releases, where merging was based on lookup raw values. Enabling this feature means canonical lookup codes are preserved instead of raw values. This code is used to look up the canonical value every time data is retrieved.
  • Once the lookup code configuration is enabled (resolveLookupCode=true), it is not possible to revert the merged data to its original state without reloading your profile data.
    Note: You can reset the resolveLookupCode property to false to return to the default behavior. This change applies only to profiles that are loaded after the property change. Existing merged data is not reverted to its original state.
  • Include a run of the Remove Attribute Duplicates task in your testing to apply the new behavior to your existing tenant data. This task removes any duplicate nested attribute values based on the matchFieldURIs defined in your L3 configuration.
    Attention: This task must be run by a tenant administrator.
  • Perform additional test cases to verify the desired outcome (for example, merging and unmerging entities with lookup-based nested attributes, or Cumulative Entity Update of new rows for nested attributes).
  • Your verification should include checking the results for the following components:
    • Merging of simple and nested attributes
    • Generation of surrogate crosswalks
    • Search, scan, and export filtering
    • Metadata security filtering
    • Messaging object filtering
    • Cleansing
    • Matching

Once you have completed all of the above and are satisfied with the outcome, you are ready to promote the feature (set resolveLookupCode to true) on your production tenant.