Datasets for the GBQ data schema

Learn about the differences between the legacy and new Google BigQuery (GBQ) data schemas.

Your tenant comes bundled with one dataset in GBQ. The data schema for this dataset can be one of two types:
  • JSON-based

  • Legacy column-based

For information on selecting a configuration, see the topic Submit a provisioning request to Reltio Support - SUPERSEDED. For details of the two data schema types, see the sections below.

JSON-based GBQ data schema

We recommend the JSON-based schema. It stores the crosswalk uri, crosswalk attributes, and crosswalk updateDate values in a single column. Attributes and analytical attributes are stored the same way.

Legacy column-based GBQ data schema

Optionally, you can use the legacy column-based GBQ data schema. It splits each section of the crosswalk data into its own column, for example: crosswalk.uri, crosswalk.attributes, and crosswalk.updateDate.
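
To make the difference between the two layouts concrete, here is a minimal Python sketch. It is only an illustration: the sample record, its values, and the helper names json_based_row and column_based_row are hypothetical, and the real export may shape the data differently.

```python
import json

# Hypothetical crosswalk record. The keys follow the names used in this topic;
# the values are placeholders made up for illustration.
crosswalk = {
    "uri": "example-crosswalk-uri",
    "attributes": ["example-attribute-uri"],
    "updateDate": "2024-01-01T00:00:00Z",
}


def json_based_row(cw):
    """JSON-style layout: the whole crosswalk is kept in a single column."""
    return {"crosswalk": json.dumps(cw, sort_keys=True)}


def column_based_row(cw):
    """Legacy-style layout: each crosswalk section gets its own column."""
    return {f"crosswalk.{key}": value for key, value in cw.items()}


if __name__ == "__main__":
    print(list(json_based_row(crosswalk)))
    print(list(column_based_row(crosswalk)))
```

In this sketch the JSON-style row has one crosswalk column, while the legacy-style row has one column per crosswalk section.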

The column-based data schema uses automatic transformation rules to convert Reltio attribute names to GBQ field names. This is necessary because GBQ field names can contain only letters, numbers, and underscores (_), and must start with a letter or an underscore. The following transformation rules are used:
  • If the Reltio attribute name satisfies the GBQ schema naming convention, then it remains as-is.

  • If the Reltio attribute name contains characters that are invalid in GBQ, then each invalid character is replaced with an underscore (_).

  • If the Reltio attribute name starts with a number, then an underscore (_) is added to the beginning of the name.

  • If the Reltio attribute name collides with another name after the previous two steps, then a CRC32 hash value is added to the end of the name.

  • If two or more Reltio attributes end up with the same name, then a hash value is added to the end of each name. For example, consider two attribute names: Another-Field and another_field. After the first rules are applied, both map to the same GBQ field name, because GBQ field names are not case-sensitive. After the hash values are added, the GBQ field names become Another_Field_71e1316e and another_field_45b4f955, respectively.
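
The rules above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not Reltio's implementation: the function names sanitize and to_gbq_field_names are hypothetical, names are compared case-insensitively to detect collisions, and the CRC32 value is computed over the UTF-8 bytes of the original attribute name. The actual hash input is not documented here, so the suffixes this sketch produces may differ from those in the example.

```python
import re
import zlib

# Anything other than a letter, a number, or an underscore is invalid.
_INVALID = re.compile(r"[^A-Za-z0-9_]")


def sanitize(name):
    """Apply the character and leading-number rules to one attribute name."""
    cleaned = _INVALID.sub("_", name)
    if cleaned[:1].isdigit():
        cleaned = "_" + cleaned
    return cleaned


def to_gbq_field_names(names):
    """Map attribute names to field names, adding a hash suffix on collision."""
    cleaned = {name: sanitize(name) for name in names}

    # Assumption: collisions are detected case-insensitively.
    counts = {}
    for field in cleaned.values():
        counts[field.lower()] = counts.get(field.lower(), 0) + 1

    result = {}
    for name, field in cleaned.items():
        if counts[field.lower()] > 1:
            # Assumption: CRC32 of the original name, as 8 hex digits.
            suffix = format(zlib.crc32(name.encode("utf-8")), "08x")
            field = f"{field}_{suffix}"
        result[name] = field
    return result


if __name__ == "__main__":
    sample = ["FirstName", "2ndAddress", "Zip Code", "Another-Field", "another_field"]
    for original, field in to_gbq_field_names(sample).items():
        print(original, "->", field)
```

In this sketch, only names that collide receive a suffix. Names that are already valid and unique, such as FirstName, pass through unchanged.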

Note: If there is a change in the data schema, these parameters are not impacted.