Datasets for the GBQ data schema

Learn about the differences between the JSON-based and legacy column-based GBQ data schemas in your tenant.

Your tenant comes bundled with one dataset in GBQ. The dataset uses one of two data schema types:
  • JSON-based
  • Legacy column-based

For information on selecting a configuration, see Configure a new GBQ pipeline in the Console.

For details of these data schema types, see the sections below.

JSON-based GBQ data schema

We recommend the JSON-based schema. It stores the crosswalk URI, crosswalk attributes, and crosswalk updateDate values in a single column, and it handles attributes and analytical attributes in the same way.

Legacy column-based GBQ data schema

Optionally, you can use the legacy column-based GBQ data schema. It explodes each section of the crosswalk data into its own column, for example: crosswalk.uri, crosswalk.attributes, and crosswalk.updateDate.
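
To make the contrast between the two layouts concrete, here is a minimal Python sketch. It is an illustration only, not the export code: the function names, the sample crosswalk values, and the single column name crosswalk for the JSON-based layout are assumptions made for this example.

```python
import json

# Hypothetical crosswalk record; all values are made up for illustration.
sample_crosswalk = {
    "uri": "entities/0001/crosswalks/abc",
    "attributes": ["entities/0001/attributes/FirstName/1"],
    "updateDate": "2024-01-01T00:00:00Z",
}


def json_based_row(crosswalk: dict) -> dict:
    """JSON-based layout: the whole crosswalk is kept in one column."""
    # Assumption: the single column is named "crosswalk".
    return {"crosswalk": json.dumps(crosswalk)}


def column_based_row(crosswalk: dict) -> dict:
    """Legacy layout: each crosswalk section becomes its own column."""
    return {f"crosswalk.{section}": value for section, value in crosswalk.items()}


print(sorted(json_based_row(sample_crosswalk)))
print(sorted(column_based_row(sample_crosswalk)))
```

In the JSON-based sketch the row has a single key, while the legacy sketch produces one key per crosswalk section, so adding a new section changes the set of columns only in the legacy layout.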

The column-based data schema uses automatic transformation rules to convert Reltio attribute names to GBQ field names. This is necessary because GBQ field names can contain only letters, numbers, and underscores (_), and must start with a letter or an underscore. The following transformation rules are used:
  • If the Reltio attribute name satisfies the GBQ schema naming convention, then it remains as-is.
  • If the Reltio attribute name contains characters that are invalid in GBQ, then each invalid character is replaced with an underscore (_).
  • If the Reltio attribute name starts with a number, then an underscore (_) is added at the beginning of the name.
  • If naming collisions remain after the previous two steps, then a CRC32 hash value is added at the end of the Reltio attribute name.
  • If two or more Reltio attributes resolve to the same GBQ field name, then a hash value is added at the end of each name. For example, consider two attribute names: Another-Field and another_field. After the transformation rules are applied, the GBQ field names change to Another_Field_71e1316e and another_field_45b4f955 respectively.
Note: If there is a change in the data schema, these parameters are not impacted.
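
The transformation rules above can be sketched in Python as follows. This is a rough approximation under stated assumptions, not the actual implementation: the function names are hypothetical, collisions are compared case-insensitively (inferred from the Another-Field and another_field example), and the CRC32 value is computed over the UTF-8 bytes of the original attribute name. The input that is really hashed is not documented here, so the suffixes this sketch produces may differ from the ones shown above.

```python
import re
import zlib
from collections import Counter


def sanitize(name: str) -> str:
    """Apply the invalid-character and leading-digit rules to one name."""
    # Replace every character that is not a letter, digit, or underscore.
    cleaned = re.sub(r"[^A-Za-z0-9_]", "_", name)
    # GBQ field names must start with a letter or an underscore.
    if cleaned and cleaned[0].isdigit():
        cleaned = "_" + cleaned
    return cleaned


def to_gbq_field_names(attribute_names):
    """Map each attribute name to a field name, adding a hash on collision."""
    sanitized = {name: sanitize(name) for name in attribute_names}
    # Assumption: two names collide when they match ignoring case.
    counts = Counter(cleaned.lower() for cleaned in sanitized.values())
    result = {}
    for name, cleaned in sanitized.items():
        if counts[cleaned.lower()] > 1:
            # Assumption: hash the original name; format as 8 hex digits.
            suffix = format(zlib.crc32(name.encode("utf-8")), "08x")
            result[name] = f"{cleaned}_{suffix}"
        else:
            result[name] = cleaned
    return result


fields = to_gbq_field_names(
    ["FirstName", "2ndAddress", "Zip Code", "Another-Field", "another_field"]
)
for original, field in fields.items():
    print(f"{original} -> {field}")
```

Names that already satisfy the convention pass through unchanged, and only the names that collide receive a suffix, which keeps the common case readable in queries.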