Datasets for the GBQ data schema
Learn about the differences between the legacy and new GBQ datasets.
The GBQ data schema is available in two types:
JSON-based
Legacy column-based
For information on selecting a configuration, see Configure a new GBQ pipeline in the Console. For details on each data schema type, see the sections below.
JSON-based GBQ data schema
We recommend the JSON-based schema. It stores the crosswalk URI, crosswalk attributes, and crosswalk updateDate values in a single column, and it works the same way for attributes and analytical attributes.
Legacy column-based GBQ data schema
Optionally, you can use the legacy column-based GBQ data schema. It splits each section of the crosswalk data into a separate column, for example: crosswalk.uri, crosswalk.attributes, and crosswalk.updateDate.
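The difference between the two layouts can be illustrated with a minimal sketch. It is not the actual GBQ table definition: the sample crosswalk values, the helper names to_json_row and to_column_row, and the choice of a JSON string for the single column are all assumptions made for illustration. Only the three field names (uri, attributes, updateDate) come from the description above.

```python
import json

# Hypothetical crosswalk record; the values are made up for illustration.
crosswalk = {
    "uri": "entities/0001/crosswalks/0002",
    "attributes": ["entities/0001/attributes/FirstName/0003"],
    "updateDate": "2024-01-01T00:00:00Z",
}


def to_json_row(cw: dict) -> dict:
    """JSON-based layout (assumed shape): one column holds the whole crosswalk."""
    return {"crosswalk": json.dumps(cw, sort_keys=True)}


def to_column_row(cw: dict) -> dict:
    """Legacy column-based layout (assumed shape): one column per crosswalk section."""
    return {f"crosswalk.{key}": value for key, value in cw.items()}


json_row = to_json_row(crosswalk)
column_row = to_column_row(crosswalk)

print(list(json_row))    # a single column name
print(list(column_row))  # one column name per crosswalk section
```

In the first layout, adding a new crosswalk section does not change the set of columns. In the second layout, it would add a column.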
Reltio attribute names are transformed into GBQ field names according to the following rules:
If the Reltio attribute name satisfies the GBQ schema naming convention, then it remains as is.
If the Reltio attribute name contains characters that are invalid in GBQ, then each invalid character is replaced with an underscore (_).
If the Reltio attribute name starts with a number, then it is prefixed with an underscore (_).
If a naming collision occurs after the previous two steps, then a CRC32 hash value is appended to the end of the Reltio attribute name.
If two or more Reltio attributes end up with the same name, then a hash value is appended to the end of each Reltio attribute name. For example, suppose there are two attribute names: Another-Field and another_field. After the transformation rules are applied, the GBQ field names change to Another_Field_71e1316e and another_field_45b4f955 respectively.
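The rules above can be sketched in Python as follows. This is an illustrative approximation, not Reltio's implementation, and it rests on several assumptions: valid GBQ characters are taken to be ASCII letters, digits, and underscores; collisions are detected case-insensitively (which is consistent with the Another-Field and another_field example); and the CRC32 value is computed over the original attribute name encoded as UTF-8. Because the actual hash input is not documented here, the suffixes this sketch produces may differ from the ones shown in the example. The function names sanitize, crc32_suffix, and to_gbq_field_names are hypothetical.

```python
import re
import zlib
from collections import Counter

# Assumption: anything other than ASCII letters, digits, and underscores is invalid.
INVALID_CHARS = re.compile(r"[^A-Za-z0-9_]")


def sanitize(name: str) -> str:
    """Replace invalid characters, then prefix names that start with a digit."""
    cleaned = INVALID_CHARS.sub("_", name)
    if cleaned[:1].isdigit():
        cleaned = "_" + cleaned
    return cleaned


def crc32_suffix(name: str) -> str:
    """Return the CRC32 of the name as 8 lowercase hex digits (assumed hash input)."""
    return format(zlib.crc32(name.encode("utf-8")) & 0xFFFFFFFF, "08x")


def to_gbq_field_names(names: list[str]) -> dict[str, str]:
    """Map each attribute name to a field name, adding a hash suffix on collisions."""
    sanitized = [sanitize(n) for n in names]
    # Assumption: names that differ only by case count as colliding.
    counts = Counter(s.lower() for s in sanitized)
    result = {}
    for original, cleaned in zip(names, sanitized):
        if counts[cleaned.lower()] > 1:
            cleaned = f"{cleaned}_{crc32_suffix(original)}"
        result[original] = cleaned
    return result


mapping = to_gbq_field_names(["Name", "9lives", "Another-Field", "another_field"])
for original, field in mapping.items():
    print(f"{original} -> {field}")
```

Names that already satisfy the convention pass through unchanged, and only the colliding names receive a suffix, which keeps the resulting field names stable for attributes that never collide.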