Overview of Google BigQuery (GBQ) Schema

In GBQ, all datasets are grouped in the customer-facing project. You can use the tenant ID to search for the datasets.

The following two datasets are available in GBQ:
  1. GBQ Tables: This dataset includes tables for each Entity type, Relation type, and Interaction type. Additional tables contain information about matches and merges. The dataset follows the naming convention riq_dw_<environment>_<tenant ID> and stores raw entity, relationship, interaction, match, and merge MDM data.
  2. GBQ Views: This dataset includes views of the current version of an entity, relationship, match, or workflow. The dataset follows the naming convention views_riq_dw_<environment>_<tenant ID> and contains views of the tables dataset.
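
As an illustration, the two naming conventions above can be sketched as a small helper. The function name gbq_dataset_names and the example values used in the usage note are assumptions made for this sketch; only the riq_dw_ and views_riq_dw_ patterns come from the conventions described above.

```python
def gbq_dataset_names(environment: str, tenant_id: str) -> dict:
    """Return the tables and views dataset names for a tenant.

    Hypothetical helper: it only applies the documented patterns
    riq_dw_<environment>_<tenant ID> and views_riq_dw_<environment>_<tenant ID>.
    """
    if not environment or not tenant_id:
        raise ValueError("environment and tenant_id must be non-empty")
    tables = f"riq_dw_{environment}_{tenant_id}"
    # The views dataset uses the same name with a "views_" prefix.
    return {"tables": tables, "views": f"views_{tables}"}
```

For example, with the placeholder values environment "prod" and tenant ID "abc123", the helper returns riq_dw_prod_abc123 for the tables dataset and views_riq_dw_prod_abc123 for the views dataset.
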
There are two ways an MDM tenant and GBQ are synced:
  1. Batch: An on-demand process where the complete data is synced.
  2. Streaming: Changes in the MDM tenant are copied to GBQ in near real time. Streaming is the typical operating mode for Reltio Reporting and Analytics.
Note: GBQ queries must always use GBQ views instead of tables.
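
A minimal sketch of how a query could be pointed at the views dataset rather than the tables dataset is shown here. The function name build_view_query, the project ID, and the view name are all hypothetical placeholders; actual view names depend on the tenant's configuration. The sketch only builds the SQL text, using BigQuery's backtick-quoted project.dataset.table reference form.

```python
def build_view_query(project: str, environment: str, tenant_id: str,
                     view: str, limit: int = 10) -> str:
    """Build a SELECT statement against the views dataset (hypothetical helper)."""
    # Always target the views dataset, never the raw tables dataset.
    dataset = f"views_riq_dw_{environment}_{tenant_id}"
    # int() guards against a non-numeric limit being interpolated into the SQL.
    return f"SELECT * FROM `{project}.{dataset}.{view}` LIMIT {int(limit)}"


# Placeholder values, chosen only for illustration.
query = build_view_query("my-project", "prod", "abc123", "my_entity_view", limit=5)
print(query)
```

The resulting string can be pasted into the GBQ console or passed to any BigQuery client.
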

Limitations

The following limitations still exist:

  • For a pair of entities having more than one potential match, if one of the matches is removed (for example, when a rule is updated or removed), the entity is skipped in GBQ.
  • For a pair of entities having a manual match and a potential match, only the manual match is returned in GBQ. In the UI, the count is not affected because a manual match is displayed as a manual rule within the potential match record.
  • For a pair of entities having more than one potential match, only the latest match is returned in GBQ. This does not affect the count. It can occur, for example, when a rule is updated or duplicated.
  • In Reltio Insights, transitive matches are not supported.