Scenario 3: Set up PrivateLink and configure the Snowflake (Direct Connect) Data Pipeline for Reltio BCE with a Snowflake account having a read-write backup account

Learn how to set up PrivateLink connectivity and configure two Snowflake (Direct Connect) Data Pipelines when your Reltio tenant runs on Reltio Business Critical Edition (BCE) and your Snowflake account has a read-write backup account.

Scenario 3 applies when your Reltio tenant runs on Reltio Business Critical Edition (BCE) and you have a primary Snowflake Business Critical Edition account with a read-write backup Snowflake account.

In Scenario 3, two Snowflake (Direct Connect) Data Pipelines are configured in your Reltio primary environment. One pipeline writes to the primary Snowflake account, and the other writes to the backup Snowflake account. Only one pipeline can be active at a time: the one that targets the Snowflake account currently in the read-write state.

Before you begin, confirm that your environment meets the conditions described in the PrivateLink connectivity requirements.

To set up PrivateLink and configure the Snowflake (Direct Connect) Data Pipeline for Reltio BCE with a Snowflake account having a read-write backup account, complete the following sections:
  1. Establish PrivateLink connectivity
  2. Enable Snowflake account-to-account replication
  3. Configure the primary Snowflake (Direct Connect) Data Pipeline
  4. Create a Snowflake failover group
  5. Configure the secondary Snowflake (Direct Connect) Data Pipeline
  6. Recover to the primary Snowflake account

Prerequisites

Before you begin, confirm that the following requirements are met.

Requirement | Details
Reltio BCE | Reltio Business Critical Edition (BCE) is enabled on your Reltio tenant.
Snowflake account-to-account replication | Account-to-account replication is enabled on both Snowflake accounts. The procedure to enable replication is described in Enable Snowflake account-to-account replication.
Snowflake users | Four Snowflake users are available: two in the primary Snowflake account (one for the Reltio primary environment and one for the Reltio backup environment) and two in the backup Snowflake account (one for each Reltio environment). All users hold the same role within their respective Snowflake account. Alternatively, create all four users in the primary Snowflake account and replicate them to the backup Snowflake account.
Existing pipeline status | No Snowflake (Direct Connect) Data Pipeline is currently active on your Reltio tenant. Deactivate any active pipeline before you begin.
Reltio environment names | The environment names of your Reltio primary environment and your Reltio backup environment are available. Reltio provides these names when your tenant is provisioned.

Enable Snowflake account-to-account replication

Enable account-to-account replication on both Snowflake accounts so that the database created by the primary data pipeline replicates to the backup Snowflake account.

  1. In your primary Snowflake account, list the available replication accounts using the following command:
    SHOW REPLICATION ACCOUNTS;
  2. In your primary Snowflake account, enable account replication.

    Use the following command, replacing <organization_name> and <primary_account_name> with your values:

    SELECT SYSTEM$GLOBAL_ACCOUNT_SET_PARAMETER(
      '<organization_name>.<primary_account_name>',
      'ENABLE_ACCOUNT_DATABASE_REPLICATION',
      'true'
    );
  3. In your primary Snowflake account, confirm that replication is enabled using the following command:
    SHOW REPLICATION ACCOUNTS;
  4. Repeat steps 1 through 3 in your backup Snowflake account. In step 2, where you enable account replication, replace <primary_account_name> with <backup_account_name>.
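
As a minimal sketch, the step 2 statement can be rendered once per account so that the correct account name is substituted each time. The helper name and the sample organization and account names below are hypothetical; only the statement text comes from the procedure above.

```python
def enable_replication_sql(organization_name: str, account_name: str) -> str:
    """Render the step 2 statement for one Snowflake account."""
    return (
        "SELECT SYSTEM$GLOBAL_ACCOUNT_SET_PARAMETER(\n"
        f"  '{organization_name}.{account_name}',\n"
        "  'ENABLE_ACCOUNT_DATABASE_REPLICATION',\n"
        "  'true'\n"
        ");"
    )


# Hypothetical names: replace with your organization and account names.
for account in ("PRIMARY_ACCT", "BACKUP_ACCT"):
    print(enable_replication_sql("MYORG", account))
```

Run each rendered statement in the Snowflake account that it names.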

Configure the primary Snowflake (Direct Connect) Data Pipeline

The primary data pipeline writes to the primary Snowflake account. Configure the primary data pipeline in the Console using the Snowflake (Direct Connect) Data Pipeline setup, with three additional settings for Scenario 3:

  • privateLinkEnabled set to true in the tenant's physical configuration, which enables routing the data pipeline over the PrivateLink network path.
  • Two Snowflake users in the primary Snowflake account to authenticate the Reltio primary environment and the Reltio backup environment.
  • The secrets API is called twice, once for each Reltio environment.

Complete the following steps to apply these settings.

  1. In the Console, follow steps 1 through 8 of the Snowflake (Direct Connect) Data Pipeline setup, using the connection details of the primary Snowflake account. These steps cover the connection details, Snowflake authentication, and adapter selection.
  2. In the tenant's physical configuration, set privateLinkEnabled to true.
  3. In the primary Snowflake account, set up the two Snowflake users required for the Reltio primary environment and the Reltio backup environment:
    1. Create two Snowflake users, one for the Reltio primary environment and one for the Reltio backup environment.
    2. Assign the same role to both users, and use that role for the data pipeline configuration.
  4. Generate a secret for each Reltio environment by calling the secrets API.

    Use the following endpoint to call the API:

    POST https://{env}-data-pipeline-hub.reltio.com/api/tenants/{tenantId}/adapters/{adapterName}/secrets

    The request body identifies the Snowflake user for which the secret is generated:

    {
      "SNOWFLAKE": {
        "username": "<snowflake_username>"
      }
    }

    Call the API twice:

    • First call: Run against the Reltio primary environment, using the Snowflake user created in the primary Snowflake account for the Reltio primary environment.
    • Second call: Run against the Reltio backup environment, using the Snowflake user created in the primary Snowflake account for the Reltio backup environment.
  5. In the primary Snowflake account, assign the public key returned by each secrets API call to the corresponding Snowflake user.
  6. Create the Snowflake resources required by the primary data pipeline as described in step 9.2 of Snowflake (Direct Connect) Data Pipeline setup, which covers creating the internal stage, tables, tasks, and other Snowflake objects in the primary Snowflake account.
  7. Wait a few minutes for the Snowflake resources to provision.
  8. Validate the primary data pipeline for both Reltio environments by calling the validate API. Run the call once for the Reltio primary environment and once for the Reltio backup environment. Both calls return 200 OK when the configuration is valid.

    Use the following endpoint to call the API:

    POST https://{env}-data-pipeline-hub.reltio.com/api/tenants/{tenantId}/adapters/{adapterName}/validate
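
The secrets and validate calls in steps 4 and 8 can be sketched as follows. This is an illustration under stated assumptions, not a definitive client: the bearer-token header, the empty body on the validate call, and all sample values (environment names, tenant ID, adapter name, user names) are assumptions, and the helper names are hypothetical. Only the URL pattern and the secrets request body come from the steps above.

```python
import json
import urllib.request

# URL pattern taken from the endpoints above; {action} is "secrets" or "validate".
HUB_URL = (
    "https://{env}-data-pipeline-hub.reltio.com"
    "/api/tenants/{tenant_id}/adapters/{adapter_name}/{action}"
)


def build_request(env, tenant_id, adapter_name, action, access_token, body=None):
    """Build (but do not send) a POST request to the data pipeline hub."""
    url = HUB_URL.format(
        env=env, tenant_id=tenant_id, adapter_name=adapter_name, action=action
    )
    # Assumption: the API accepts an OAuth bearer token in the Authorization header.
    headers = {"Authorization": f"Bearer {access_token}"}
    data = None
    if body is not None:
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(url, data=data, headers=headers, method="POST")


def secrets_request(env, tenant_id, adapter_name, access_token, snowflake_username):
    """Step 4: request a secret for one Snowflake user."""
    body = {"SNOWFLAKE": {"username": snowflake_username}}
    return build_request(env, tenant_id, adapter_name, "secrets", access_token, body)


def validate_request(env, tenant_id, adapter_name, access_token):
    """Step 8: validate the configuration for one environment (no body assumed)."""
    return build_request(env, tenant_id, adapter_name, "validate", access_token)


# Hypothetical values: one entry per Reltio environment and its Snowflake user.
environments = [
    {"env": "primary-env", "user": "RELTIO_PRIMARY_USER"},
    {"env": "backup-env", "user": "RELTIO_BACKUP_USER"},
]

for item in environments:
    req = secrets_request(item["env"], "myTenant", "myAdapter", "<token>", item["user"])
    print(req.get_method(), req.full_url)
    # To send the request: urllib.request.urlopen(req)
```

Building the requests separately from sending them makes it easy to check that each environment is paired with the correct Snowflake user before any call is made.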

Create a Snowflake failover group

The Snowflake failover group defines which data objects in your primary Snowflake account are replicated to the backup Snowflake account, and on what schedule. Create the failover group only after the primary data pipeline is configured, because the failover group references the database that the primary data pipeline creates. If you reached this section first, complete Configure the primary Snowflake (Direct Connect) Data Pipeline and then return here.

  1. In your primary Snowflake account, create the failover group:
    USE ROLE ACCOUNTADMIN;
    CREATE FAILOVER GROUP <failover_group_name>
      OBJECT_TYPES = DATABASES
      ALLOWED_DATABASES = <primary_database_name>
      ALLOWED_ACCOUNTS = <organization_name>.<backup_account_name>
      REPLICATION_SCHEDULE = '10 MINUTE';

    Replace the placeholders with your values:

    • <failover_group_name>: A name for the failover group.
    • <primary_database_name>: The database created by the primary data pipeline in the primary Snowflake account.
    • <organization_name> and <backup_account_name>: The Snowflake organization and the backup Snowflake account.
    • REPLICATION_SCHEDULE: The replication frequency. Set this value to match your recovery point objective.
  2. In your backup Snowflake account, create the failover group as a replica, using the same <failover_group_name> you used in step 1, where you created the failover group in the primary Snowflake account:
    USE ROLE ACCOUNTADMIN;
    CREATE FAILOVER GROUP <failover_group_name>
      AS REPLICA OF <organization_name>.<primary_account_name>.<failover_group_name>;
  3. Wait for the database and the other objects in the failover group to replicate to the backup Snowflake account before you proceed.

Configure the secondary Snowflake (Direct Connect) Data Pipeline

The secondary data pipeline writes to the backup Snowflake account. Configure the secondary data pipeline in the same Reltio primary tenant after the failover group is replicated to the backup Snowflake account.

Before you configure the secondary data pipeline, fail over the Snowflake primary account to the backup Snowflake account so that the backup account becomes the read-write primary. The secondary data pipeline writes to the backup account during this configuration.

  1. In the backup Snowflake account, fail over the failover group:
    ALTER FAILOVER GROUP <failover_group_name> PRIMARY;

    After the failover, the backup Snowflake account is the read-write primary, and the primary Snowflake account is read-only.

  2. In the Reltio Console, deactivate the primary data pipeline.
  3. Configure the secondary data pipeline for the backup Snowflake account in the Reltio Console:
    1. In the Console, follow steps 1 through 8 of the Snowflake (Direct Connect) Data Pipeline setup, using the connection details of the backup Snowflake account.
    2. In the adapter physical configuration, set privateLinkEnabled to true.
  4. In the backup Snowflake account, set up the two Snowflake users required for the Reltio primary environment and the Reltio backup environment:
    1. Create two Snowflake users, one for the Reltio primary environment and one for the Reltio backup environment. Alternatively, create these users in the primary Snowflake account and replicate them to the backup Snowflake account, provided they are not the same users used by the primary data pipeline.
    2. Assign the same role to both users.
  5. Generate a secret for each Reltio environment by calling the secrets API, using the same endpoint and body shown in step 4 of Configure the primary Snowflake (Direct Connect) Data Pipeline.

    Call the API twice, once for the Reltio primary environment and once for the Reltio backup environment, using the corresponding Snowflake user from the backup Snowflake account.

  6. In the backup Snowflake account, assign the public key returned by each secrets API call to the corresponding Snowflake user.
    If you replicate these users from the primary Snowflake account, assign the public keys in the primary account so that replication does not overwrite them in the backup Snowflake account.
    Note: Calling the recreate_resources API is not required for the secondary data pipeline because the database and other resources are already replicated to the backup Snowflake account by the Snowflake failover group.
  7. Validate the secondary data pipeline for both Reltio environments by calling the validate API. Run the call once for the Reltio primary environment and once for the Reltio backup environment. Both calls return 200 OK when the configuration is valid.
  8. In the backup Snowflake account, open the database and schema used by the secondary data pipeline, and check the status of the tasks in the schema.
    1. If any tasks are in the Suspended state, restart them by calling the recreate_resources API. Run the call against the Reltio primary tenant with the following body:
      {
        "resources": ["tasks"],
        "force": true
      }
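
The check in step 8 can be sketched as a small decision helper. It assumes you have already read each task's state from the backup Snowflake account, for example from the state column of SHOW TASKS output. The function name and the sample task names are hypothetical, and the recreate_resources endpoint is not shown here because this section does not give it.

```python
import json


def recreate_tasks_body(task_states):
    """Return the recreate_resources body if any task is suspended, else None.

    task_states maps a task name to its state, for example "started" or "suspended".
    """
    suspended = [
        name for name, state in task_states.items() if state.lower() == "suspended"
    ]
    if not suspended:
        return None
    # Body taken from step 8 above.
    return {"resources": ["tasks"], "force": True}


# Hypothetical task names and states.
states = {"ENTITIES_TASK": "started", "RELATIONS_TASK": "suspended"}
body = recreate_tasks_body(states)
if body is not None:
    print(json.dumps(body, indent=2))
```

When no task is suspended the helper returns None, so no recreate_resources call is needed.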

Recover to the primary Snowflake account

After the secondary data pipeline is validated, fail back from the backup Snowflake account to the primary Snowflake account so that the primary Snowflake account is the active read-write account during normal operation.

  1. In the primary Snowflake account, promote the failover group back to primary:
    ALTER FAILOVER GROUP <failover_group_name> PRIMARY;

    After the recovery, the primary Snowflake account is the read-write primary again.

  2. In the Console, deactivate the secondary data pipeline and reactivate the primary data pipeline.
  3. Synchronize data that did not reach the primary Snowflake account during the failover-and-recovery cycle. Run syncToDataPipeline with the updatedSince parameter set to the epoch time of midnight on the day of the failover. For more information, see the syncToDataPipeline API.
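
The updatedSince value in step 3 can be computed as shown in this minimal sketch. It assumes midnight is taken in UTC and returns the value in both seconds and milliseconds, because this section does not state which unit the parameter expects; confirm the unit and time zone in the syncToDataPipeline API reference. The function names and the sample date are hypothetical.

```python
from datetime import date, datetime, timezone


def midnight_epoch_seconds(failover_day: date) -> int:
    """Epoch seconds for 00:00:00 UTC on the given day (UTC is an assumption)."""
    midnight = datetime(
        failover_day.year, failover_day.month, failover_day.day, tzinfo=timezone.utc
    )
    return int(midnight.timestamp())


def midnight_epoch_millis(failover_day: date) -> int:
    """The same instant expressed in milliseconds."""
    return midnight_epoch_seconds(failover_day) * 1000


# Hypothetical failover date.
day = date(2024, 1, 15)
print(midnight_epoch_seconds(day))  # 1705276800
print(midnight_epoch_millis(day))   # 1705276800000
```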

Result

After you complete the configuration, both Snowflake (Direct Connect) Data Pipelines are configured for PrivateLink connectivity in your Reltio primary environment. Confirm a successful setup against the following outcomes:

  • The primary data pipeline is configured to export data over the PrivateLink network path to the primary Snowflake account.
  • The secondary data pipeline is configured to export data over the PrivateLink network path to the backup Snowflake account.
  • Snowflake account-to-account replication and the Snowflake failover group keep the two Snowflake accounts in sync.
  • The primary data pipeline is active, and the secondary data pipeline is deactivated and ready for failover.

Validate the setup

After the configuration is complete, run a short validation pass to confirm that both data pipelines work as expected.

  1. Trigger an initial data sync from your Reltio primary tenant by running syncToDataPipeline. Trigger the sync from the Console or by calling the syncToDataPipeline API directly.
  2. In your primary Snowflake account, confirm that records appear in the entity, relationship, interaction, match, merge, activity, and workflow target tables created by the primary data pipeline.
  3. Confirm that account-to-account replication completes within the configured replication schedule.