Migrate Snowflake pipeline from legacy to internal stage architecture

Learn how to transition your existing Snowflake pipeline from external stage and Snowpipe ingestion to the internal stage-based architecture.

You must have:
  • An existing Reltio pipeline set up using Snowpipe and external cloud storage
  • A Snowflake account with administrative access
  • Reltio Data Pipeline for Snowflake add-on enabled
  1. Create a new Snowflake user and role for the internal stage pipeline.
    1. Create the Snowflake user or choose an existing one.
      You do not need to share a password. Authentication will use a public-private RSA key pair.
    2. Create a new role for the internal stage pipeline.
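      As a sketch, the user and role creation might look like the following; the placeholder names mirror those used in later steps, and the defaults shown are assumptions to adapt to your own conventions:

      ```sql
      -- Sketch only: adjust names and defaults to your conventions.
      CREATE ROLE IF NOT EXISTS <role>;

      -- No password is shared; the RSA public key is assigned in a later step.
      CREATE USER IF NOT EXISTS <snowflake_username>
        DEFAULT_ROLE = <role>;
      ```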
  2. Create Snowflake internal objects used by the new pipeline.
    1. Create an internal stage.
      
      CREATE OR REPLACE STAGE <new_internal_stage>
      DIRECTORY = (ENABLE = TRUE);
                  
    2. Create a file format.
      
      CREATE OR REPLACE FILE FORMAT zippedJson
      TYPE = 'JSON'
      COMPRESSION = GZIP
      STRIP_OUTER_ARRAY = TRUE;
                  
    3. Create a stream on the internal stage.
      
      CREATE OR REPLACE STREAM <new_stream> ON STAGE <new_internal_stage>;
                  
    4. Create a task to copy from the internal stage to the landing table.
      
      CREATE OR REPLACE TASK <schema>.NEW_LOAD_NEW_FILE_DATA
      WAREHOUSE = <warehouse>
      SCHEDULE = '1 MINUTE'
      WHEN SYSTEM$STREAM_HAS_DATA('<new_stream>')
      AS
      COPY INTO <schema>."<landing_table>"
      FROM (
        SELECT
          $1:uri::STRING,
          $1:version::STRING,
          $1:timestamp::STRING,
          $1:deleted::BOOLEAN,
          $1:linked::BOOLEAN,
          $1:objectType::STRING,
          $1:type::STRING,
          $1,
          CURRENT_TIMESTAMP()
        FROM @<new_internal_stage>
      )
      FILE_FORMAT = (FORMAT_NAME = 'zippedJson')
      PURGE = TRUE;
                  
    5. Resume the task.
      
      ALTER TASK <schema>.NEW_LOAD_NEW_FILE_DATA RESUME;
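
      To confirm the task is executing on schedule, you can optionally query its recent run history (the task name below matches the one created above):

      ```sql
      SELECT NAME, STATE, SCHEDULED_TIME, ERROR_MESSAGE
      FROM TABLE(INFORMATION_SCHEMA.TASK_HISTORY(TASK_NAME => 'NEW_LOAD_NEW_FILE_DATA'))
      ORDER BY SCHEDULED_TIME DESC
      LIMIT 10;
      ```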
      
  3. Grant the required permissions to the new role.
    
    GRANT USAGE ON WAREHOUSE <warehouse> TO ROLE <role>;
    GRANT USAGE ON DATABASE <database> TO ROLE <role>;
    GRANT USAGE ON SCHEMA <database>.<schema> TO ROLE <role>;
    GRANT READ, WRITE ON ALL STAGES IN SCHEMA <database>.<schema> TO ROLE <role>;
    GRANT EXECUTE TASK ON ACCOUNT TO ROLE <role>;
    GRANT SELECT ON FUTURE TABLES IN SCHEMA <database>.<schema> TO ROLE <role>;
    GRANT INSERT ON FUTURE TABLES IN SCHEMA <database>.<schema> TO ROLE <role>;
    GRANT ROLE <role> TO USER <username>;
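
    As an optional sanity check, confirm the grants are in place before proceeding:

    ```sql
    SHOW GRANTS TO ROLE <role>;
    SHOW GRANTS TO USER <username>;
    ```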
            
  4. Create the pipeline with a Support request.
    Open a Support case with these details so our team can instantiate the pipeline:
    • Account
    • Warehouse
    • Internal Stage name
    • Role name
    • Pipeline name - A name that identifies the pipeline instance and is used in API calls to update and manage the pipeline (adapterName). The name must be alphanumeric, 3 to 20 characters long, and cannot contain spaces.
    • Data delivery options:
      • Attribute format - choose one format for how attribute data is exported into Snowflake:
        • STANDARD: Stores the full attribute structure, including metadata like id, isOv, pin, ignore, and uri.
        • FLATTEN: Stores only attribute values. Example: {"FirstName": ["Jon", "Doe"]}
        • FLATTEN_SINGLE_VALUE: Stores only the first (single) attribute value. Example: {"FirstName": "Jon"}
      • Data filtering – enable data-level filters for the adapter.
      • Transmit OV values only – include only operational values in the data export.
      • Serialize initial sources in Crosswalks – preserve initial source information in the exported crosswalk data.
    • We'll take these values from the current pipeline:
      • Organization
      • Database
      • Schema
    When you receive confirmation that we've created the pipeline instance and confirmed the name, proceed to the next step.
  5. Call the Secrets API to register your Snowflake username with Reltio.
    
    POST <hub-url>/api/tenants/<tenantId>/adapters/<adapterName>/secrets
    
    {
      "SNOWFLAKE": {
        "username": "<snowflake_username>"
      }
    }
            

    To construct the hub-url, use the format {reltio-environment}-data-pipeline-hub.reltio.com. For example, if your Reltio environment is test, your hub URL will be test-data-pipeline-hub.reltio.com.

    Use the publicKey returned in the response in the next step.
  6. Assign the public key to the Snowflake user.
    
    ALTER USER <snowflake_username>
    SET RSA_PUBLIC_KEY = '<Returned_Public_Key>';
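
    To verify the key was applied, you can describe the user and check the RSA_PUBLIC_KEY_FP property:

    ```sql
    DESC USER <snowflake_username>;
    -- Check that RSA_PUBLIC_KEY_FP is populated with a SHA256 fingerprint.
    ```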
            
  7. Validate the new adapter configuration.
    
    POST <hub-url>/api/tenants/<tenantId>/adapters/<adapterName>/validate
            

    Construct the hub-url as described in step 5.

    Successful validation returns 200 OK.
  8. Disable the legacy pipeline in the Console.
    Important: You MUST disable the legacy pipeline to prevent conflicts and duplicate data delivery between the two pipelines.
    In Console > Data Pipelines, slide the Status switch to the left to disable the data pipeline. For more details, see Manage existing data pipelines.
  9. Optionally, remove the legacy components and pipeline from Snowflake and Reltio.
    1. Drop the Snowpipe.
      DROP PIPE <old_pipe_name>;
    2. Drop the external stage.
      DROP STAGE <external_stage_name>;
    3. Drop the notification integration.
      DROP NOTIFICATION INTEGRATION <integration_name>;
    4. Delete the legacy pipeline instance.
      In the Console > Data Pipelines, select Delete for your legacy pipeline. For more details, see Recreate tables/views, re-sync data, or delete a pipeline.
      Alternatively, call this API operation with an empty request body:
      POST <hub-url>/api/tenants/<tenantId>/adapters/<adapterName>/delete_pipeline

      Construct the hub-url as described in step 5.

  10. Trigger a full sync to push historical data to the new pipeline.
    
    POST <reltio-url>/reltio/api/<tenantId>/syncToDataPipeline
            
    To construct the reltio-url, use the format {reltio-environment}.reltio.com. For example, if your Reltio environment is test, your reltio-url will be test.reltio.com.
    This syncs all entities, relations, interactions, potential matches, and merges to the new pipeline adapter.
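    Once the sync is triggered, you can optionally confirm that files are landing through the new internal stage by checking the copy history of the landing table (placeholder names as in the earlier steps):

    ```sql
    SELECT FILE_NAME, LAST_LOAD_TIME, ROW_COUNT, STATUS
    FROM TABLE(INFORMATION_SCHEMA.COPY_HISTORY(
      TABLE_NAME => '<landing_table>',
      START_TIME => DATEADD('hour', -1, CURRENT_TIMESTAMP())));
    ```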
After migration, your Reltio events stream into Snowflake using internal stage ingestion, and the legacy Snowpipe-based configuration can be fully decommissioned.