Migrate Snowflake pipeline from legacy to internal stage architecture
Learn how to transition your existing Snowflake pipeline from external stage and Snowpipe ingestion to the internal stage–based architecture.
You must have:
- An existing Reltio pipeline set up using Snowpipe and external cloud storage
- A Snowflake account with administrative access
- Reltio Data Pipeline for Snowflake add-on enabled
- Create a new Snowflake user and role for the internal stage pipeline.
- Create the Snowflake user or choose an existing one. You do not need to share a password; authentication uses a public-private RSA key pair.
- Create a new role for the internal stage pipeline.
- Create Snowflake internal objects used by the new pipeline.
- Create an internal stage.
CREATE OR REPLACE STAGE <new_internal_stage> DIRECTORY = (ENABLE = TRUE);
- Create a file format.
CREATE OR REPLACE FILE FORMAT zippedJson TYPE = 'JSON' COMPRESSION = GZIP STRIP_OUTER_ARRAY = TRUE;
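As a quick illustration of what this file format expects - a gzip-compressed JSON file whose outer array STRIP_OUTER_ARRAY unwraps into one row per element - here is a hedged Python sketch; the sample records are hypothetical, not the exact Reltio payload schema:

```python
import gzip
import json

# Hypothetical records as they might arrive in one staged file.
records = [
    {"uri": "entities/abc123", "type": "ENTITY_CHANGED", "deleted": False},
    {"uri": "entities/def456", "type": "ENTITY_CHANGED", "deleted": True},
]

# The file on the stage: a gzip-compressed JSON array (COMPRESSION = GZIP).
payload = gzip.compress(json.dumps(records).encode("utf-8"))

# What STRIP_OUTER_ARRAY = TRUE does conceptually: the outer array is
# removed, so each element becomes one row for COPY INTO.
rows = json.loads(gzip.decompress(payload))
for row in rows:
    print(row["uri"], row["deleted"])
```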
- Create a stream on the internal stage.
CREATE OR REPLACE STREAM <new_stream> ON STAGE <new_internal_stage>;
- Create a task to copy from the internal stage to the landing table.
CREATE OR REPLACE TASK <schema>.NEW_LOAD_NEW_FILE_DATA
  WAREHOUSE = <warehouse>
  SCHEDULE = '1 MINUTE'
  WHEN SYSTEM$STREAM_HAS_DATA('<new_stream>')
AS
COPY INTO <schema>."<landing_table>"
FROM (
  SELECT
    $1:uri::STRING,
    $1:version::STRING,
    $1:timestamp::STRING,
    $1:deleted::BOOLEAN,
    $1:linked::BOOLEAN,
    $1:objectType::STRING,
    $1:type::STRING,
    $1,
    CURRENT_TIMESTAMP()
  FROM @<new_internal_stage>
)
FILE_FORMAT = (FORMAT_NAME = 'zippedJson')
PURGE = TRUE;
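The SELECT in this task projects individual fields out of each staged JSON row ($1). A hedged Python sketch of that projection, using a hypothetical change event (the field values shown are illustrative):

```python
import json

# Hypothetical change event: one element of the staged JSON array ($1).
event = json.loads("""
{
  "uri": "entities/abc123",
  "version": "1",
  "timestamp": "1714000000000",
  "deleted": false,
  "linked": true,
  "objectType": "configuration/entityTypes/Individual",
  "type": "ENTITY_CHANGED"
}
""")

# Mirror of the task's projection: $1:uri::STRING, $1:version::STRING, ...,
# the full payload ($1), and a load timestamp (CURRENT_TIMESTAMP()).
row = (
    event["uri"],
    event["version"],
    event["timestamp"],
    event["deleted"],
    event["linked"],
    event["objectType"],
    event["type"],
    event,  # full JSON payload, as $1
)
print(row[0], row[6])
```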
- Resume the task.
ALTER TASK <schema>.NEW_LOAD_NEW_FILE_DATA RESUME;
- Grant the required permissions to the new role.
GRANT USAGE ON WAREHOUSE <warehouse> TO ROLE <role>;
GRANT USAGE ON DATABASE <database> TO ROLE <role>;
GRANT USAGE ON SCHEMA <database>.<schema> TO ROLE <role>;
GRANT READ, WRITE ON ALL STAGES IN SCHEMA <database>.<schema> TO ROLE <role>;
GRANT EXECUTE TASK ON ACCOUNT TO ROLE <role>;
GRANT SELECT ON FUTURE TABLES IN SCHEMA <database>.<schema> TO ROLE <role>;
GRANT INSERT ON FUTURE TABLES IN SCHEMA <database>.<schema> TO ROLE <role>;
GRANT ROLE <role> TO USER <username>;
- Create the pipeline with a Support request. Open a Support case with these details so our team can instantiate the pipeline:
- Account
- Warehouse
- Internal Stage name
- Role name
- Pipeline name - A name to identify the pipeline instance and use in API calls to update and manage the pipeline (adapterName). The name must be alphanumeric, 3 to 20 characters long, and cannot contain spaces.
- Data delivery options:
- Attribute format - choose one format for how attribute data is exported into Snowflake:
- STANDARD: Stores the full attribute structure, including metadata like id, isOv, pin, ignore, and uri.
- FLATTEN: Stores only attribute values. Example: {"FirstName": ["Jon", "Doe"]}
- FLATTEN_SINGLE_VALUE: Stores only the first (single) attribute value. Example: {"FirstName": "Jon"}
- Data filtering – enable data-level filters for the adapter.
- Transmit OV values only – include only operational values in the data export.
- Serialize initial sources in Crosswalks – preserve initial source information in the exported crosswalk data.
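To make the three attribute formats concrete, here is a hedged Python sketch; the STANDARD structure shown is simplified and illustrative, not the exact Reltio schema:

```python
# Simplified, illustrative STANDARD representation: each attribute value
# carries metadata such as id, isOv, pin, ignore, and uri.
standard = {
    "FirstName": [
        {"value": "Jon", "id": "a1", "isOv": True, "pin": False,
         "ignore": False, "uri": "entities/abc/attributes/FirstName/a1"},
        {"value": "Doe", "id": "a2", "isOv": False, "pin": False,
         "ignore": False, "uri": "entities/abc/attributes/FirstName/a2"},
    ]
}

# FLATTEN keeps only the values: {"FirstName": ["Jon", "Doe"]}
flatten = {name: [v["value"] for v in values]
           for name, values in standard.items()}

# FLATTEN_SINGLE_VALUE keeps only the first value: {"FirstName": "Jon"}
flatten_single = {name: values[0]["value"]
                  for name, values in standard.items()}

print(flatten)
print(flatten_single)
```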
- We'll take these values from your current pipeline:
- Organization
- Database
- Schema
When you receive confirmation that we've created the pipeline instance and confirmed the name, proceed to the next step.
- Call the Secrets API to register your Snowflake username with Reltio.
POST <hub-url>/api/tenants/<tenantId>/adapters/<adapterName>/secrets
{
  "SNOWFLAKE": {
    "username": "<snowflake_username>"
  }
}
To construct the hub-url, use the format {reltio-environment}-data-pipeline-hub.reltio.com. For example, if your Reltio environment is test, your hub URL will be test-data-pipeline-hub.reltio.com.
Use the publicKey returned in the response in the next step.
- Assign the public key to the Snowflake user.
ALTER USER <snowflake_username> SET RSA_PUBLIC_KEY = '<Returned_Public_Key>';
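The hub-url construction used by the adapter API calls can be captured in a small helper. This is a sketch of the URL pattern the text describes, not an official client; the HTTPS scheme, function names, and sample tenant/adapter names are assumptions:

```python
def hub_url(environment: str) -> str:
    """Hub host for a Reltio environment, e.g. test -> test-data-pipeline-hub.reltio.com."""
    return f"{environment}-data-pipeline-hub.reltio.com"


def adapter_endpoint(environment: str, tenant_id: str,
                     adapter_name: str, action: str) -> str:
    """Full URL for an adapter operation such as secrets, validate, or delete_pipeline."""
    return (f"https://{hub_url(environment)}"
            f"/api/tenants/{tenant_id}/adapters/{adapter_name}/{action}")


# Hypothetical tenant and adapter names, for illustration only.
print(adapter_endpoint("test", "myTenant", "myAdapter", "secrets"))
```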
- Validate the new adapter configuration.
POST <hub-url>/api/tenants/<tenantId>/adapters/<adapterName>/validate
To construct the hub-url, use the format {reltio-environment}-data-pipeline-hub.reltio.com. For example, if your Reltio environment is test, your hub URL will be test-data-pipeline-hub.reltio.com.
Successful validation returns 200 OK.
- Disable the legacy pipeline in the Console. Important: You MUST disable the legacy pipeline to prevent any conflicts or redundancy. In Console > Data Pipelines, slide the Status switch left to disable the data pipeline. For more details, see Manage existing data pipelines.
- Optionally, remove the legacy components and pipeline from Snowflake and Reltio.
- Drop the Snowpipe.
DROP PIPE <old_pipe_name>;
- Drop the external stage.
DROP STAGE <external_stage_name>;
- Drop the notification integration.
DROP NOTIFICATION INTEGRATION <integration_name>;
- Delete the legacy pipeline instance. In Console > Data Pipelines, select Delete for your legacy pipeline. For more details, see Recreate tables/views, re-sync data, or delete a pipeline. Alternatively, call this API operation with an empty request body:
POST <hub-url>/api/tenants/<tenantId>/adapters/<adapterName>/delete_pipeline
To construct the hub-url, use the format {reltio-environment}-data-pipeline-hub.reltio.com. For example, if your Reltio environment is test, your hub URL will be test-data-pipeline-hub.reltio.com.
- Trigger a full sync to push historical data to the new pipeline.
POST <reltio-url>/reltio/api/<tenantId>/syncToDataPipeline
To construct the reltio-url, use the format {reltio-environment}.reltio.com. For example, if your Reltio environment is test, your reltio-url will be test.reltio.com.
This syncs all entities, relations, interactions, potential matches, and merges to the new pipeline adapter.
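The sync call's URL follows the same pattern as the hub endpoints; a minimal sketch, assuming an HTTPS scheme and a hypothetical tenant name:

```python
def sync_endpoint(environment: str, tenant_id: str) -> str:
    """Build the full-sync URL (POST) for a Reltio environment and tenant."""
    return (f"https://{environment}.reltio.com"
            f"/reltio/api/{tenant_id}/syncToDataPipeline")


print(sync_endpoint("test", "myTenant"))
```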