Configure the Reltio Data Pipeline for Databricks for AWS
Learn how to configure Databricks to receive data from your Reltio tenant in the AWS cloud.
Ready to move your data with your Reltio Data Pipeline for Databricks? Configure the pipeline to keep your Delta Lake tables and views in sync with your Reltio data model.
- Configure Databricks pipeline for AWS using Console UI - simpler UI-based configuration with automated steps.
- Configure Databricks pipeline for AWS using APIs - API-based configuration with many manual steps.
Before you start
Before you start configuring the Reltio Data Pipeline for Databricks, ensure you have the necessary permissions and information at hand. You may find it helpful to keep this page open for easy reference.
Prerequisite | Required information | Your details
---|---|---|
**Configure AWS cloud storage for Databricks** | |
Note: The service requires the object storage to be publicly accessible over the internet. | |
Storage account management permissions | You are an AWS administrator, OR ask your AWS administrator to perform these tasks | |
**Integrate AWS cloud storage with Databricks** | |
Storage account management permissions | You are an AWS administrator, OR ask your AWS administrator to perform these tasks | |
Databricks account administrator permissions | You've been assigned these roles, OR you've been assigned a role that contains these roles, OR ask your Databricks administrator to perform these tasks | |
Databricks Unity Catalog (when used) | |
**Configure the Reltio Data Pipeline for Databricks** | |
Reltio tenant | Tenant Environment Name | |
Reltio tenant | Tenant ID | |
Support request | Reltio Data Pipeline configuration request for Databricks | |
**Validate and sync with the Reltio Data Pipeline for Databricks for AWS** | |
Reltio administrator permissions | You have one of these roles, OR ask your Reltio administrator to perform these tasks | |
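The "Configure AWS cloud storage for Databricks" stage includes creating a Staging bucket with a lifecycle rule. As a minimal sketch only, a lifecycle configuration that expires staged files after a retention window might look like the following JSON (the rule ID and the seven-day retention period are illustrative assumptions, not Reltio requirements; check the configuration steps for the values your tenant needs):

```json
{
  "Rules": [
    {
      "ID": "expire-staging-files",
      "Status": "Enabled",
      "Filter": { "Prefix": "" },
      "Expiration": { "Days": 7 }
    }
  ]
}
```

A document in this shape can be applied to the Staging bucket with the AWS CLI's `aws s3api put-bucket-lifecycle-configuration` command or in the S3 console.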
Take note
As you work through the configuration, make a note of the values you'll need in later steps and stages. You may find it helpful to download a copy of this page and record your information as you go along.
Stage/section | Entry field | Your details
---|---|---|
Determine mode for running pipeline | Mode of Delta Live Tables pipeline | |
**Configure AWS cloud storage for Databricks** | |
Create an AWS S3 storage bucket for Staging with a lifecycle rule | Staging S3 ARN | |
Create a DPH service user IAM role with an external ID in AWS so the DPH service role can access the Staging bucket | DPH service user Role ARN for Staging bucket | |
Create an AWS S3 storage bucket for Target | Target/Table S3 ARN | |
**Configure Event Notification for the Staging bucket (required only if using File Notification mode)** | |
Create a queue for event notifications; see Set up File Notification mode in AWS | Queue ARN | |
**Permission setup for Databricks** | |
Databricks host URL | Databricks URL | |
Generate an access token; see Manage service principals | Service Principal Token | |
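The DPH service user IAM role listed above is created with an external ID, which lets the DPH service assume the role securely. As a hedged sketch, the role's trust policy might look like the following (the account ID and external ID placeholders are illustrative; the actual values come from Reltio during configuration):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam::<DPH-SERVICE-ACCOUNT-ID>:root" },
      "Action": "sts:AssumeRole",
      "Condition": {
        "StringEquals": { "sts:ExternalId": "<EXTERNAL-ID>" }
      }
    }
  ]
}
```

The `sts:ExternalId` condition ensures the role can only be assumed by a caller that supplies the matching external ID, which guards against the confused-deputy problem when a third-party service accesses your bucket.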