Set up Reltio Entity Resolution in Databricks

Learn how to request access to the solution, import the notebooks into your Databricks workspace, and prepare your environment for running Reltio Entity Resolution.

Before you begin, make sure you have:
  • Access to a Databricks workspace with Unity Catalog enabled
  • Permission to run notebooks and write to output tables in your workspace
  • Delta Tables containing Individual entity records
To complete the setup, follow these steps:
  1. Request access to the solution in Databricks Marketplace
    In the Reltio Entity Resolution in Databricks listing, click Get Access. Authenticate with your Databricks workspace and complete the request form.

    We'll review your request and contact you to proceed.

  2. Import the notebooks into your Databricks workspace
    Once your request is approved, we'll give you access to the repository containing the ready-to-run notebooks. Use the Databricks Workspace UI to clone the repository and import the notebooks into your own workspace.
  3. Attach or create a cluster with Unity Catalog access
    Ensure that your notebook is attached to a compute cluster that:
    • Runs Databricks Runtime 16.4 or later
    • Is configured in Shared or Single-user mode (not legacy High Concurrency)
    • Has access to Unity Catalog resources
    We recommend using a GPU cluster (such as A10G) for better performance with embedding-based matching models.
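A minimal sketch of how the runtime requirement in this step could be checked from a notebook cell. It assumes the cluster exposes its runtime version in the DATABRICKS_RUNTIME_VERSION environment variable; the helper names are hypothetical and are not part of the Reltio notebooks.

```python
import os
import re

MIN_RUNTIME = (16, 4)  # minimum Databricks Runtime named in this step


def parse_runtime_version(version: str) -> tuple[int, int]:
    """Extract (major, minor) from a string such as '16.4' or '16.4.x-gpu-ml-scala2.12'."""
    match = re.match(r"(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"Unrecognized runtime version: {version!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(version: str, minimum: tuple[int, int] = MIN_RUNTIME) -> bool:
    """Return True when the runtime is at or above the minimum version."""
    return parse_runtime_version(version) >= minimum


# Assumption: this variable is set on Databricks clusters and absent elsewhere.
runtime = os.environ.get("DATABRICKS_RUNTIME_VERSION")
if runtime is None:
    print("DATABRICKS_RUNTIME_VERSION is not set; skipping the runtime check.")
else:
    try:
        ok = meets_minimum(runtime)
    except ValueError as err:
        print(f"Could not check the runtime: {err}")
    else:
        print(f"Runtime {runtime}: {'OK' if ok else 'older than 16.4'}")
```

This check covers only the runtime version. The access mode and Unity Catalog access still need to be confirmed in the cluster configuration.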
  4. Verify Unity Catalog permissions
    Confirm that your user or service principal has the following privileges:
    Object Type    Required Privilege
    Catalog        USE CATALOG
    Schema         USE SCHEMA
    Delta Table    SELECT
    If your user, service principal, or cluster doesn't have this access, contact your workspace administrator.
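As an illustration, a workspace administrator could grant the three privileges listed in this step with Unity Catalog GRANT statements. The sketch below only builds the SQL text. The catalog, schema, table, and principal names are hypothetical placeholders, and the helper functions are not part of the Reltio notebooks.

```python
def quote(identifier: str) -> str:
    """Backtick-quote a SQL identifier, escaping any embedded backticks."""
    return "`" + identifier.replace("`", "``") + "`"


def build_grants(catalog: str, schema: str, table: str, principal: str) -> list[str]:
    """Build one GRANT statement per privilege listed in this step."""
    c, s, t, p = (quote(name) for name in (catalog, schema, table, principal))
    return [
        f"GRANT USE CATALOG ON CATALOG {c} TO {p}",
        f"GRANT USE SCHEMA ON SCHEMA {c}.{s} TO {p}",
        f"GRANT SELECT ON TABLE {c}.{s}.{t} TO {p}",
    ]


# Hypothetical names; replace them with your own catalog, schema, table, and principal.
for statement in build_grants("main", "crm", "individuals", "analyst@example.com"):
    print(statement)
```

In a Databricks notebook, each statement could then be passed to `spark.sql(statement)` by someone who is allowed to grant these privileges.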
  5. Run the Unity Catalog validation cell
    In the notebook, locate and run the cell labeled Step 1: Cluster Configuration. This cell checks whether your cluster is correctly configured. If the check succeeds, you'll see the confirmation message "Cluster Check Successful".
  6. Create the catalog containing machine learning models shared by us via Delta Sharing
    In your Databricks workspace, navigate to Catalog > Delta Sharing > Shared with me and look for the provider reltio. Select it to view the data asset reltio_entity_resolution_models that is shared with you, then create a new catalog from this asset. You will need this catalog when you run Step 3: Machine Learning Models Catalog Selection in the notebook.
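If you prefer SQL to the Catalog UI, a catalog can also be created from a Delta Sharing share with CREATE CATALOG ... USING SHARE. The sketch below builds that statement. It assumes the provider is named reltio and that the data asset reltio_entity_resolution_models is the share name, as shown in step 6. The catalog name reltio_er_models is a hypothetical example.

```python
def quote(identifier: str) -> str:
    """Backtick-quote a SQL identifier, escaping any embedded backticks."""
    return "`" + identifier.replace("`", "``") + "`"


def create_catalog_sql(
    catalog: str,
    provider: str = "reltio",  # provider name as shown under "Shared with me"
    share: str = "reltio_entity_resolution_models",  # assumed to be the share name
) -> str:
    """Build a statement that creates a catalog backed by a Delta Sharing share."""
    return (
        f"CREATE CATALOG IF NOT EXISTS {quote(catalog)} "
        f"USING SHARE {quote(provider)}.{quote(share)}"
    )


# Hypothetical catalog name; choose one that fits your naming conventions.
print(create_catalog_sql("reltio_er_models"))
```

In a notebook, the resulting string could be run with `spark.sql(...)`, provided your account has the privileges needed to create catalogs from shares. Whichever catalog name you choose is the one to select in Step 3 of the notebook.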
Once setup is complete, your Databricks environment will be ready to run the entity resolution notebook using Reltio’s ML-based resolution pipeline. See Configure and run entity resolution in Databricks.