Install the required library on the cluster
Learn how to install the library shared by Reltio on your Databricks compute cluster so that you can run the entity resolution pipeline.
After you accept and mount the shared assets, install the shared library on your compute cluster to enable pipeline execution.
Prerequisites
Before you begin, ensure that you have completed the step Accept and mount the shared assets.
Steps to install the required library on the cluster
To install the required library on your cluster, follow these steps:
- In the Databricks workspace, click Compute in the left sidebar.
- Select the cluster that you prepared for entity resolution in the step Accept and mount the shared assets. This opens the cluster details page.
- On the cluster details page, go to the Libraries tab and click Install new.
- In the Install library dialog:
  - Select Volumes as the source.
  - Browse to the .whl file in the shared catalog that you mounted earlier in the step Accept and mount the shared assets, for example: <delta_shared_catalog_name> → backend → wheel → <file>.whl.
- Click Install and wait for the library installation to complete.
- Restart the cluster to ensure that the library is loaded into the environment.
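If you prefer to script these steps instead of using the UI, the following is a minimal sketch that uses the Databricks REST API (the Libraries API install endpoint and the Clusters API restart endpoint). It is an illustration under stated assumptions, not the procedure documented by Reltio: the helper names are hypothetical, the workspace URL and token are supplied by you, and it assumes that the example location <delta_shared_catalog_name> → backend → wheel corresponds to a catalog, schema, and volume, so that the file is reachable under a /Volumes/... path.

```python
import json
import urllib.request


def volume_wheel_path(catalog: str, schema: str, volume: str, filename: str) -> str:
    """Build the /Volumes/... path of a wheel stored in a Unity Catalog volume.

    Assumption: the shared wheel lives at <catalog>/<schema>/<volume>/<file>.whl.
    """
    return f"/Volumes/{catalog}/{schema}/{volume}/{filename}"


def build_install_payload(cluster_id: str, wheel_path: str) -> dict:
    """Request body for POST /api/2.0/libraries/install."""
    return {"cluster_id": cluster_id, "libraries": [{"whl": wheel_path}]}


def post_json(host: str, token: str, endpoint: str, payload: dict) -> dict:
    """Send an authenticated JSON POST to the workspace and return the parsed reply."""
    request = urllib.request.Request(
        url=f"{host.rstrip('/')}{endpoint}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        body = response.read().decode("utf-8")
    # Both endpoints used here normally answer with an empty JSON object.
    return json.loads(body) if body.strip() else {}


def install_wheel(host: str, token: str, cluster_id: str, wheel_path: str) -> dict:
    """Ask Databricks to install the wheel on the cluster (asynchronous)."""
    payload = build_install_payload(cluster_id, wheel_path)
    return post_json(host, token, "/api/2.0/libraries/install", payload)


def restart_cluster(host: str, token: str, cluster_id: str) -> dict:
    """Restart a running cluster so that the library is loaded into the environment."""
    return post_json(host, token, "/api/2.0/clusters/restart", {"cluster_id": cluster_id})
```

The install request only schedules the installation, so a script would typically wait until the library reports as installed before calling `restart_cluster`, mirroring the order of the UI steps.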
Verification
Verify that the library is installed successfully:
- The library appears in the Libraries tab of the cluster.
- The status of the library shows as Installed.
- The cluster restarts successfully without errors.
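The same check can be sketched programmatically with the Libraries API cluster-status endpoint. This is a hedged example rather than an official verification procedure: the function names are hypothetical, and the code assumes the response lists each library under `library_statuses` with a `library` object and a `status` string such as `INSTALLED`.

```python
import json
import urllib.parse
import urllib.request


def fetch_library_statuses(host: str, token: str, cluster_id: str) -> dict:
    """Call GET /api/2.0/libraries/cluster-status for one cluster."""
    query = urllib.parse.urlencode({"cluster_id": cluster_id})
    request = urllib.request.Request(
        url=f"{host.rstrip('/')}/api/2.0/libraries/cluster-status?{query}",
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def wheel_status(status_response: dict, wheel_path: str):
    """Return the reported status of the wheel, or None if it is not listed."""
    for entry in status_response.get("library_statuses", []):
        if entry.get("library", {}).get("whl") == wheel_path:
            return entry.get("status")
    return None
```

A caller would pass the result of `fetch_library_statuses` to `wheel_status` and treat any value other than `INSTALLED` as a reason to review the Libraries tab of the cluster.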
Result
The shared library is installed on the compute cluster. You can now use this cluster to run the entity resolution pipeline notebook.
Your Databricks workspace is now ready for Reltio Embedded Entity Resolution. As the next step, Run Reltio Embedded Entity Resolution in Databricks using the configured cluster and notebook.