Clean up the Reltio Data Pipeline for Databricks

Learn how to clean up the Databricks pipeline and associated data.

Cleaning up a pipeline frees storage and can improve system performance. These steps permanently delete data, so back up anything essential before you begin.
  1. Drop the catalog.
    Note: If you are performing a cleanup as part of tenant truncation, this step is optional.
    To permanently remove data from Databricks, drop the Databricks catalog with the following command, replacing catalog_name with the name of your catalog:
    
    DROP CATALOG catalog_name CASCADE
            
  2. Delete the Databricks Delta Live Tables pipeline.
    In your Databricks workspace, go to Workflows and open the Delta Live Tables tab. Stop the pipeline if it is running, then delete it.
  3. Clean the Staging and Target containers or buckets.
    In your cloud storage service, remove all contents of the Staging and Target containers or buckets that are configured for the Databricks data pipeline.
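
If you script the cleanup, the catalog drop in step 1 is the easiest part to get wrong, because the statement is irreversible. The following is a minimal Python sketch, not part of the Reltio or Databricks tooling, that builds the statement from a catalog name and rejects unexpected names before anything is run. The function name build_drop_catalog_sql, the placeholder catalog reltio_demo, and the conservative name pattern are assumptions made for illustration. The optional IF EXISTS clause is standard Databricks SQL and lets the statement succeed when the catalog is already gone.

```python
import re

# Conservative pattern for an unquoted identifier. This is an assumption for
# the sketch, not the full Databricks identifier grammar.
_SAFE_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_drop_catalog_sql(catalog_name: str, if_exists: bool = True) -> str:
    """Return a DROP CATALOG ... CASCADE statement for the given catalog."""
    # Refuse anything that is not a plain identifier, so a typo or pasted
    # text cannot end up inside a destructive statement.
    if not _SAFE_NAME.match(catalog_name):
        raise ValueError(f"Unexpected catalog name: {catalog_name!r}")
    guard = "IF EXISTS " if if_exists else ""
    return f"DROP CATALOG {guard}{catalog_name} CASCADE"


if __name__ == "__main__":
    # "reltio_demo" is a placeholder; substitute your own catalog name.
    print(build_drop_catalog_sql("reltio_demo"))
```

The sketch only produces the statement text. Running it against Databricks, for example from a notebook or the SQL editor, is left to whichever client you already use.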