Run your first data profiling job
Learn how to use the Profiler agent in AgentFlow to analyze a CSV file stored in cloud or remote storage.
Prerequisites
- The Profiler agent is enabled in your tenant.
- Cloud or remote storage access is configured to allow Reltio to retrieve your CSV file (for example, Amazon S3, Google Cloud Storage, Azure Blob Storage, or SFTP).
- You have the required access details for your storage provider. For example:
  - AWS: Role ARN, External ID, and region
  - Azure: Connection string or SAS token
  - GCP: Service account credentials
  - SFTP: Host, port, username, and authentication details
- You have the full path to the CSV file you want to profile.
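Before you start, it can help to confirm that you have every access detail for your provider. The following Python sketch checks a set of details against the list above. The provider keys and field names (`role_arn`, `external_id`, `sas_token`, and so on) are hypothetical labels chosen for illustration; they are not AgentFlow parameter names.

```python
# Hypothetical field names that mirror the prerequisite list above.
# They are illustrative labels, not an AgentFlow API.
# Each inner set is one requirement; any one name in the set satisfies it.
REQUIRED_DETAILS = {
    "aws": [{"role_arn"}, {"external_id"}, {"region"}],
    "azure": [{"connection_string", "sas_token"}],  # either one is enough
    "gcp": [{"service_account_credentials"}],
    "sftp": [{"host"}, {"port"}, {"username"}, {"authentication"}],
}


def missing_details(provider, details):
    """Return the requirements that have no non-empty value in `details`."""
    missing = []
    for group in REQUIRED_DETAILS[provider.lower()]:
        if not any(details.get(name) for name in group):
            missing.append(" or ".join(sorted(group)))
    return missing


# Placeholder values only; the External ID is deliberately left out.
print(missing_details("aws", {"role_arn": "<role-arn>", "region": "<region>"}))
# ['external_id']
```

An empty list means that every listed detail for that provider is present.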
Start profiling
- Log in to AgentFlow.
- Select Profiler.
- Provide the CSV file path and the corresponding access details required for your storage provider (for example, role ARN, external ID, and region for AWS S3).
  Tip: Set default credentials for the Profiler agent to avoid entering them each time.
  - Go to
  - Select Profiler from the drop-down list.
  - Under What personal preferences should AgentFlow consider in its response?, enter your default credentials.
  Figure 1. Settings dialog in AgentFlow with the Profiler agent selected
- Confirm schema.
- Run the analysis.
Result
A profiling workspace is created and column-level quality metrics are displayed.
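To give a sense of what column-level quality metrics can look like, the following Python sketch computes two simple ones, completeness and distinct count, for a small in-memory CSV sample. This is only an illustration under assumed metric definitions; this page does not list which metrics the Profiler agent reports or how it calculates them, and the function name `profile_columns` is hypothetical.

```python
import csv
import io


def profile_columns(csv_text):
    """Compute illustrative per-column metrics for CSV text with a header row.

    completeness: share of rows with a non-empty value in the column
    distinct: number of different non-empty values in the column
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    stats = {name: {"rows": 0, "empty": 0, "values": set()}
             for name in (reader.fieldnames or [])}
    for row in reader:
        for name, col in stats.items():
            # Missing trailing fields come back as None; treat them as empty.
            value = (row.get(name) or "").strip()
            col["rows"] += 1
            if value == "":
                col["empty"] += 1
            else:
                col["values"].add(value)
    return {
        name: {
            "completeness": (col["rows"] - col["empty"]) / col["rows"] if col["rows"] else 0.0,
            "distinct": len(col["values"]),
        }
        for name, col in stats.items()
    }


# Made-up sample data for illustration.
sample = "id,email\n1,a@example.com\n2,\n3,a@example.com\n"
print(profile_columns(sample))
```

In this sample, the `id` column is fully populated with three distinct values, while the `email` column has one empty row and a single distinct value.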
Validation steps
- Confirm that the job status changes to COMPLETED.
- Review the column-level quality metrics.
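If you track job status outside the AgentFlow interface, the first check can be sketched as a polling loop. This page does not describe a programmatic way to read job status, so `get_status` below is a hypothetical function that you would supply yourself; only the `COMPLETED` value comes from this page, and the timeout and interval defaults are arbitrary.

```python
import time


def wait_for_completed(get_status, timeout_s=600.0, interval_s=5.0, sleep=time.sleep):
    """Poll `get_status()` until it returns "COMPLETED" or the timeout elapses.

    `get_status` is a hypothetical caller-supplied function; it is not part
    of any documented AgentFlow API. Returns True on completion and False
    if the time budget runs out first.
    """
    waited = 0.0
    while True:
        if get_status() == "COMPLETED":
            return True
        if waited >= timeout_s:
            return False
        sleep(interval_s)
        waited += interval_s
```

The `sleep` parameter is injectable so that the loop can be exercised without real delays. The elapsed time is estimated from the sleep intervals alone, which keeps the sketch simple but ignores the time spent inside `get_status`.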
Avoid entering credentials each time
Watch the following video to learn how to save default credentials for the Profiler agent in AgentFlow. After you save these credentials, the agent reuses them for future profiling jobs so that you do not need to enter the same access details each time.