Run your first data profiling job
Learn how to use the Profiler agent in AgentFlow to analyze a CSV file stored in cloud or remote storage, such as Amazon S3.
Prerequisites
- The Profiler agent is enabled in your tenant.
- Cloud or remote storage access is configured to allow Reltio to retrieve your CSV file (for example, Amazon S3, Google Cloud Storage, Azure Blob Storage, or SFTP).
- You have the required access details for your storage provider. For example:
  - AWS: Role ARN, External ID, and region
  - Azure: Connection string or SAS token
  - GCP: Service account credentials
  - SFTP: Host, port, username, and authentication details
- You have the full path to the CSV file you want to profile.
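Before you open AgentFlow, it can help to confirm that you have collected every access detail for your provider. The following is a minimal sketch of such a checklist, not part of AgentFlow or any Reltio API. The dictionary keys, field names, and the `missing_details` helper are hypothetical labels derived from the list above, and the example values are placeholders.

```python
# Required access details per storage provider, taken from the prerequisites above.
# Keys and field names are illustrative labels, not Reltio parameter names.
REQUIRED_DETAILS = {
    "aws": ["role_arn", "external_id", "region"],
    "azure": ["connection_string_or_sas_token"],
    "gcp": ["service_account_credentials"],
    "sftp": ["host", "port", "username", "authentication"],
}


def missing_details(provider, details):
    """Return the required fields that are absent or empty for the given provider."""
    required = REQUIRED_DETAILS[provider.lower()]
    return [name for name in required if not details.get(name)]


# Placeholder values only; these are not real credentials.
example = {
    "role_arn": "arn:aws:iam::123456789012:role/example-role",
    "region": "us-east-1",
}
print(missing_details("aws", example))  # ['external_id']
```

An empty list means that every detail named in the prerequisites is present for that provider.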
Start profiling
- Open AgentFlow.
- Select Profiler.
- Provide the CSV file path and the corresponding access details required for your storage provider (for example, role ARN, external ID, and region for AWS S3).
  Tip: Set default credentials for the Profiler agent to avoid entering them each time.
  - Go to
  - Select Profiler from the drop-down list.
  - Under What personal preferences should AgentFlow consider in its response?, enter your default credentials.
  Figure 1. Settings dialog in AgentFlow with the Profiler agent selected. The Agent Instructions section includes a field for entering default AWS credentials, such as role ARN, external ID, and region.
- Confirm schema.
- Run the analysis.
Result
A profiling workspace is created and column-level quality metrics are displayed.
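To give a sense of what column-level quality metrics can look like, the sketch below computes two common ones, fill rate and distinct-value count, for a small in-memory CSV. This is an illustration only: it assumes nothing about which metrics the Profiler agent reports or how it calculates them, and the `column_metrics` function and sample data are hypothetical.

```python
import csv
import io


def column_metrics(csv_text):
    """Compute fill rate and distinct-value count for each column of a CSV string."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    metrics = {}
    if not rows:
        return metrics
    for column in reader.fieldnames:
        # Treat missing or whitespace-only cells as empty.
        values = [(row[column] or "").strip() for row in rows]
        non_empty = [value for value in values if value]
        metrics[column] = {
            "fill_rate": len(non_empty) / len(rows),
            "distinct": len(set(non_empty)),
        }
    return metrics


# Hypothetical sample: three rows, one of them with an empty email.
sample = "id,email\n1,a@example.com\n2,\n3,a@example.com\n"
for name, stats in column_metrics(sample).items():
    print(name, stats)
```

In this sample, the `id` column is fully populated with three distinct values, while the `email` column is two-thirds populated with one distinct value.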
Validation steps
- Confirm that the job status changes to COMPLETED.
- Review the column-level quality metrics in the profiling workspace.