Store export results
Learn how to store the results from export APIs.
Configured storage method
The Export Service must have at least one storage method (Amazon S3 storage, Google Cloud Storage (GCS), or Azure Blob Storage) configured. The storage type for the Reltio bucket is determined by the storage type defined in the tenant configuration.
{
...
"exportConfig": {
"storageType": "s3"
},
...
}
Your storageType value can be S3, GCS, or Azure. If the storageType property isn't specified, S3 is used as the default storage type. Note that this configuration applies to the Reltio bucket only.
- S3_SERVICE_ERROR(1102)
- GCS_SERVICE_ERROR(1112)
- AZURE_STORAGE_SERVICE_ERROR(1122)
- The reltio bucket 'SPECIFIED_BUCKET' doesn't exist
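The defaulting rule for storageType described above can be sketched as follows. This is a minimal illustration, not Reltio code: the helper name `resolve_storage_type` is hypothetical, and the case-insensitive comparison is an assumption made because the sample configuration uses `s3` while the text lists `S3`.

```python
# Hypothetical helper illustrating the documented default; not part of any Reltio SDK.
SUPPORTED_STORAGE_TYPES = {"s3": "S3", "gcs": "GCS", "azure": "Azure"}


def resolve_storage_type(tenant_config: dict) -> str:
    """Return the storage type for the Reltio bucket, defaulting to S3."""
    export_config = tenant_config.get("exportConfig", {})
    raw = export_config.get("storageType")
    if raw is None:
        return "S3"  # documented default when storageType isn't specified
    # Assumption: values are matched case-insensitively.
    try:
        return SUPPORTED_STORAGE_TYPES[str(raw).lower()]
    except KeyError:
        raise ValueError(f"Unsupported storageType: {raw!r}") from None


print(resolve_storage_type({"exportConfig": {"storageType": "s3"}}))
print(resolve_storage_type({}))
```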
Paths to store the export result
- Auto-generated path – The Export Service automatically generates the path from contextual information: data type, username, current date and time, and tenant name. For example, /entities/john.doe/2017/06-Oct-2017/tenant_18-17_entities_3.zip. You can use this path to export from the Hub.
- User-defined path – Use this option to store the export results in a custom path. The parent folder of the path must be defined in the tenant configuration. Specify the custom path in the export task parameters. Depending on the storage, the parameter name is s3Path, gcsPath, or azureStoragePath.
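As a small illustration of how the path parameter name depends on the storage, the sketch below maps each storage type to its parameter. The parameter names come from the text; the helper `custom_path_parameter` and the idea of building the parameters as a dictionary are assumptions for illustration only.

```python
# Parameter names come from the documentation; the mapping helper itself is hypothetical.
PATH_PARAMETER_BY_STORAGE = {
    "S3": "s3Path",
    "GCS": "gcsPath",
    "Azure": "azureStoragePath",
}


def custom_path_parameter(storage_type: str, path: str) -> dict:
    """Return the single export task parameter that carries a user-defined path."""
    if storage_type not in PATH_PARAMETER_BY_STORAGE:
        raise ValueError(f"Unknown storage type: {storage_type!r}")
    return {PATH_PARAMETER_BY_STORAGE[storage_type]: path}


print(custom_path_parameter("GCS", "/entities/john.doe/2020.07.01"))
```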
Preparing the storage bucket
- To export data to a custom S3 destination, prepare the S3 Bucket and get the access keys (AWS access key ID and AWS secret key) of the AWS user who owns or has access to the bucket.
- To export data to a custom GCS destination, prepare the GCS Bucket and get GCP credentials of the GCP user who owns or has access to the bucket.
- To export data to a custom Azure destination, prepare an Azure container and obtain the Azure account name and account key.
Specifying the storage destinations
Storage | HTTP Header | Query Parameter | Description | Required |
---|---|---|---|---|
S3 | s3Bucket | s3Bucket | Indicates the S3 bucket. | No |
S3 | s3Region | s3Region | Indicates the S3 region. Specify this parameter to bypass region autodetection. | No. Note: If a job failure occurs and an S3 region is specified using this parameter, ensure that post-processing is disabled in the export job definition. |
S3 | s3Path | s3Path | Indicates the S3 path. Specifying the leading slash is optional. | Yes, when the s3Bucket parameter is specified. |
S3 | s3Sse | s3Sse | Indicates the server-side encryption algorithm (AES256 or aws:kms). The parameter is ignored if the s3Bucket parameter isn't specified (export to Reltio bucket). | No |
S3 | s3SseKmsKey | s3SseKmsKey | Indicates the custom KMS key for server-side encryption. | No |
S3 | awsAccessKey | awsAccessKey | Indicates the AWS access key ID. The account must have Write permissions for the S3 destination. The AWS access key ID must be URL-encoded if it's passed as a query parameter. | Yes, when the s3Bucket parameter is specified and the roleArn parameter isn't specified. |
S3 | awsSecretKey | awsSecretKey | Indicates the AWS secret key. It must be URL-encoded if it's passed as a query parameter. | Yes, when the s3Bucket parameter is specified and the roleArn parameter isn't specified. |
S3 | roleArn | roleArn | Indicates the AWS role for cross-account secured export. | Yes, if the s3Bucket parameter is specified and the awsAccessKey or awsSecretKey parameters aren't specified. |
S3 | externalId | externalId | Indicates the customer's unique identifier for granular control over role access. For more information, see How to use an external ID? | No |
GCS | gcsBucket | gcsBucket | Indicates the GCS bucket. | No |
GCS | gcsPath | gcsPath | Indicates the GCS path. Specifying the leading slash is optional. | Yes, when the gcsBucket parameter is specified. |
GCS | gcpCredentials | gcpCredentials | Indicates the GCP credentials JSON. The account must have Write permissions on the GCS destination. It must be URL-encoded if it's passed as a query parameter. | Yes, when the gcsBucket parameter is specified. |
Azure | azureStorageContainer | azureStorageContainer | Indicates the Azure container. | No |
Azure | azureStoragePath | azureStoragePath | Indicates the Azure path. Specifying the leading slash is optional. | Yes, when the azureStorageContainer parameter is specified. |
Azure | azureStorageAccountName | azureStorageAccountName | Indicates the Azure account name. The account must have Write permissions on the Azure destination. | Yes, when the azureStorageContainer parameter is specified. |
Azure | azureStorageAccountKey | azureStorageAccountKey | Indicates the Azure account key. It must be URL-encoded if it's passed as a query parameter. | Yes, when the azureStorageContainer parameter is specified. |
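The conditional rules in the Required column can be checked on the client side before a request is sent. The sketch below is one possible reading of those rules, not the Export Service's own validation: the function name `missing_parameters` is hypothetical, and the S3 credential rule is interpreted as "either both access keys or roleArn".

```python
# Hypothetical client-side check of the "Required" rules in the table above.
# The Export Service performs its own validation; this only mirrors the documented rules.

def missing_parameters(params: dict) -> list:
    """Return a list of problems found in a custom-destination parameter set."""
    problems = []

    def require(trigger, names):
        for name in names:
            if not params.get(name):
                problems.append(f"{name} is required when {trigger} is specified")

    if params.get("s3Bucket"):
        require("s3Bucket", ["s3Path"])
        has_keys = bool(params.get("awsAccessKey")) and bool(params.get("awsSecretKey"))
        # Interpretation: either both access keys or roleArn must be present.
        if not has_keys and not params.get("roleArn"):
            problems.append("s3Bucket requires awsAccessKey and awsSecretKey, or roleArn")
        sse = params.get("s3Sse")
        if sse is not None and sse not in ("AES256", "aws:kms"):
            problems.append("s3Sse must be AES256 or aws:kms")

    if params.get("gcsBucket"):
        require("gcsBucket", ["gcsPath", "gcpCredentials"])

    if params.get("azureStorageContainer"):
        require(
            "azureStorageContainer",
            ["azureStoragePath", "azureStorageAccountName", "azureStorageAccountKey"],
        )

    return problems


# A bucket without a path or credentials yields two problems.
for problem in missing_parameters({"s3Bucket": "reltio-data-exports"}):
    print(problem)
```

An empty parameter set produces no problems, which corresponds to exporting to the Reltio bucket.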
Example 1 (HTTP headers) S3
POST /export/{tenant}/entities
Authorization: Bearer {token}
Content-Type: application/json
awsAccessKey: ...
awsSecretKey: ...
s3Bucket: reltio-data-exports
s3Path: /entities/john.doe/2020.07.01
s3Region: cn-north-1
{}
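A header-based request like Example 1 could be assembled in Python roughly as shown below. This is a sketch under assumptions: `BASE_URL` is a placeholder host, the function name is hypothetical, and the request is only constructed, not sent. Note that `urllib.request.Request` normalizes the capitalization of header names, which is harmless because HTTP header names are case-insensitive.

```python
import json
import urllib.request

# Assumption: placeholder host; replace with the Export Service URL for your environment.
BASE_URL = "https://example.invalid"


def build_s3_export_request(tenant, token, access_key, secret_key, bucket, path, region=None):
    """Build (but do not send) a POST request that passes S3 settings as HTTP headers."""
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
        "awsAccessKey": access_key,
        "awsSecretKey": secret_key,
        "s3Bucket": bucket,
        "s3Path": path,
    }
    if region:
        headers["s3Region"] = region  # optional: bypasses region autodetection
    return urllib.request.Request(
        url=f"{BASE_URL}/export/{tenant}/entities",
        data=json.dumps({}).encode("utf-8"),  # empty JSON body, as in the example
        headers=headers,
        method="POST",
    )


request = build_s3_export_request(
    "myTenant", "token-value", "access-key-id", "secret-key",
    "reltio-data-exports", "/entities/john.doe/2020.07.01", region="cn-north-1",
)
print(request.get_method(), request.full_url)
```

To send the request, pass it to `urllib.request.urlopen` or translate the same headers to the HTTP client you already use.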
Example 2 (query parameters) S3
POST /export/{tenant}/entities?awsAccessKey=...&awsSecretKey=...&s3Bucket=reltio-data-exports&s3Path=/entities/john.doe/2020.07.01
Authorization: Bearer {token}
Content-Type: application/json
{}
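Because credentials passed as query parameters must be URL-encoded, it can help to let a library do the escaping. The sketch below uses Python's standard `urllib.parse.urlencode`; the host, tenant ID, and credential values are made-up placeholders. It also encodes the path value, which should be equivalent after standard URL decoding.

```python
from urllib.parse import urlencode


def export_url_with_query(base_url, tenant, params):
    """Append storage parameters to the export URL, percent-encoding each value."""
    # urlencode applies quote_plus to every value, so '/', '+' and '=' are escaped.
    return f"{base_url}/export/{tenant}/entities?{urlencode(params)}"


url = export_url_with_query(
    "https://example.invalid",  # placeholder host
    "myTenant",                 # placeholder tenant ID
    {
        "awsAccessKey": "AKIAEXAMPLE",
        "awsSecretKey": "abc/def+ghi=",  # made-up secret with characters that need escaping
        "s3Bucket": "reltio-data-exports",
        "s3Path": "/entities/john.doe/2020.07.01",
    },
)
print(url)
```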
Example 3 (HTTP headers) S3
POST /export/{tenant}/entities
Authorization: Bearer {token}
Content-Type: application/json
awsAccessKey: ...
awsSecretKey: ...
s3Bucket: reltio-data-exports
s3Path: /entities/john.doe/2020.07.01
s3Sse: aws:kms
s3SseKmsKey: arn:aws:kms:us-east-1:0:key/1
Example 4 (HTTP headers) GCS
POST /export/{tenant}/entities
Authorization: Bearer {token}
Content-Type: application/json
gcpCredentials: ...
gcsBucket: reltio-data-exports
gcsPath: /entities/john.doe/2020.07.01
{}
Example 5 (query parameters) GCS
POST /export/{tenant}/entities?gcpCredentials=...&gcsBucket=reltio-data-exports&gcsPath=/entities/john.doe/2020.07.01
Authorization: Bearer {token}
Content-Type: application/json
Example 6 (HTTP headers) Azure
POST /export/{tenant}/entities
Authorization: Bearer {token}
Content-Type: application/json
azureStorageAccountName: ...
azureStorageAccountKey: ...
azureStorageContainer: reltio-data-exports
azureStoragePath: /entities/john.doe/2020.07.01
{}
Example 7 (query parameters) Azure
POST /export/{tenant}/entities?azureStorageAccountName=...&azureStorageAccountKey=...&azureStorageContainer=reltio-data-exports&azureStoragePath=/entities/john.doe/2020.07.01
Authorization: Bearer {token}
Content-Type: application/json
Storage permissions
When exporting to a customer bucket with a specified path, provide credentials that have Read-Write permissions for this path in the bucket. Spark Export imposes an additional requirement: the credentials must have Read permission for every existing subdirectory of the specified path and Read-Write permissions for each subdirectory that must be created.
s3:PutObject
s3:ListBucket
s3:DeleteObject
s3:GetObject
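For an S3 destination, the four actions listed above could be granted with an IAM policy along the following lines. This is an illustrative sketch, not a policy supplied by Reltio: the helper name is hypothetical, and scoping the object-level actions to the export path is an assumption. Spark Export may need broader Read access to parent subdirectories, as noted above.

```python
import json


def export_bucket_policy(bucket: str, path: str) -> dict:
    """Build an IAM policy document granting the four listed S3 actions."""
    prefix = path.strip("/")
    objects = f"arn:aws:s3:::{bucket}/{prefix}/*" if prefix else f"arn:aws:s3:::{bucket}/*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # ListBucket is a bucket-level action, so it targets the bucket ARN.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                # Object-level actions target the keys under the export path.
                "Effect": "Allow",
                "Action": ["s3:PutObject", "s3:GetObject", "s3:DeleteObject"],
                "Resource": objects,
            },
        ],
    }


print(json.dumps(export_bucket_policy("reltio-data-exports", "/entities/john.doe"), indent=2))
```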
Encryption
Encryption in the Reltio S3 bucket is an environment configuration. When you export to the customer bucket, you can pass the s3Sse parameter to enable encryption. The s3Sse parameter has two possible values: AES256 and aws:kms. For aws:kms, you can pass the s3SseKmsKey parameter with the custom KMS key.
Export relationship details
When exporting relationship details, you can opt to export the following details:
- startDate
- endDate
- startRefPinned
- startRefIgnored
- endRefPinned
- endRefIgnored
To export these details, the following parameters must be configured in your tenant configuration:
{
"exportConfig": {
"output": {
"csv": {
"includeRelationActiveness": true,
"includeRelationPinnedIgnored": true
}
}
}
}
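If the tenant configuration is handled as a JSON document, the two flags can be switched on without disturbing other export settings. The helper below is a hypothetical sketch that works on an in-memory dictionary only; how the updated configuration is applied to the tenant is outside its scope.

```python
import copy


def enable_relation_details(tenant_config: dict, activeness=True, pinned_ignored=True) -> dict:
    """Return a copy of the tenant configuration with the two CSV flags set."""
    updated = copy.deepcopy(tenant_config)  # leave the caller's dictionary untouched
    csv_options = (
        updated.setdefault("exportConfig", {})
        .setdefault("output", {})
        .setdefault("csv", {})
    )
    csv_options["includeRelationActiveness"] = activeness
    csv_options["includeRelationPinnedIgnored"] = pinned_ignored
    return updated


print(enable_relation_details({"exportConfig": {"storageType": "s3"}}))
```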
The following table gives more details about these new parameters:
Parameter | Required | Description |
---|---|---|
includeRelationActiveness | No | If set to true, the following columns are exported to the CSV file: startDate and endDate.
By default, this is set to |
includeRelationPinnedIgnored | No | If set to true, the following columns are exported to the CSV file: startRefPinned, startRefIgnored, endRefPinned, and endRefIgnored.
By default, this is set to |