Create D&B Batch Job

Learn about the operation to create D&B batch jobs.

The create D&B batch job operation creates a new job that sends data to D&B by using the Direct 2.0 or Direct+ API. To log in to Direct 2.0, set DnB-User and DnB-Password to your D&B FTP credentials. To log in to Direct+, use your Direct+ API credentials instead. You can create periodic or deferred batch jobs by using the cronExpression header parameter.

If you provide Amazon S3 parameters, the connector archives files to S3 by using the provided keys, bucket, and path. The file name is the job identifier. The optional forceS3 parameter loads files directly from Amazon S3.
Note: The connector tries to upload the request and response files to S3, but this upload is not mandatory. Any errors that occur during S3 processing are reported to Stackdriver, and the process is not interrupted.

Request

POST {DnBConnectorUri}/batch
Table 1. Parameters
Parameter | Required (Direct 2.0) | Required (Direct+) | Description

Headers
Authorization | Yes | Yes | Reltio access token in the form Bearer: <<token>>; see Authentication API.
Note:
  • A batch job is a long-running process, so a regular Reltio token is not suitable. Generate the token using the client dnb-connector.
  • Generate a fresh token for each batch job: obtain a token, revoke it, and obtain a token again so that you have a fresh token that is valid for five days.
Content-Type | Yes | Yes | Must be Content-Type: application/json.
EnvironmentUrl | Yes | Yes | Environment URL of the tenant. For example: https://dev.reltio.com
TenantId | Yes | Yes | Tenant ID; data is extracted from this tenant. Value: {{tenantID}}.
DnB-Ftp | No | N/A | D&B FTP host. For example: ftp.dnb.com
DnB-Path-Put | No | N/A | Put directory on the D&B FTP site.
Note: The default value is /puts.
DnB-Path-Get | No | N/A | Get directory on the D&B FTP site.
Note: The default value is /gets.
DnB-User | Yes | Yes | D&B FTP login.
DnB-Password | Yes | Yes | D&B FTP password.
notificationEmails | No | No | Enables email notifications for FAILED jobs. List the recipients as email addresses separated by commas (,). For example: smith.joe@company.com,john.snow@company.com
s3Bucket | No | No | S3 bucket, if you use S3. For example: s3.testbucket
Note: You must pass all S3 parameters.
s3Path | No | No | S3 path. For example: batch/aaa/bbb
Note: You must pass all S3 parameters.
awsAccessKey | No | No | AWS access key.
Note: You must pass all S3 parameters.
awsSecretKey | No | No | AWS secret key.
Note: You must pass all S3 parameters.
S3Region | No | No | S3 region. You must pass all S3 parameters or none of them. The default value is us-east-1.
mergeOrgsWithSameDUNS | No | No | Sometimes D&B returns a DUNS number for an entity when that DUNS number is already assigned to another entity. If the flag is true, the connector updates and merges the entities in the result. Otherwise, the entity is marked with an error. The default value is false.
mergeOrgsWithSameDUNSByPotentialMatches | No | No | In the event of a URI mismatch, the operation is marked as successful and the entities are presented as potential matches. To achieve this, the entities are added to the tenant with a crosswalk value of duns/target uri. The default value is false.
minConfidenceCode | No | No | The D&B Connector returns a confidence code for each processed entity in the range 1 to 10, where 10 means absolute confidence. All entities with a confidence code lower than minConfidenceCode are marked with the error Low Confidence Score. The default value is 7.
waitResponseSeconds | No | N/A | The D&B connector uploads request files to the D&B FTP site and periodically checks for responses. If there is no response after waitResponseSeconds seconds, the request files are marked as Expired. The default value is 86400.
cronExpression | No | No | The D&B connector can schedule batch jobs by using standard cron expressions. A cron expression expects time in the UTC time zone, so convert your local time to UTC before you start a job.
  • You cannot create a schedule that runs batch jobs more frequently than the minimum interval, in seconds, configured in the instance parameter cron.job.min.interval.
  • The number of scheduled jobs per tenant must not exceed the limit configured in the connector instance environment variable cron.jobs.tenant.limit.
productId | N/A | No | The D&B connector submits the D&B job with the specified product identifier. The available values are cmpelf and cmpelk. The default value is cmpelf.
productVersion | N/A | No | The D&B connector submits the D&B job with the specified product version. Use the product version only with the cmpelk product ID. The default value is v1. For more information, see D&B Direct+ Documentation.

Query
force | No | No | Generally, you can have only one batch job per tenant. Simultaneous batch jobs can interfere with each other and cause undefined behavior in the result. For testing purposes, or if you are confident that the jobs work on independent entities, you can start multiple batch jobs with force=true.
plus | No | No | If this parameter is set to true, the connector starts the batch job by using the D&B Direct+ multiprocess API. The default value is false.

Body | Yes | Yes | Context JSON for the batch job. The batch job contains three main tasks: the export task, the put files task, and the get files task. You can start from any of them by specifying the corresponding task in the body.
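
The parameters above can be assembled into a request as in the following minimal Python sketch. It only builds the request and does not send it. The function name build_batch_request, the extra_headers argument, and the example host are hypothetical, and the Authorization value uses the conventional "Bearer <token>" form, which is an assumption; adjust it if your environment expects a different form.

```python
import json
import urllib.request


def build_batch_request(connector_uri, token, environment_url, tenant_id,
                        dnb_user, dnb_password, body,
                        plus=False, force=False, extra_headers=None):
    """Build (but do not send) a POST {DnBConnectorUri}/batch request."""
    headers = {
        # Assumption: conventional bearer form for the Reltio access token.
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
        "EnvironmentUrl": environment_url,
        "TenantId": tenant_id,
        "DnB-User": dnb_user,
        "DnB-Password": dnb_password,
    }
    # Optional headers such as cronExpression or the S3 parameters.
    headers.update(extra_headers or {})

    # Append the query flags only when they differ from the default (false).
    flags = [f"{name}=true" for name, on in (("plus", plus), ("force", force)) if on]
    url = connector_uri.rstrip("/") + "/batch"
    if flags:
        url += "?" + "&".join(flags)

    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would send the request. Note that urllib normalizes the capitalization of header names, which is harmless because HTTP header names are case-insensitive.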

Response

Job identifier and the message about status or error.

Example 1: Batch job from the beginning - You have only the tenant and a filter expression that describes the entities that need enrichment.

Start a batch job from scratch with export data from the tenant. You must specify the export filter in the request.

Request

Start batch job from scratch with export data from tenant
{
    "exportTask": {
        "filter": "(startsWith(attributes.Name,'R'))"
    }
}

For parallel export in D&B Connector, please specify the following request parameters:

Start batch job from scratch with export data from a tenant by using the filter with parallel tasks
{
  "exportTask": {
    "filter": "(startsWith(attributes.Name,'T'))",
    "distributed": true,
    "taskPartsCount": 300
  }
}
Note: The D&B Connector does not verify the values of the export parameters. If taskPartsCount is negative or too large, it is passed to the API as is.
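
Because the connector does not validate these values, it can help to check them on the client before you submit the job. The following Python sketch is one way to do that. The function name build_export_task and the upper bound MAX_TASK_PARTS are hypothetical; the real limit depends on your export API and is not stated here.

```python
# Hypothetical upper bound, chosen only for illustration.
MAX_TASK_PARTS = 1000


def build_export_task(filter_expr, task_parts_count=None):
    """Return a request body for a batch job that starts with an export."""
    task = {"filter": filter_expr}
    if task_parts_count is not None:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(task_parts_count, bool) or not isinstance(task_parts_count, int):
            raise TypeError("taskPartsCount must be an integer")
        if not 1 <= task_parts_count <= MAX_TASK_PARTS:
            raise ValueError(
                f"taskPartsCount must be between 1 and {MAX_TASK_PARTS}, "
                f"got {task_parts_count}"
            )
        # A parts count implies a distributed (parallel) export.
        task["distributed"] = True
        task["taskPartsCount"] = task_parts_count
    return {"exportTask": task}
```

Calling build_export_task("(startsWith(attributes.Name,'T'))", 300) produces a body with the same fields as the parallel export request above, while a negative count raises an error before anything reaches the connector.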

Example 2: You already have an export that is started or completed and you want to use it

Note: If you have a distributed export, you must specify either all IDs or the one master ID. In either case, specify the export ID(s) as a list.

Request

Batch job based on external export Id
{
    "putFilesTask": {
        "exportTaskIds": ["41949723-470b-4d4d-9c9f-4fa35de0447d"],
        "processedEntities": 1000
    }
}

The exportTaskIds are ids of your export tasks. You can omit processedEntities if you do not require offset from the beginning of the export file.
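
As a sketch of how such a body might be built in client code, the following Python helper always emits exportTaskIds as a list, even for a single ID, and includes processedEntities only when an offset is given. The function name build_put_files_task is hypothetical.

```python
def build_put_files_task(export_task_ids, processed_entities=None):
    """Return a request body for a batch job based on existing export IDs."""
    # Accept a single ID for convenience, but always send a list.
    if isinstance(export_task_ids, str):
        export_task_ids = [export_task_ids]
    ids = list(export_task_ids)
    if not ids:
        raise ValueError("at least one export task ID is required")

    task = {"exportTaskIds": ids}
    if processed_entities is not None:
        if processed_entities < 0:
            raise ValueError("processedEntities must not be negative")
        # Offset from the beginning of the export file.
        task["processedEntities"] = processed_entities
    return {"putFilesTask": task}
```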

Example 3: You already have files on FTP/S3 and only need to upload them back to Reltio

Request

Get files task
{
    "getFilesTask": {
        "fileEntries": [
            {
                "filename": "5741031244955648.ref",               
                "status": "UPLOADED",
                "forceS3": true
            },
            {
                "filename": "5741031244955648.glb",               
                "status": "UPLOADED"
            },
            {
                "filename": "5741031244955648_1.glb",               
                "status": "UPLOADED"
            }
        ]
    }
}

These are the file entries to upload to Reltio. Use the status UPLOADED, which means that the files were uploaded to FTP and the connector must check for the responses. The first entry sets the forceS3 flag to true, which means that the connector downloads that file directly from S3.

Note: If you use forceS3, you must have all S3 header parameters filled with valid values.
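
The S3 requirements can be checked before the request is sent, as in the following Python sketch. The function name check_s3_headers is hypothetical. The sketch also assumes that S3Region may be omitted because it has a default value; if your environment requires it, add it to S3_HEADERS.

```python
# Assumption: S3Region is left out because it has a default value.
S3_HEADERS = ("s3Bucket", "s3Path", "awsAccessKey", "awsSecretKey")


def check_s3_headers(headers, body):
    """Return a list of problems; an empty list means the S3 settings are consistent."""
    missing = [name for name in S3_HEADERS if not headers.get(name)]
    problems = []

    # The S3 parameters must be passed all together or not at all.
    if missing and len(missing) < len(S3_HEADERS):
        problems.append("S3 parameters are only partly set; missing: " + ", ".join(missing))

    # Any file entry with forceS3 needs the complete set of S3 headers.
    entries = body.get("getFilesTask", {}).get("fileEntries", [])
    forced = [str(e.get("filename") or e.get("jobId")) for e in entries if e.get("forceS3")]
    if forced and missing:
        problems.append("forceS3 is set for " + ", ".join(forced) + " but the S3 headers are incomplete")
    return problems
```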

Response

{
    "jobId": 5741031244955648,
    "success": "OK",
    "message": "Scheduled"
}
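
A client might handle this response as in the following Python sketch. The function name parse_batch_response is hypothetical, and the shape of an error response is an assumption: the sketch treats any value of success other than "OK" as a failure and reports the message field.

```python
import json


def parse_batch_response(raw):
    """Return the job identifier from a create-batch-job response, or raise on failure."""
    payload = json.loads(raw)
    # Assumption: a rejected job reports a "success" value other than "OK".
    if payload.get("success") != "OK":
        raise RuntimeError(f"batch job was not accepted: {payload.get('message')}")
    return payload["jobId"]
```

The returned job identifier is worth keeping, because the connector uses it as the file name when it archives files.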

Example 4: You already have D&B multiprocess jobs and only need to upload them back to Reltio

Request - Get Files Task
{
    "getFilesTask": {
        "fileEntries": [
            {
                "jobId": "some-exists-job-id",
                "filename": "arbitrary file name",
                "status": "UPLOADED",
                "forceS3": false
            },
            {
                "jobId": "some-DnB-API-job-id",
                "status": "UPLOADED"
            }
        ]
    }
}
Response
{
    "jobId": 5741031244955648,
    "success": "OK",
    "message": "Scheduled"
}

Example 5: You need a periodic job or a job that starts in the future

Use any request from the previous examples and add the header parameter cronExpression with a valid cron expression. For example, the expression for every Monday at noon is: 0 0 12 ? * MON.
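
Because the cron expression is interpreted in UTC, a local schedule has to be converted first, and the day of the week can shift during the conversion. The following Python sketch builds a weekly expression from a local time. The function name weekly_cron_utc is hypothetical, and the field order (seconds, minutes, hours, day of month, month, day of week) is inferred from the example expression above. The conversion uses a reference week in January 2024, so for a time zone with daylight saving time the result is only valid while that offset applies.

```python
from datetime import datetime, timedelta, timezone

DAYS = ("MON", "TUE", "WED", "THU", "FRI", "SAT", "SUN")


def weekly_cron_utc(day, hour, minute, tz):
    """Return a weekly cron expression in UTC for a local day and time."""
    # 2024-01-01 was a Monday, so DAYS.index(day) selects the right date.
    local = datetime(2024, 1, 1 + DAYS.index(day), hour, minute, tzinfo=tz)
    utc = local.astimezone(timezone.utc)
    # weekday() returns 0 for Monday, which matches the order of DAYS.
    return f"0 {utc.minute} {utc.hour} ? * {DAYS[utc.weekday()]}"


# Example: Monday at noon in a zone that is five hours behind UTC.
expression = weekly_cron_utc("MON", 12, 0, timezone(timedelta(hours=-5)))
```

The resulting string is the value to pass in the cronExpression header.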