Create one entity from multiple records
Learn about creating a single entity by merging multiple records that have the same crosswalk in the Data Loader.
The Data Loader, by default, creates a single entity by consolidating multiple records that have the same crosswalk.
The example below shows some records to be uploaded:
FirstName | Email | Birthdate | Crosswalk
---|---|---|---
John | john@gmail.com | 2/10/1980 | ABC
Johnny | john@gmail.com | | ABC
Floyd | floyd@gmail.com | 2/03/1982 | XYZ
In the example above, the first two records share the same crosswalk (ABC), so they are merged into a single entity, as shown below:
FirstName | Email | Birthdate | Crosswalk
---|---|---|---
John, Johnny | john@gmail.com | 2/10/1980 | ABC
Floyd | floyd@gmail.com | 2/03/1982 | XYZ
Because the FirstName attribute has two distinct values across the merged records, both values, John and Johnny, are kept in the merged entity. As a result, FirstName becomes a multi-value attribute.
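The grouping behavior described above can be illustrated with a minimal Python sketch. It is not the Data Loader's implementation: the `consolidate` function is hypothetical, the `Email` column name is an assumption (the sample rows contain email addresses), and the sketch simply collects the distinct non-empty values of each attribute for every crosswalk.

```python
def consolidate(records, key="Crosswalk"):
    """Group rows by crosswalk and keep the distinct non-empty values per attribute.

    Hypothetical helper for illustration only; not part of the Data Loader.
    """
    entities = {}
    for row in records:
        entity = entities.setdefault(row[key], {})
        for attr, value in row.items():
            if attr == key or not value:
                continue  # skip the crosswalk column and empty cells
            values = entity.setdefault(attr, [])
            if value not in values:
                values.append(value)  # a second distinct value makes the attribute multi-valued
    return entities


# Sample rows from the table above ("Email" is an assumed column name).
rows = [
    {"FirstName": "John", "Email": "john@gmail.com", "Birthdate": "2/10/1980", "Crosswalk": "ABC"},
    {"FirstName": "Johnny", "Email": "john@gmail.com", "Birthdate": "", "Crosswalk": "ABC"},
    {"FirstName": "Floyd", "Email": "floyd@gmail.com", "Birthdate": "2/03/1982", "Crosswalk": "XYZ"},
]

merged = consolidate(rows)
for crosswalk, attributes in merged.items():
    print(crosswalk, attributes)
```

In this sketch the identical email address is stored once, while the two different first names are both kept, which mirrors the merged table.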
Merging records: key considerations
Here are some considerations when deciding to group multiple records into a single entity.
- Records to be merged can come from a single file or from multiple files. For example, if an entity has one record in the first row of one file and another record in the second row of a different file, those rows are merged even though they are in different files.
- Source files must be in CSV or XLSX format.
- RELTIO_JSON input files aren't supported for consolidating or grouping records.
- The total size of all source files loaded in a single job cannot exceed 10 GB. If this functionality is disabled, there is no size limit.
- While rows are being merged or grouped, if the total number of values for an attribute exceeds 10,000, the data loading job fails with a status of ERROR. For more information, see topic Quota and limits.
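The limits listed above could be checked before a job is submitted. The following Python sketch is only an illustration under stated assumptions: the `preflight` and `attributes_over_limit` helpers are hypothetical, "10 GB" is interpreted as 10 × 1024³ bytes, and the 10,000-value limit is assumed to apply to each attribute of a merged entity.

```python
import os

MAX_TOTAL_BYTES = 10 * 1024**3      # assumption: "10 GB" read as 10 GiB
MAX_VALUES_PER_ATTRIBUTE = 10_000   # limit stated in the considerations above
ALLOWED_EXTENSIONS = {".csv", ".xlsx"}


def preflight(files, grouping_enabled=True):
    """Return a list of problems for an iterable of (filename, size_in_bytes).

    Hypothetical helper; the checks only apply when grouping is enabled.
    """
    problems = []
    if not grouping_enabled:
        return problems
    total = 0
    for name, size in files:
        extension = os.path.splitext(name)[1].lower()
        if extension not in ALLOWED_EXTENSIONS:
            problems.append(f"{name}: unsupported format for grouping")
        total += size
    if total > MAX_TOTAL_BYTES:
        problems.append(f"total size {total} bytes exceeds the 10 GB limit")
    return problems


def attributes_over_limit(entity, limit=MAX_VALUES_PER_ATTRIBUTE):
    """Return the attributes of a merged entity that hold more values than the limit."""
    return [attr for attr, values in entity.items() if len(values) > limit]


print(preflight([("customers.csv", 1_000), ("customers.json", 2_000)]))
```

A check like this only mirrors the documented limits on the client side; the Data Loader itself remains the authority on whether a job succeeds.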
Group multiple records into a single entity
By default, the Data Loader is configured at the tenant level to create a single entity from multiple records, which adds a reprocessing step to each job. To improve job performance, change this setting so that records are merged only for specific projects.
- If this functionality is disabled for the tenant and you want to enable it for a specific project, create or update the project and set the groupingEnabled parameter to true. See topic Data Loader API.
  PUT {{dataloader_uri}}/dataloader/api/{{tenantId}}/project/{{projectId}}
  Body: { "groupingEnabled": true }
- To enable or disable this capability at the tenant level, contact Support. For details, see topic Get help in Support Portal.
- To enable or disable grouping for a job, use the 3 - Define step when you create a data loading job. To enable this capability, select Consolidate records with same crosswalk value.
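The project-level request described in the first item above can be prepared in Python as a minimal sketch. The endpoint path and the groupingEnabled body come from this topic; the `build_grouping_request` helper, the example host, the identifiers, and the bearer-token Authorization header are assumptions for illustration. Whether the API accepts a body containing only groupingEnabled or requires the full project definition is not stated here, so check the Data Loader API topic before sending the request.

```python
import json
import urllib.request


def build_grouping_request(base_uri, tenant_id, project_id, token, enabled=True):
    """Build (but do not send) the PUT request that sets groupingEnabled on a project.

    Hypothetical helper; authentication details are an assumption.
    """
    url = f"{base_uri}/dataloader/api/{tenant_id}/project/{project_id}"
    body = json.dumps({"groupingEnabled": enabled}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {token}",  # assumption: bearer-token authentication
    }
    return urllib.request.Request(url, data=body, headers=headers, method="PUT")


# Placeholder values; replace them with your own Data Loader URI, tenant, and project.
request = build_grouping_request(
    "https://dataloader.example.com", "myTenant", "myProject", "ACCESS_TOKEN"
)
print(request.get_method(), request.full_url)

# To actually send the request, call: urllib.request.urlopen(request)
```

Building the request separately from sending it makes it easy to inspect the URL and body before anything is changed on the project.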