
Entity crosswalks consistency task

Learn how to check a tenant for entities that have duplicate crosswalks.

The main cause of this inconsistency is that multiple entities with the same crosswalks are loaded simultaneously by two parallel processes. Normally, the API fixes the inconsistency automatically the first time the entities are requested by the crosswalk responsible for the duplication. This task finds all entities with the same crosswalks and performs a get-by-crosswalk request for each of them, which causes the entities to be merged.

This task can restore missing records for crosswalks that exist in an entity but are missing in the EXTERNAL_CROSSWALKS column family, fix duplicate crosswalks, and remove extra records that exist in the EXTERNAL_CROSSWALKS column family but where related crosswalks are absent in the entity.
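
The three kinds of inconsistency described above can be illustrated with a small in-memory model. This is only a sketch, not the task's implementation: `find_inconsistencies`, the `entities` mapping, and the `external_index` mapping (a stand-in for the EXTERNAL_CROSSWALKS column family) are hypothetical names chosen for illustration.

```python
from collections import defaultdict


def find_inconsistencies(entities, external_index):
    """Classify crosswalk inconsistencies in a toy model.

    entities: dict of entity URI -> set of crosswalk keys held by that entity.
    external_index: dict of crosswalk key -> set of entity URIs
        (hypothetical stand-in for the EXTERNAL_CROSSWALKS column family).
    """
    # Invert the entity data: which entities hold each crosswalk?
    owners = defaultdict(set)
    for uri, crosswalks in entities.items():
        for crosswalk in crosswalks:
            owners[crosswalk].add(uri)

    missed, duplicates, redundant = {}, {}, {}
    for crosswalk, uris in owners.items():
        # Present in an entity but not recorded in the index.
        not_indexed = uris - external_index.get(crosswalk, set())
        if not_indexed:
            missed[crosswalk] = sorted(not_indexed)
        # The same crosswalk held by more than one entity.
        if len(uris) > 1:
            duplicates[crosswalk] = sorted(uris)
    for crosswalk, uris in external_index.items():
        # Recorded in the index but absent from the entity.
        extra = uris - owners.get(crosswalk, set())
        if extra:
            redundant[crosswalk] = sorted(extra)
    return missed, duplicates, redundant
```

In this model, `missed` corresponds to records the task would restore, `duplicates` to entities it would merge, and `redundant` to records it would remove.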

Note: Stop and Pause are supported.

Request

POST {ApplicationURL}/duplicateCrosswalksCheck
If the request body is provided, then only the entities specified in the JSON array are processed.
POST {ApplicationURL}/duplicateCrosswalksCheck?tenantId=<TENANT_ID>
[
    "entities/Uri1",
    "entities/Uri2",
    ...
    "entities/UriN"
]
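
As a minimal sketch of how such a request could be assembled with the Python standard library: the function name `build_check_request`, the example host, and the bearer-token `Authorization` header are assumptions for illustration; use whatever authentication your environment requires.

```python
import json
import urllib.request
from urllib.parse import urlencode


def build_check_request(application_url, tenant_id, entity_uris=None, token=None):
    """Build (but do not send) a POST request for duplicateCrosswalksCheck."""
    query = urlencode({"tenantId": tenant_id})
    url = f"{application_url.rstrip('/')}/duplicateCrosswalksCheck?{query}"
    headers = {"Content-Type": "application/json"}
    if token:
        # Assumed auth scheme; not stated in this document.
        headers["Authorization"] = f"Bearer {token}"
    # With a body, only the listed entities are processed; without one,
    # no restriction is sent.
    body = json.dumps(entity_uris).encode("utf-8") if entity_uris else None
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


request = build_check_request(
    "https://example.test/app",  # hypothetical ApplicationURL
    "Merill",
    ["entities/Uri1", "entities/Uri2"],
)
```

Passing the result to `urllib.request.urlopen(request)` would send it; the sketch stops short of that so it can be inspected first.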

The task accepts the following parameters, which are also shown in the task result:

Table 1. Parameters
Parameter | Required | Description
tenantId | Yes | The ID of the tenant that you want to check for duplicate crosswalks.
checkDeletedCrosswalks | No | Stage one of the task. If the parameter is set to true, then the task restores missing records for crosswalks that exist in the entity but are missing in the EXTERNAL_CROSSWALKS column. The default value is false.
resolveDuplicates | No | Stage two of the task. If the parameter is set to false, then the task won’t fix duplicate entities using their crosswalks. The default value is true.
checkExistence | No | Stage three of the task. If the parameter is set to true, then the task removes extra records that exist in the EXTERNAL_CROSSWALKS column but whose related crosswalks are absent in the entity. The default value is false.
readOnly | No | If the parameter is set to true, then any crosswalk inconsistencies that are found are only reported and not fixed. The default value is false.
maxResultsToStore | No | Limits the number of crosswalks with inconsistencies that are stored in the task status. This prevents the task from consuming a huge volume of memory when many entities with inconsistencies are found. The default value is 500.
maxSubResultsToStore | No | Limits the number of entity IDs stored in the task status for each crosswalk with inconsistencies, to reduce memory usage. The default value is 10.
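
The parameters can be combined, for example, to run all three stages in report-only mode. The sketch below assumes that the optional parameters are passed as query parameters in the same way as `tenantId` (the document shows only `tenantId` in the URL) and that booleans are written as lowercase `true`/`false`; `task_query` and `KNOWN_PARAMETERS` are hypothetical names.

```python
from urllib.parse import urlencode

# Optional parameter names as listed in Table 1.
KNOWN_PARAMETERS = {
    "checkDeletedCrosswalks",
    "resolveDuplicates",
    "checkExistence",
    "readOnly",
    "maxResultsToStore",
    "maxSubResultsToStore",
}


def task_query(tenant_id, **overrides):
    """Return a query string with tenantId plus any explicitly set parameters."""
    unknown = set(overrides) - KNOWN_PARAMETERS
    if unknown:
        raise ValueError(f"Unknown parameters: {sorted(unknown)}")
    params = {"tenantId": tenant_id}
    for name, value in overrides.items():
        # Render Python booleans as lowercase true/false (assumed format).
        params[name] = str(value).lower() if isinstance(value, bool) else str(value)
    return urlencode(params)


# Report inconsistencies from all three stages without fixing anything.
dry_run = task_query(
    "Merill",
    readOnly=True,
    checkDeletedCrosswalks=True,
    checkExistence=True,
)
```

Parameters that are left out keep the defaults from Table 1, so `resolveDuplicates` stays enabled here and stage two still runs, but `readOnly` keeps it from merging anything.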
The task result contains these fields and may also contain some extra fields explained below:
Table 2. Output fields
Field | Output of | Description
missedExternalCrosswalks | Stage one | This field displays the missing crosswalks in the EXTERNAL_CROSSWALKS column. For each crosswalk, a corresponding list of entities that contain the crosswalk is displayed. If the readOnly parameter is set to false, then the missing crosswalks are restored in the EXTERNAL_CROSSWALKS column.
missedExternalCrosswalksCount | Stage one | This field displays the number of missing crosswalks in the EXTERNAL_CROSSWALKS column.
processedTOObjectsCount | Stage one | This field displays the total number of entities processed during stage one of the task.
entitiesWithSameCrosswalks | Stage two | This field displays the (duplicate) crosswalks found in more than one entity and the corresponding entities that contain the crosswalks. If the readOnly parameter is set to false, then all entities that contain the same crosswalks are merged.
failedToMergeObjects | Stage two | This field displays all duplicate crosswalks that didn’t merge and the corresponding entities that contain the crosswalks.
failedToMergeObjectsCount | Stage two | This field displays the number of times the entities with duplicate crosswalks failed to merge.
entitiesWithSameCrosswalksCount | Stage two | This field displays the number of duplicate crosswalks.
redundantExternalCrosswalks | Stage three | This field displays the crosswalks that exist in the EXTERNAL_CROSSWALKS column but don’t exist in an entity. For each crosswalk, a corresponding list of entities that contain the crosswalk is displayed. If the readOnly parameter is set to false, then the crosswalks are removed from the EXTERNAL_CROSSWALKS column.
redundantExternalCrosswalksCount | Stage three | This field displays the number of crosswalks that exist in the EXTERNAL_CROSSWALKS column but do not exist in an entity.
numberOfProcessedObjects | Stages one, two, and three | This field displays the total number of crosswalks processed across all three stages of the task.

Response

{
  "id" : "43d65edd-62ab-4193-ab86-bd7637028c2a",
  "groupId" : "39c6d665-7c81-4be3-afad-c3fea451aa87",
  "createdTime" : 1660049801853,
  "createdBy" : "admin",
  "updatedTime" : 1660049801853,
  "updatedBy" : "admin",
  "type" : "com.reltio.businesslogic.tasks.consistency.RelationCrosswalksConsistencyTask",
  "status" : "COMPLETED",
  "name" : "Checking crosswalks consistency for tenant Merill",
  "createdOnHost" : "some-host",
  "executedOnHost" : "some-host",
  "parallelExecution" : false,
  "nodesGroup" : "test",
  "startTime" : 1660049802702,
  "endTime" : 1660049802885,
  "parameters" : {
    "tenantId" : "Merill",
    "uriList" : "",
    "maxResultsToStore" : "500",
    "maxSubResultsToStore" : "10",
    "ignoreEventsInStreaming" : "true",
    "checkDeletedCrosswalks" : "true",
    "resolveDuplicates" : "true",
    "checkExistence" : "true",
    "readOnly" : "false"
  },
  "currentState" : {
    "numberOfFailedToPublishEvents" : 0,
    "entitiesWithSameCrosswalksCount" : 1,
    "missedExternalCrosswalks" : {
      "m_001.FB.crosswalkValue1" : [ "relations/0000Aef", "relations/00006OP" ]
    },
    "failedToMergeObjectsCount" : 1,
    "nonExistingCrosswalksFound" : 0,
    "redundantExternalCrosswalksCount" : 0,
    "redundantExternalCrosswalks" : { },
    "failedToMergeObjects" : {
      "CrosswalkTO {type: configuration/sources/FB, sourceTable: null, value: crosswalkValue1}" : [ "relations/0000Aef", "relations/00006OP" ]
    },
    "missedExternalCrosswalksCount" : 2,
    "lastHourThroughput" : 0.0,
    "entitiesWithSameCrosswalks" : {
      "m_001.FB.crosswalkValue1" : [ "relations/0000Aef", "relations/00006OP" ]
    },
    "numberOfProcessedObjects" : 3,
    "processedTOObjectsCount" : 2,
    "status" : "Completed"
  },
  "throughput" : 0.0,
  "duration" : "0s"
}
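
A response like the one above can be reduced to the counters described in Table 2. The following is a minimal sketch; `summarize_task` and the keys of the returned dictionary are hypothetical, while the field names it reads come from the sample response.

```python
def summarize_task(task):
    """Extract the main consistency counters from a parsed task response."""
    state = task.get("currentState", {})
    parameters = task.get("parameters", {})
    return {
        "status": task.get("status"),
        # Parameter values are returned as strings in the sample response.
        "read_only": parameters.get("readOnly") == "true",
        "missed": state.get("missedExternalCrosswalksCount", 0),
        "duplicates": state.get("entitiesWithSameCrosswalksCount", 0),
        "failed_merges": state.get("failedToMergeObjectsCount", 0),
        "redundant": state.get("redundantExternalCrosswalksCount", 0),
        # Crosswalks whose entities could not be merged need manual review.
        "needs_review": sorted(state.get("failedToMergeObjects", {})),
    }


# Trimmed copy of the sample response, already parsed with json.loads.
sample = {
    "status": "COMPLETED",
    "parameters": {"readOnly": "false"},
    "currentState": {
        "entitiesWithSameCrosswalksCount": 1,
        "failedToMergeObjectsCount": 1,
        "redundantExternalCrosswalksCount": 0,
        "missedExternalCrosswalksCount": 2,
        "failedToMergeObjects": {
            "CrosswalkTO {type: configuration/sources/FB, sourceTable: null, value: crosswalkValue1}": [
                "relations/0000Aef",
                "relations/00006OP",
            ]
        },
    },
}
summary = summarize_task(sample)
```

A non-zero `failed_merges` value together with `read_only` set to false indicates duplicates that the task tried to resolve but could not.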