Potential Matches Cassandra Es Consistency Task

Compares potential matches in the main and search storages

This task compares potential matches between the main and search storages and resolves basic inconsistencies if found.

Note: Stop and Pause are supported. A paused task restarts from the beginning rather than continuing from where it stopped.

Body

If a body is provided, the task processes the objects specified in the JSON array.

[
    "entities/Uri1",
    "entities/Uri2",
    ...
    "entities/UriN"
]
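
As a minimal sketch of how such a body could be produced, the Python snippet below serializes a list of entity URIs into a JSON array. The helper name build_body is hypothetical, and dropping duplicate URIs is an assumption made for convenience, not something the task requires.

```python
import json


def build_body(entity_uris):
    """Serialize entity URIs into a JSON array usable as the task body."""
    # Drop duplicates while keeping the caller's order (an assumption,
    # not a documented requirement of the task).
    unique = list(dict.fromkeys(entity_uris))
    return json.dumps(unique)


print(build_body(["entities/Uri1", "entities/Uri2"]))
# prints ["entities/Uri1", "entities/Uri2"]
```

Using a JSON serializer instead of string concatenation avoids malformed arrays, such as one with a trailing comma after the last element.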

Requests:

  • Tenant admin role is required:
    POST {ApplicationURL}/potentialMatchesEsCassandraConsistencyCheck
  • Tenant admin role is required:
    POST {ApplicationURL}/api/{tenantId}/potentialMatchesEsCassandraConsistencyCheck?fast=true
Note: The fast=true option is deprecated. Use the MemorySafePotentialMatchesCassandraEsConsistency task instead.
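
The following Python sketch shows one way to assemble the POST request with the standard library. It uses the tenant-scoped form of the URL and leaves out the deprecated fast option. The function name build_request, the Bearer token authorization header, and the example host are assumptions for illustration. The request is only constructed here, not sent.

```python
import json
import urllib.request

TASK_PATH = "potentialMatchesEsCassandraConsistencyCheck"


def build_request(application_url, tenant_id, token, entity_uris=None):
    """Build (but do not send) the POST request that starts the task."""
    url = f"{application_url.rstrip('/')}/api/{tenant_id}/{TASK_PATH}"
    # The body is optional; when given, it is a JSON array of object URIs.
    data = json.dumps(entity_uris).encode("utf-8") if entity_uris else None
    # Assumption: the caller authenticates with a Bearer token that
    # carries the tenant admin role.
    headers = {"Authorization": f"Bearer {token}"}
    if data is not None:
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(url, data=data, headers=headers, method="POST")


req = build_request("https://example.invalid", "myTenant", "TOKEN", ["entities/Uri1"])
print(req.get_method(), req.full_url)
```

To submit the request, pass the returned object to urllib.request.urlopen with a real application URL and credentials.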
Table 1. Parameters

Parameter | Required | Description
tenantId | Yes | ID of the tenant whose matches and entities are compared.
entityType | No | The entity type to check. If this parameter is absent, all types are checked.
maxResultsToStore | No | The task stores the URIs of entities with inconsistencies in its status. This parameter limits how many URIs are stored, which prevents excessive memory consumption when a large number of inconsistent entities are found. Default: 100.
compareVersions | No | If true, the versions of the objects in the main and search storages are also compared. Default: false.
fixInconsistency | No | If true, the task fixes the inconsistencies it finds. Default: true.
fixVersionConflicts | No | If true, the task reindexes entities with version conflicts in ES. Default: false.
waitForQueueOnStarting | No | If true, the task waits for queues to be empty before starting. Default: false.
distributed | No | If true, the task runs in distributed mode. Default: false. For more information, see Distributed mode.
taskPartsCount | No | The maximum number of sub-tasks for distributed execution. The platform determines the optimal number based on performance limits. Default: 2. Note: This parameter applies only when distributed=true. Otherwise, it is ignored.
largeVersionThreshold | No | The version threshold above which objects are flagged as having a large version. Objects whose version exceeds this value are reported in the objectsAboveVersionThreshold field, and their total count is reported in the totalObjectsAboveVersionThreshold field. Default: 2^60.
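
The sketch below illustrates how the optional parameters and their defaults fit together. It assumes the parameters are passed as URL query parameters, in the same way as fast=true in the request above. The helper build_query and the entity type value HCP are hypothetical. Only values that differ from the documented defaults are emitted.

```python
from urllib.parse import urlencode

# Defaults as listed in the parameters table.
DEFAULTS = {
    "maxResultsToStore": 100,
    "compareVersions": False,
    "fixInconsistency": True,
    "fixVersionConflicts": False,
    "waitForQueueOnStarting": False,
    "distributed": False,
    "taskPartsCount": 2,
    "largeVersionThreshold": 2 ** 60,  # 1152921504606846976
}


def build_query(**params):
    """Build a query string from the optional task parameters."""
    unknown = set(params) - set(DEFAULTS) - {"entityType"}
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    # taskPartsCount is ignored unless distributed=true, so reject the
    # combination early instead of letting it pass silently.
    if "taskPartsCount" in params and not params.get("distributed", False):
        raise ValueError("taskPartsCount requires distributed=True")
    # Keep entityType (it has no default) and any non-default value.
    changed = {
        k: v for k, v in params.items()
        if k == "entityType" or v != DEFAULTS[k]
    }
    # Booleans are written in lowercase, matching fast=true.
    return urlencode(
        {k: str(v).lower() if isinstance(v, bool) else v for k, v in changed.items()}
    )


print(build_query(entityType="HCP", compareVersions=True,
                  distributed=True, taskPartsCount=4))
# prints entityType=HCP&compareVersions=true&distributed=true&taskPartsCount=4
```

Raising an error for taskPartsCount without distributed is a design choice of this sketch. The task itself ignores the parameter in that case.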

Table 2. State fields

The following additional fields are available in the task state:

Field | Description
numberOfFilteredOutObjects | The total number of entities without potential matches.