Potential Matches Cassandra Es Consistency Task

Compares potential matches in the main and search storages

This task compares potential matches between the main and search storages and resolves basic inconsistencies if found.

Note: Stop and Pause are supported. A paused task restarts from the beginning when it is resumed.

Body

If a body is provided, the task processes only the objects listed in the JSON array.

[
    "entities/Uri1",
    "entities/Uri2",
    ...
    "entities/UriN"
]
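As a minimal sketch of how such a body could be produced, the snippet below serializes a list of entity URIs with Python's standard json module. The URIs are the placeholders from the sample above, not real objects.

```python
import json

# Placeholder URIs taken from the sample body above; replace with real entity URIs.
entity_uris = ["entities/Uri1", "entities/Uri2"]

# json.dumps emits a valid JSON array (no trailing comma after the last element).
body = json.dumps(entity_uris)
print(body)
```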

Requests:

  • Admin role is required:
    POST {ApplicationURL}/potentialMatchesEsCassandraConsistencyCheck
  • Tenant admin role is required:
    POST {ApplicationURL}/api/{tenantId}/potentialMatchesEsCassandraConsistencyCheck?fast=true
Note: The fast=true option is deprecated. Use the MemorySafePotentialMatchesCassandraEsConsistency task instead.
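The two request forms above could be assembled as in the following sketch, which only builds the request and does not send it. The helper name build_task_request, the example.invalid host, the tenant ID, and the bearer-token Authorization header are all assumptions made for illustration; the source states only that an admin or tenant admin role is required, not how credentials are passed.

```python
import json
import urllib.request

TASK_PATH = "potentialMatchesEsCassandraConsistencyCheck"


def build_task_request(application_url, tenant_id=None, entity_uris=None, token=None):
    """Build (but do not send) the POST request that starts the task."""
    base = application_url.rstrip("/")
    if tenant_id is None:
        # Admin form: POST {ApplicationURL}/<task>
        url = f"{base}/{TASK_PATH}"
    else:
        # Tenant admin form: POST {ApplicationURL}/api/{tenantId}/<task>
        url = f"{base}/api/{tenant_id}/{TASK_PATH}"

    headers = {"Content-Type": "application/json"}
    if token is not None:
        # Assumption: the deployment accepts a bearer token.
        headers["Authorization"] = f"Bearer {token}"

    # The optional body is a JSON array of object URIs.
    data = json.dumps(entity_uris).encode("utf-8") if entity_uris else None
    return urllib.request.Request(url, data=data, headers=headers, method="POST")


# Hypothetical host and tenant ID, used only to show the resulting URL.
req = build_task_request(
    "https://example.invalid",
    tenant_id="myTenant",
    entity_uris=["entities/Uri1"],
)
print(req.get_method(), req.full_url)
```

Passing the returned object to urllib.request.urlopen would send it; that step is left out here so the sketch runs without network access.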
Table 1. Parameters
Parameter | Required | Description
tenantId | Yes | ID of the tenant whose matches and entities are compared.
entityType | No | The entity type to be checked. If this parameter is absent, all types are checked.
maxResultsToStore | No | The task stores the URIs of entities with inconsistencies in its status. This parameter caps how many are stored, which limits memory consumption when a large number of inconsistent entities are found. Default is 100.
compareVersions | No | If set to true, the versions of the objects in the main and search storages are also compared. Default is false.
fixInconsistency | No | If set to true, the task fixes the inconsistencies it finds. Default is true.
fixVersionConflicts | No | If set to true, the task reindexes entities with version conflicts in ES. Default is false.
waitForQueueOnStarting | No | If set to true, the task waits for the queues to be empty before starting. Default is false.
distributed | No | If set to true, the task is executed in distributed mode. Default is false.
tasksPartsCount | No | The number of tasks created for distributed processing. Each task processes its own part of the objects and may run on a different API node in parallel. We recommend a value equal to the total number of API nodes that will execute the tasks. Default is 2. Note: Use this parameter only in distributed mode.
largeVersionThreshold | No | The version threshold above which objects are flagged as having a large version. All objects with a version above this threshold are reported in the objectsAboveVersionThreshold field, and their total number is reported in the totalObjectsAboveVersionThreshold field. Default is 2^60.
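The sketch below shows one way the optional parameters might be combined into a query string, and it enforces the rule that tasksPartsCount applies only in distributed mode. It assumes the parameters are passed as URL query parameters (as fast=true is in the request above); the helper name build_query and the entity type value "Individual" are hypothetical.

```python
from urllib.parse import urlencode

# Documented default for largeVersionThreshold (2^60), shown for reference.
DEFAULT_LARGE_VERSION_THRESHOLD = 2 ** 60


def build_query(distributed=False, tasks_parts_count=None, **other_params):
    """Return a URL-encoded query string for the task parameters."""
    if tasks_parts_count is not None and not distributed:
        # tasksPartsCount is meaningful only in distributed mode.
        raise ValueError("tasksPartsCount requires distributed=true")

    params = dict(other_params)
    if distributed:
        params["distributed"] = True
        if tasks_parts_count is not None:
            params["tasksPartsCount"] = tasks_parts_count

    # Render Python booleans as lowercase true/false.
    encoded = {
        key: (str(value).lower() if isinstance(value, bool) else value)
        for key, value in params.items()
    }
    return urlencode(encoded)


# Hypothetical values: four parts for four API nodes, report-only run.
query = build_query(
    distributed=True,
    tasks_parts_count=4,
    entityType="Individual",
    fixInconsistency=False,
)
print(query)
```

Setting fixInconsistency to false in this way would correspond to a run that reports inconsistencies without fixing them, since the documented default is true.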
Table 2. State fields
The following additional fields are available in the task state:
Field | Description
numberOfFilteredOutObjects | The total number of entities without potential matches.