Accelerate the Value of Data

Memory Safe Cassandra Es Consistency Task

Compares entities in the main and search storages

This task compares entities between the main and search storages and resolves basic inconsistencies if any are found. If entities are present in the search storage but not in the main data storage, the task removes them from the search storage. If entities are present in the main storage but not in the search storage, the task tries to reindex them.
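The reconciliation rule described above can be illustrated with a small in-memory sketch. This is not the task's actual implementation (which is presumably designed to avoid holding all URIs in memory); the function name `plan_reconciliation` and the sample URIs are hypothetical and only show which entities fall into each category.

```python
def plan_reconciliation(main_uris, search_uris):
    """Return (to_remove_from_search, to_reindex) for two collections of entity URIs.

    Illustrative only: assumes both URI collections fit in memory.
    """
    main_set, search_set = set(main_uris), set(search_uris)
    # Present in search but not in main: these are removed from the search storage.
    to_remove = sorted(search_set - main_set)
    # Present in main but not in search: these are candidates for reindexing.
    to_reindex = sorted(main_set - search_set)
    return to_remove, to_reindex


# Hypothetical sample data.
main = ["entities/A", "entities/B", "entities/C"]
search = ["entities/B", "entities/C", "entities/D"]
to_remove, to_reindex = plan_reconciliation(main, search)
print(to_remove)   # ['entities/D']
print(to_reindex)  # ['entities/A']
```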

Note: Stop and Pause are supported. After a Pause, the task restarts from the beginning rather than continuing from where it stopped.

Requests:

  • Admin role is required:
    POST {ApplicationURL}/esCassandraConsistencyCheck
  • Tenant admin role is required:
    POST {ApplicationURL}/api/{tenantId}/esCassandraConsistencyCheck

Body

If a body is provided, the task processes the objects listed in the JSON array.
[
    "entities/Uri1",
    "entities/Uri2",
    ...
    "entities/UriN"
]
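As a rough sketch, a request for either endpoint could be assembled as follows with the Python standard library. The `Authorization: Bearer` header, the helper name `build_consistency_request`, and the host `example.invalid` are assumptions for illustration; the source does not specify the authentication scheme. The request object is only built here, not sent.

```python
import json
from urllib.parse import quote
from urllib.request import Request


def build_consistency_request(application_url, token, tenant_id=None, uris=None):
    """Build (but do not send) a POST request for the consistency check task."""
    base = application_url.rstrip("/")
    if tenant_id is None:
        # Admin-level endpoint.
        url = f"{base}/esCassandraConsistencyCheck"
    else:
        # Tenant-level endpoint.
        url = f"{base}/api/{quote(tenant_id, safe='')}/esCassandraConsistencyCheck"
    # The body is optional: a JSON array of entity URIs to process.
    body = json.dumps(list(uris)).encode("utf-8") if uris else None
    # Assumption: bearer-token authentication.
    headers = {"Authorization": f"Bearer {token}"}
    if body is not None:
        headers["Content-Type"] = "application/json"
    return Request(url, data=body, headers=headers, method="POST")


req = build_consistency_request(
    "https://example.invalid", "my-token",
    tenant_id="myTenant", uris=["entities/Uri1", "entities/Uri2"],
)
print(req.full_url)  # https://example.invalid/api/myTenant/esCassandraConsistencyCheck
```

To actually submit the request, the returned object can be passed to `urllib.request.urlopen`.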
Table 1. Parameters
| Parameter | Required | Description |
|---|---|---|
| tenantId | Yes | ID of the tenant to compare matches and entities. |
| entityType | No | The entity type to be checked. If this parameter is absent, all types are checked. |
| maxResultsToStore | No | The task stores the URIs of entities for which an inconsistency was found in its status. This parameter prevents excessive memory consumption when a large number of inconsistent entities are found. Default value: 1000. |
| compareVersions | No | If set to true, the versions of the objects in the main and search storages are also compared. Default value: false. |
| fixInconsistency | No | If set to true, the task fixes inconsistencies. Default value: true. |
| restoreTILs | No | If set to true, the task adds the TIL column for entities (losers) where it is missing. Default value: false. |
| fixVersionConflicts | No | If set to true, the task reindexes entities with version conflicts in ES. Default value: false. |
| distributed | No | If set to true, the task runs in distributed mode. Default value: false. |
| taskPartsCount | No | The number of tasks created for distributed reindexing. Each task reindexes its own part of the objects, and the tasks may be executed on different API nodes in parallel. Recommended value: the number of API nodes that can execute the tasks. Default value: 2. This parameter is used only in distributed mode (distributed=true); otherwise it is ignored. |
| largeVersionThreshold | No | The version threshold used to flag objects that have a large version. Any object whose version is over this threshold is reported in the objectsAboveVersionThreshold field. The total number of objects with a version above this threshold is reported in the totalObjectsAboveVersionThreshold field. Default value: 2^60. |
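The following sketch shows one way the optional parameters from Table 1 might be assembled on the client side. It assumes the parameters are passed as query-string parameters, which the source does not state explicitly. The helper name `build_query` and the entity type `Individual` are hypothetical. Because taskPartsCount is ignored unless distributed=true, the helper drops it in that case to keep the request unambiguous.

```python
from urllib.parse import urlencode

# Optional parameters listed in Table 1 (tenantId is part of the URL path).
KNOWN_PARAMS = {
    "entityType", "maxResultsToStore", "compareVersions", "fixInconsistency",
    "restoreTILs", "fixVersionConflicts", "distributed", "taskPartsCount",
    "largeVersionThreshold",
}


def build_query(**params):
    """Encode task parameters as a query string (assumed transport)."""
    unknown = set(params) - KNOWN_PARAMS
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    # taskPartsCount only applies in distributed mode, so omit it otherwise.
    if not params.get("distributed", False):
        params.pop("taskPartsCount", None)
    encoded = {}
    for name in sorted(params):
        value = params[name]
        # Booleans are written as lowercase true/false.
        encoded[name] = str(value).lower() if isinstance(value, bool) else str(value)
    return urlencode(encoded)


print(build_query(distributed=True, taskPartsCount=4, compareVersions=True))
# compareVersions=true&distributed=true&taskPartsCount=4
print(build_query(entityType="Individual", taskPartsCount=4))
# entityType=Individual
```

The resulting string can be appended to the request URL after a `?`.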