
LCA Configuration

The LCA framework allows you to execute custom tasks on different events in an object life cycle.

Life Cycle Actions in Business Model

Life Cycle Actions are defined as a map between the action name and the action list in the lifecycleActions section of the configuration file.

An action can be specified in the following ways:

  • An action that belongs to the current tenant is specified by its name.
  • A shared action managed by Reltio is specified with the Reltio/ prefix.

lifecycleActions

Example configuration:

"lifecycleActions": {
    "rawDataBeforeCleanse": [
      "FirstAction",
      "SecondAction",
      "Reltio/CommonAction",
      "Lambda/BinaryJSON/LambdaAction"
    ],
    "afterSave": [
      "AfterSaveAction"
    ]
}

In this configuration, rawDataBeforeCleanse and afterSave are action hooks, each of which specifies the list of actions to be performed. A hook can contain either a list of simple actions or a list of action groups with a filter. An action group that has a filter is executed only for entities that satisfy the specified filter condition. For more information, see Conditional Execution of LCA.

Life Cycle Actions in a list are executed sequentially: each action receives the result of the previous action as its input.
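The sequential hand-off can be pictured with a minimal Python sketch. This is not Reltio's implementation: the action functions, the REGISTRY lookup, and run_hook are hypothetical names invented for illustration, and only the hook and action names are taken from the example configuration.

```python
from typing import Callable, Dict, List

Entity = Dict[str, object]
Action = Callable[[Entity], Entity]


# Hypothetical action implementations, used only to show the chaining.
def first_action(entity: Entity) -> Entity:
    return {**entity, "name": str(entity["name"]).strip()}


def second_action(entity: Entity) -> Entity:
    # Receives the already-trimmed name produced by first_action.
    return {**entity, "name": str(entity["name"]).upper()}


# Hypothetical lookup from configured action names to implementations.
REGISTRY: Dict[str, Action] = {
    "FirstAction": first_action,
    "SecondAction": second_action,
}

LIFECYCLE_ACTIONS: Dict[str, List[str]] = {
    "rawDataBeforeCleanse": ["FirstAction", "SecondAction"],
}


def run_hook(hook: str, entity: Entity) -> Entity:
    """Run every action configured for the hook, in list order."""
    result = entity
    for action_name in LIFECYCLE_ACTIONS.get(hook, []):
        # The output of one action becomes the input of the next.
        result = REGISTRY[action_name](result)
    return result


result = run_hook("rawDataBeforeCleanse", {"name": "  acme  "})
```

Because the output of one action feeds the next, the order of the names in the list matters.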

Configuration Inheritance

Life Cycle Actions configuration is inherited from the parent object. Hooks that are specified in only one of the objects are taken as is. For hooks that are specified in both objects, actions from the parent object are added before actions from the current object.

For example, assume that Party has lists of actions for beforeDelete and beforeSave, and Individual (which inherits from Party) has lists of actions for beforeSave and afterSave. In this case, the resulting Individual has the following lists of actions:

  • beforeDelete (all from the configuration of Party)
  • beforeSave (the list of actions from Party, followed by the list of actions from the configuration of Individual)
  • afterSave (all from the configuration of Individual)

A customer Life Cycle Actions configuration for an object always overrides the configuration of the same object in other metadata configuration layers. For example, if you specify Life Cycle Actions for Party in both L1 and L3, only the final configuration from L3 is used.

Note: When you clone a profile, Life Cycle Actions (LCAs) are not initiated. An LCA is executed only when you modify the cloned profile.
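The parent and child merge described for Party and Individual can be sketched as follows. The function merge_lifecycle_actions and the action names are hypothetical and serve only to illustrate the ordering rule. This is a sketch of the documented behavior, not the platform's code.

```python
from typing import Dict, List

Hooks = Dict[str, List[str]]


def merge_lifecycle_actions(parent: Hooks, child: Hooks) -> Hooks:
    """Combine hooks so that parent actions run before child actions."""
    merged: Hooks = {}
    hook_names = list(parent) + [name for name in child if name not in parent]
    for hook in hook_names:
        # A hook present in only one object is taken as is;
        # a hook present in both gets parent actions first.
        merged[hook] = parent.get(hook, []) + child.get(hook, [])
    return merged


# Hypothetical action names for the Party / Individual example.
party = {"beforeDelete": ["PartyDelete"], "beforeSave": ["PartySave"]}
individual = {"beforeSave": ["IndividualSave"], "afterSave": ["IndividualAfterSave"]}

effective = merge_lifecycle_actions(party, individual)
```

Here effective["beforeSave"] lists PartySave before IndividualSave, while beforeDelete and afterSave are copied unchanged from the object that defines them.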

Configuration Scenarios

One of the most common uses of an LCA is to intercept the normal processing of an entity, read its current values, and then write additional data to it either before or after the save. A good approach is to treat the LCA as a source: the LCA has exclusive ownership and control of a crosswalk within the entity. The values that the LCA contributes to the entity through this crosswalk can then be managed against the other values in the entity using Survivorship Rules.

Important: When the LCA writes a set of data back into the entity, it must explicitly specify the source to use, which is normally the source assigned to this LCA according to the best practice described above. If the reference to the source is omitted, the values being written are associated with all crosswalks that already exist in the entity.

Service Configuration: Storage

The Reltio System Keyspace must be configured for this service. The service loads tenants (to store the hook implementations per tenant) and stores information about each hook (its configuration, statistics, and so on) in its own column families. The Reltio System Keyspace can be configured in the same way as for the Reltio API (using the API_CONFIG_NAME and FS_DIR system properties) in the Cluster-configuration/*.properties files in S3. Hook configurations are stored in a special column family inside the System Keyspace.

Service Configuration: AWS Credentials

You also need to configure the AWS Username/Password so that the service can access S3 to load the configuration and the hook implementations. This can be done in the same way as for the Reltio API.

Service Configuration: Logging

The Life Cycle Actions service uses distributed logging. The following properties should be configured for that:
  • DISTRIBUTED_LOG_AGGREGATOR_SERVER: Log aggregator (Logstash) server host name
  • DISTRIBUTED_LOG_AGGREGATOR_PORT: Log aggregator (Logstash) server port
  • DISTRIBUTED_LOG_LEVEL: Log level
  • DISTRIBUTED_LOG_ROOT_CATEGORY: Root category to log
Because the Life Cycle Actions service uses Elasticsearch to search for log records, the following properties must also be configured:
  • ES_HOST: Elasticsearch host.
  • ES_CLUSTER_NAME: Elasticsearch cluster name.
  • ES_INDEX_NAME: Elasticsearch index name.
    Note: Date patterns can be used here for the purpose of log rotation. For example, logstash-%{YYYY.MM.dd}.
  • ES_DOCUMENT_TYPE: Elasticsearch document type. This property is optional. The default value is log4j.
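To illustrate what a date pattern in ES_INDEX_NAME means for log rotation, the following sketch expands the documented example into a concrete index name for a given day. The helper expand_index_name and the token mapping are assumptions made for illustration. How the service itself resolves the pattern is not described here.

```python
import re
from datetime import date

# Assumed mapping from the tokens used in the documented example
# to Python strftime directives.
_TOKENS = {"YYYY": "%Y", "MM": "%m", "dd": "%d"}


def expand_index_name(pattern: str, day: date) -> str:
    """Replace each %{...} date pattern with the formatted date."""

    def replace(match: "re.Match[str]") -> str:
        fmt = match.group(1)
        for token, directive in _TOKENS.items():
            fmt = fmt.replace(token, directive)
        return day.strftime(fmt)

    return re.sub(r"%\{([^}]+)\}", replace, pattern)


index_name = expand_index_name("logstash-%{YYYY.MM.dd}", date(2016, 1, 5))
```

With one index per day, older log data can be removed by dropping whole indices.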

Concurrent Actions Handling

The Life Cycle Actions service handles objects in batch requests concurrently. The environment variable THREAD_POOL_SIZE specifies the thread pool size.
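As a rough illustration of handling the objects of a batch request concurrently, the sketch below sizes a thread pool from THREAD_POOL_SIZE. The functions handle_object and handle_batch, and the fallback value of 4, are assumptions for illustration only. The service's actual default is not stated here.

```python
import os
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Optional


def handle_object(obj: Dict[str, str]) -> Dict[str, str]:
    # Placeholder for running the configured actions on one object.
    return {**obj, "processed": "true"}


def handle_batch(
    objects: List[Dict[str, str]], pool_size: Optional[int] = None
) -> List[Dict[str, str]]:
    """Process all objects of one batch request on a bounded thread pool."""
    # The fallback of 4 is an assumed value, not a documented default.
    size = pool_size or int(os.environ.get("THREAD_POOL_SIZE", "4"))
    with ThreadPoolExecutor(max_workers=size) as pool:
        # map() returns results in the same order as the input objects.
        return list(pool.map(handle_object, objects))


results = handle_batch([{"uri": "entities/1"}, {"uri": "entities/2"}], pool_size=2)
```

A larger pool lets more objects of the same batch run at once, at the cost of more threads on the service host.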

Reltio API Configuration

The Reltio API server has several environment variables that are used to configure Life Cycle Actions execution:

  • LCA_RELTIO_ENVIRONMENT: URI used by the Life Cycle Actions service to access the Reltio API
  • LCA_URI: Life Cycle Actions service URI
  • LCA_OBJECT_BATCH_SIZE: Size of the batch request sent to the Life Cycle Actions service
  • LCA_THREAD_POOL_SIZE: Size of the thread pool used for invoking the Life Cycle Actions concurrently
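The effect of LCA_OBJECT_BATCH_SIZE can be pictured as splitting the objects of a request into fixed-size groups before they are sent to the Life Cycle Actions service. The helper split_into_batches below is hypothetical and only illustrates this idea.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def split_into_batches(objects: Sequence[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive groups of at most batch_size objects."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(objects), batch_size):
        yield list(objects[start:start + batch_size])


# Five objects with a batch size of 2 produce three requests.
batches = list(split_into_batches(["e1", "e2", "e3", "e4", "e5"], 2))
```

Each resulting batch could then be submitted on a pool whose size is given by LCA_THREAD_POOL_SIZE.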

Support for Distributed Logging

Starting with 2016.1, the Reltio Platform components use the log4j SocketAppender to send log data to distributed storage, with dynamic configuration loaded from S3 configuration files. Distributed logging is therefore configured in the S3 configuration file instead of by editing the log4j configuration files on every component directly.

Reltio Platform Configuration

All Reltio platform components contain properties files with basic configuration. The files are located directly on the server or in S3 (which is the usual configuration). The following additional properties turn on distributed logging for the Reltio Platform:

# log aggregator (Logstash) server host name
DISTRIBUTED_LOG_AGGREGATOR_SERVER=localhost
# log aggregator (Logstash) server port. The port that should be specified in log4j input (see the Service Configuration: Logging section above).
DISTRIBUTED_LOG_AGGREGATOR_PORT=19191
# log level
DISTRIBUTED_LOG_LEVEL=WARN
# Root categories to log
DISTRIBUTED_LOG_ROOT_CATEGORY=com.reltio

If these configuration parameters are present in the configuration file, the Reltio Platform automatically adds SocketAppender to all loggers defined for categories under DISTRIBUTED_LOG_ROOT_CATEGORY with the log level set to DISTRIBUTED_LOG_LEVEL.
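The "present in the configuration file" check can be sketched as follows. This is a simplified illustration, not the platform's logic: parse_properties and socket_appender_settings are hypothetical helpers, and treating all four properties as required is an assumption.

```python
from typing import Dict, Optional

# Assumption: all four properties must be present to enable the appender.
REQUIRED = (
    "DISTRIBUTED_LOG_AGGREGATOR_SERVER",
    "DISTRIBUTED_LOG_AGGREGATOR_PORT",
    "DISTRIBUTED_LOG_LEVEL",
    "DISTRIBUTED_LOG_ROOT_CATEGORY",
)


def parse_properties(text: str) -> Dict[str, str]:
    """Parse simple key=value lines, skipping blanks and # comments."""
    props: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def socket_appender_settings(props: Dict[str, str]) -> Optional[Dict[str, object]]:
    """Return appender settings, or None if distributed logging is not configured."""
    if any(name not in props for name in REQUIRED):
        return None
    return {
        "host": props["DISTRIBUTED_LOG_AGGREGATOR_SERVER"],
        "port": int(props["DISTRIBUTED_LOG_AGGREGATOR_PORT"]),
        "level": props["DISTRIBUTED_LOG_LEVEL"],
        "category": props["DISTRIBUTED_LOG_ROOT_CATEGORY"],
    }


sample = """
# log level
DISTRIBUTED_LOG_AGGREGATOR_SERVER=localhost
DISTRIBUTED_LOG_AGGREGATOR_PORT=19191
DISTRIBUTED_LOG_LEVEL=WARN
DISTRIBUTED_LOG_ROOT_CATEGORY=com.reltio
"""
settings = socket_appender_settings(parse_properties(sample))
```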

Basic Logstash Configuration

Reltio recommends installing Logstash as described in the following documentation:

https://www.elastic.co/guide/en/logstash/current/getting-started-with-logstash.html

A basic Logstash pipeline for the Reltio Platform can look like this:

input {
  log4j {
    mode => "server"
    host => "0.0.0.0"
    port => 19191
    type => "log4j"
  }
}
output {
  elasticsearch {
    hosts => ["localhost"]
  }
}

Here, the Logstash server running on the local machine receives the logging stream from the log4j SocketAppender on port 19191 and forwards the log data to Elasticsearch on localhost.

LCA Execution Timeout

To ensure that custom LCA code doesn’t affect Reltio’s performance, LCAs that take more than 100 milliseconds to run are timed out. This applies to all new tenants provisioned after the 2023.2 release.

When an LCA you created exceeds this threshold, a message or alert appears in Reltio. The default timeout for both Lambda and native LCAs is 100 milliseconds.
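The idea of a per-action time limit can be sketched in Python as shown below. This is only an illustration of the concept: run_with_timeout and the sample actions are hypothetical, and what the platform does with the entity after a timeout is not described here, so the sketch simply reports that the limit was exceeded.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout
from typing import Callable, Dict, Optional, Tuple

Entity = Dict[str, object]

LCA_TIMEOUT_SECONDS = 0.1  # the documented 100 millisecond threshold


def run_with_timeout(
    action: Callable[[Entity], Entity],
    entity: Entity,
    timeout: float = LCA_TIMEOUT_SECONDS,
) -> Tuple[Optional[Entity], bool]:
    """Return (result, timed_out); result is None when the limit is exceeded."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(action, entity)
    try:
        return future.result(timeout=timeout), False
    except FutureTimeout:
        return None, True
    finally:
        # Do not block the caller while a slow action finishes in the background.
        pool.shutdown(wait=False)


# Hypothetical actions: one fast, one that exceeds the threshold.
def fast_action(entity: Entity) -> Entity:
    return {**entity, "ok": True}


def slow_action(entity: Entity) -> Entity:
    time.sleep(0.5)
    return entity


fast_result = run_with_timeout(fast_action, {"id": 1})
slow_result = run_with_timeout(slow_action, {"id": 1})
```

When writing a custom LCA, keeping remote calls and heavy computation out of the action body is the simplest way to stay under such a limit.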
