S3 File Cleanser

Learn about the S3 file cleanse function.

The S3 file cleanse function reads its replacement rules from a properties file stored in S3. The cleanser works as a string replacement function that matches on words and regular expressions. Use the following properties to configure this cleanse function:

Table 1. Properties
Name      Required    Description
bucket    Yes         This indicates the S3 storage bucket name.
path      Yes         This is the path of the properties file.
Note: Contact the Support team to share your properties file. The Support team then uploads the properties file to S3 and shares the S3 bucket and file path details with you.

The properties file must be a UTF-8 encoded text file. Each line of the file contains either name=value or regular expression=value.

Example

verified=Verified
partially verified=Partially Verified
unverified=Unverified
ambiguous=Ambiguous
conflict=Conflict
reverted=Reverted
.* Status=Unknown
test\(key=Test Key
test\+dummy key=Dummy Key2
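
The following Python sketch illustrates how lines in this format could be read and applied. It is not the platform's implementation. It assumes that each line is split on its first = sign, that a key must match the whole input value, that matching is case-sensitive, and that the first matching line wins. None of these behaviors is stated above, and the names parse_rules and cleanse are hypothetical.

```python
import re

# The example properties file from above, kept verbatim in a raw string
# so that the backslashes are not interpreted by Python.
EXAMPLE = r"""
verified=Verified
partially verified=Partially Verified
unverified=Unverified
ambiguous=Ambiguous
conflict=Conflict
reverted=Reverted
.* Status=Unknown
test\(key=Test Key
test\+dummy key=Dummy Key2
"""


def parse_rules(text):
    """Turn each non-empty line into a (compiled pattern, replacement) pair.

    Assumption: the key is everything before the first '=' on the line.
    """
    rules = []
    for line in text.splitlines():
        if not line.strip():
            continue
        key, _, value = line.partition("=")
        rules.append((re.compile(key), value))
    return rules


def cleanse(value, rules):
    """Return the replacement of the first rule whose key matches the whole value.

    Assumption: whole-value, case-sensitive matching; unmatched values pass through.
    """
    for pattern, replacement in rules:
        if pattern.fullmatch(value):
            return replacement
    return value


if __name__ == "__main__":
    rules = parse_rules(EXAMPLE)
    for sample in ["verified", "Active Status", "test(key", "test+dummy key", "other"]:
        print(repr(sample), "->", repr(cleanse(sample, rules)))
```

Under these assumptions, the input test(key is replaced by Test Key because the key escapes the parenthesis, and a value that no key matches is returned unchanged.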

The platform supports regular expressions as keys in the S3 file cleanse properties file. If a key contains any of these regex or dangling meta characters and they must match literally, escape each one with a backslash (\): ^, $, {}, [], (), ., *, +, ?, |, <>, -, and &. See the last two key-value pairs in the example properties file above. If these characters are not escaped properly, the platform might throw a PatternSyntaxException error, and the cleanse process then fails.
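
Keys can be checked locally before the file is shared. The sketch below is an approximation under stated assumptions: it uses Python's re module, whereas PatternSyntaxException suggests the platform uses Java regular expressions, so a key that compiles here is not guaranteed to compile on the platform. The helpers escape_literal and key_problem are hypothetical names, not part of any product API.

```python
import re

# The characters listed above, plus the backslash itself.
META_CHARACTERS = set("^${}[]().*+?|<>-&\\")


def escape_literal(text):
    """Backslash-escape every listed meta character so the text matches literally."""
    return "".join("\\" + ch if ch in META_CHARACTERS else ch for ch in text)


def key_problem(key):
    """Return None if the key compiles as a regex, otherwise the error message."""
    try:
        re.compile(key)
    except re.error as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    # An unescaped parenthesis is a syntax error.
    print(key_problem("test(key"))
    # The escaped form compiles.
    print(key_problem(escape_literal("test(key")))
    # An unescaped '+' compiles but no longer matches the literal text.
    print(re.fullmatch("test+dummy key", "test+dummy key"))
    print(re.fullmatch(escape_literal("test+dummy key"), "test+dummy key"))
```

The third check shows a quieter failure mode than a syntax error: an unescaped + is a valid quantifier, so the key compiles but does not match the literal text it was written for.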

L3 Configuration

{
  "cleanseConfig": {
    "infos": [
      {
        "uri": "configuration/entityTypes/Individual/cleanse/infos/S3FileStringReplaceCleanser",
        "useInCleansing": true,
        "sequence": [
          {
            "chain": [
              {
                "cleanseFunction": "S3FileStringReplaceCleanser",
                "proceedOnSuccess": true,
                "proceedOnFailure": false,
                "resultingValuesSourceTypeUri": "configuration/sources/ReltioCleanser",
                "mapping": {
                  "inputMapping": [
                    {
                      "attribute": "configuration/entityTypes/Individual/attributes/Name",
                      "cleanseAttribute": "Name",
                      "mandatory": true
                    }
                  ],
                  "outputMapping": [
                    {
                      "attribute": "configuration/entityTypes/Individual/attributes/Name",
                      "cleanseAttribute": "Name",
                      "mandatory": true
                    }
                  ]
                }
              }
            ]
          }
        ]
      }
    ]
  }
}
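
To review a configuration like the one above before submitting it, the nesting can be walked programmatically. The following Python sketch lists each cleanse function with the attributes it reads and writes. It assumes only the JSON structure shown above. The function name summarize_chains is hypothetical, and the embedded JSON is a shortened copy of the example.

```python
import json

# Shortened copy of the L3 configuration shown above.
CONFIG_JSON = """
{
  "cleanseConfig": {
    "infos": [
      {
        "uri": "configuration/entityTypes/Individual/cleanse/infos/S3FileStringReplaceCleanser",
        "useInCleansing": true,
        "sequence": [
          {
            "chain": [
              {
                "cleanseFunction": "S3FileStringReplaceCleanser",
                "proceedOnSuccess": true,
                "proceedOnFailure": false,
                "mapping": {
                  "inputMapping": [
                    {"attribute": "configuration/entityTypes/Individual/attributes/Name",
                     "cleanseAttribute": "Name", "mandatory": true}
                  ],
                  "outputMapping": [
                    {"attribute": "configuration/entityTypes/Individual/attributes/Name",
                     "cleanseAttribute": "Name", "mandatory": true}
                  ]
                }
              }
            ]
          }
        ]
      }
    ]
  }
}
"""


def summarize_chains(config):
    """Collect one summary row per chain entry in cleanseConfig.infos[].sequence[]."""
    rows = []
    for info in config["cleanseConfig"]["infos"]:
        for step in info["sequence"]:
            for link in step["chain"]:
                mapping = link["mapping"]
                rows.append({
                    "function": link["cleanseFunction"],
                    "inputs": [m["attribute"] for m in mapping["inputMapping"]],
                    "outputs": [m["attribute"] for m in mapping["outputMapping"]],
                })
    return rows


if __name__ == "__main__":
    for row in summarize_chains(json.loads(CONFIG_JSON)):
        print(row)
```

In the example configuration the input and output mappings point at the same Name attribute, so the summary makes it easy to confirm that the cleansed value is written back to the attribute it was read from.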