Using String Function Cleanser and Address Cleanser together
Learn how to combine the StringFunctionCleanser and the AddressCleanser.
You can run the StringFunctionCleanser and the AddressCleanser one after the other. The following example shows how the two cleansers work together when you configure the string cleanser before the address cleanser:
Sample Address
{
"Address1": "12021 Wilmington (Ave) 1st$ Fl",
"City": "Los Angeles",
"State": "CA",
"Country": "United States",
"Pincode": "90059"
}
First, the StringFunctionCleanser removes special characters. Then, the AddressCleanser standardizes the address and verifies it against geolocation data.
Address Output after Applying the String Cleanser
Applying the StringFunctionCleanser to the sample address produces this output:
{
"Address1": "12021 Wilmington Ave 1st Fl",
"City": "Los Angeles",
"State": "CA",
"Country": "United States",
"Pincode": "90059"
}
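The string cleansing step above can be approximated outside the platform to preview its effect on a record. The following is a minimal Python sketch, not the StringFunctionCleanser implementation: the function name strip_special_characters and the rule "keep only letters, digits, and whitespace" are assumptions chosen to reproduce the sample output.

```python
import re


def strip_special_characters(value: str) -> str:
    """Drop every character that is not an ASCII letter, digit, or whitespace.

    Assumption: this rule only mimics the sample output shown above; the
    real cleanser's behavior depends on how it is configured.
    """
    cleaned = re.sub(r"[^A-Za-z0-9\s]", "", value)
    # Collapse whitespace runs that removed characters may leave behind.
    return re.sub(r"\s+", " ", cleaned).strip()


sample_address = {
    "Address1": "12021 Wilmington (Ave) 1st$ Fl",
    "City": "Los Angeles",
    "State": "CA",
    "Country": "United States",
    "Pincode": "90059",
}

# Apply the same rule to every field of the sample record.
cleaned_address = {
    field: strip_special_characters(value)
    for field, value in sample_address.items()
}

print(cleaned_address["Address1"])  # 12021 Wilmington Ave 1st Fl
```

Only Address1 changes here, because the other fields contain no special characters.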
Address Output after Applying the Address Cleanser
Applying the AddressCleanser to the address, after the string cleanse, produces this output:
{
"PremiseNumber": "12021",
"DeliveryAddress": "12021 Wilmington Ave Fl 1",
"Address": "12021 Wilmington Ave Fl 1<BR>Los Angeles CA 90059-3019",
"DeliveryAddress1": "12021 Wilmington Ave Fl 1",
"Locality": "Los Angeles",
"AdministrativeArea": "CA",
"CountryName": "United States",
"PostalCode": "90059-3019",
"PostalCodePrimary": "90059",
"PostalCodeSecondary": "3019",
"Latitude": "33.923550",
"SubBuilding": "Fl 1",
"Premise": "12021",
"Address1": "12021 Wilmington Ave Fl 1",
"Address2": "Los Angeles CA 90059-3019",
"ISO3166-2": "US",
"ISO3166-3": "USA",
"GeoAccuracy": "P4",
"GeoDistance": "0.0",
"ISO3166-N": "840",
"Thoroughfare": "Wilmington Ave",
"Longitude": "-118.239390",
"AVC": "V44-I55-P7-100",
"PremiseNumberStatus": "fsVerifiedNoChange",
"LocalityStatus": "fsVerifiedNoChange",
"SubAdministrativeArea": "Los Angeles",
"AdministrativeAreaStatus": "fsAdded",
"ThoroughfareStatus": "fsVerifiedNoChange",
"MatchRuleLabel": "1a"
}
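The AVC value in this output is what the verificationStatusMapping parameter of the address cleanser configuration turns into a verification status. The following Python sketch shows one way such a mapping could be evaluated. The patterns are copied from the configuration on this page, but the matching rules are assumptions: the whole AVC string is tested against each regular expression, and the first status with a matching pattern wins. The function name verification_status is hypothetical.

```python
import re
from typing import Dict, List, Optional

# Patterns copied from the verificationStatusMapping parameter on this page.
VERIFICATION_STATUS_MAPPING: Dict[str, List[str]] = {
    "Verified": ["V(4|5).*"],
    "Partially Verified": ["V(1|2|3).*", "P.*"],
    "Unverified": ["U.*"],
    "Ambiguous": ["A.*"],
    "Conflict": ["C.*"],
    "Reverted": ["R.*"],
}


def verification_status(
    avc: str,
    mapping: Dict[str, List[str]] = VERIFICATION_STATUS_MAPPING,
) -> Optional[str]:
    """Return the first status whose pattern matches the whole AVC string.

    Assumption: full-string matching and first-match-wins ordering are
    illustrative choices, not documented cleanser behavior.
    """
    for status, patterns in mapping.items():
        if any(re.fullmatch(pattern, avc) for pattern in patterns):
            return status
    return None


print(verification_status("V44-I55-P7-100"))  # Verified
```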
Recommendation
To achieve the desired result, we recommend that you use a different resultingValuesSourceTypeUri (crosswalk) for the StringFunctionCleanser and the AddressCleanser in cleanseConfig. For example:
{
"cleanseConfig": {
"mappings": [
{
"uri": "configuration/entityTypes/Location/cleanse/mappings/address",
"outputMapping": [
{
"attribute": "configuration/entityTypes/Location/attributes/VerificationStatus",
"mandatory": false,
"allValues": false,
"cleanseAttribute": "VerificationStatus"
},
{
"attribute": "configuration/entityTypes/Location/attributes/VerificationStatusDetails",
"mandatory": false,
"allValues": false,
"cleanseAttribute": "VerificationStatusDetails"
},
{
"attribute": "configuration/entityTypes/Location/attributes/StateProvince",
"mandatory": false,
"allValues": false,
"cleanseAttribute": "AdministrativeArea"
},
{
"attribute": "configuration/entityTypes/Location/attributes/City",
"mandatory": false,
"allValues": false,
"cleanseAttribute": "Locality"
},
{
"attribute": "configuration/entityTypes/Location/attributes/Country",
"mandatory": false,
"allValues": false,
"cleanseAttribute": "ISO3166-2"
},
{
"attribute": "configuration/entityTypes/Location/attributes/Zip/attributes/Zip5",
"mandatory": false,
"allValues": false,
"cleanseAttribute": "PostalCodePrimary"
},
{
"attribute": "configuration/entityTypes/Location/attributes/Zip/attributes/Zip4",
"mandatory": false,
"allValues": false,
"cleanseAttribute": "PostalCodeSecondary"
},
{
"attribute": "configuration/entityTypes/Location/attributes/GeoLocation/attributes/Latitude",
"mandatory": false,
"allValues": false,
"cleanseAttribute": "Latitude"
},
{
"attribute": "configuration/entityTypes/Location/attributes/GeoLocation/attributes/Longitude",
"mandatory": false,
"allValues": false,
"cleanseAttribute": "Longitude"
},
{
"attribute": "configuration/entityTypes/Location/attributes/GeoLocation/attributes/GeoAccuracy",
"mandatory": false,
"allValues": false,
"cleanseAttribute": "GeoAccuracy"
},
{
"attribute": "configuration/entityTypes/Location/attributes/Organization",
"mandatory": false,
"allValues": false,
"cleanseAttribute": "Organization"
},
{
"attribute": "configuration/entityTypes/Location/attributes/DeliveryAddress1",
"mandatory": false,
"allValues": false,
"cleanseAttribute": "DeliveryAddress1"
},
{
"attribute": "configuration/entityTypes/Location/attributes/AddressLine1",
"mandatory": false,
"allValues": false,
"cleanseAttribute": "DeliveryAddress1"
},
{
"attribute": "configuration/entityTypes/Location/attributes/AddressLine2",
"mandatory": false,
"allValues": false,
"cleanseAttribute": "DeliveryAddress2"
},
{
"attribute": "configuration/entityTypes/Location/attributes/ISO3166-2",
"mandatory": false,
"allValues": false,
"cleanseAttribute": "ISO3166-2"
},
{
"attribute": "configuration/entityTypes/Location/attributes/ISO3166-3",
"mandatory": false,
"allValues": false,
"cleanseAttribute": "ISO3166-3"
},
{
"attribute": "configuration/entityTypes/Location/attributes/DeliveryAddress",
"mandatory": false,
"allValues": false,
"cleanseAttribute": "DeliveryAddress"
},
{
"attribute": "configuration/entityTypes/Location/attributes/PremiseNumber",
"mandatory": false,
"allValues": false,
"cleanseAttribute": "PremiseNumber"
},
{
"attribute": "configuration/entityTypes/Location/attributes/ISO3166-N",
"mandatory": false,
"allValues": false,
"cleanseAttribute": "ISO3166-N"
},
{
"attribute": "configuration/entityTypes/Location/attributes/Unmatched",
"mandatory": false,
"allValues": false,
"cleanseAttribute": "Unmatched"
},
{
"attribute": "configuration/entityTypes/Location/attributes/AVC",
"mandatory": false,
"allValues": false,
"cleanseAttribute": "AVC"
}
]
}
],
"infos": [
{
"uri": "configuration/entityTypes/Location/cleanse/infos/other",
"useInCleansing": true,
"sequence": [
{
"chain": [
{
"cleanseFunction": "Loqate",
"resultingValuesSourceTypeUri": "configuration/sources/ReltioCleanser",
"proceedOnSuccess": true,
"proceedOnFailure": false,
"mapping": {
"inputMapping": [
{
"attribute": "configuration/entityTypes/Location/attributes/AddressLine1",
"mandatory": true,
"allValues": false,
"cleanseAttribute": "Address1"
},
{
"attribute": "configuration/entityTypes/Location/attributes/AddressLine2",
"mandatory": false,
"allValues": false,
"cleanseAttribute": "Address2"
},
{
"attribute": "configuration/entityTypes/Location/attributes/Country",
"mandatory": false,
"allValues": false,
"cleanseAttribute": "Country"
},
{
"attribute": "configuration/entityTypes/Location/attributes/StateProvince",
"mandatory": false,
"allValues": false,
"cleanseAttribute": "AdministrativeArea"
},
{
"attribute": "configuration/entityTypes/Location/attributes/City",
"mandatory": false,
"allValues": false,
"cleanseAttribute": "Locality"
},
{
"attribute": "configuration/entityTypes/Location/attributes/Zip/attributes/Zip5",
"mandatory": false,
"allValues": false,
"cleanseAttribute": "PostalCode"
},
{
"attribute": "configuration/entityTypes/Location/attributes/Organization",
"mandatory": false,
"allValues": false,
"cleanseAttribute": "Organization"
}
],
"outputMappingRef": "configuration/entityTypes/Location/cleanse/mappings/address/outputMapping"
},
"params": {
"verificationStatusMapping": {
"Verified": [
"V(4|5).*"
],
"Partially Verified": [
"V(1|2|3).*",
"P.*"
],
"Unverified": [
"U.*"
],
"Ambiguous": [
"A.*"
],
"Conflict": [
"C.*"
],
"Reverted": [
"R.*"
]
}
}
},
{
"cleanseFunction": "StringFunctionCleanser",
"resultingValuesSourceTypeUri": "configuration/sources/StringCleanser",
"proceedOnSuccess": true,
"proceedOnFailure": true,
"mapping": {
"inputMapping": [
{
"attribute": "configuration/entityTypes/Location/attributes/AddressLine1",
"mandatory": true,
"allValues": false,
"cleanseAttribute": "AddressLine1"
},
{
"attribute": "configuration/entityTypes/Location/attributes/AddressLine2",
"mandatory": true,
"allValues": false,
"cleanseAttribute": "AddressLine2"
},
{
"attribute": "configuration/entityTypes/Location/attributes/City",
"mandatory": true,
"allValues": false,
"cleanseAttribute": "City"
}
],
"outputMapping": [
{
"attribute": "configuration/entityTypes/Location/attributes/AddressLine1",
"mandatory": true,
"allValues": false,
"cleanseAttribute": "AddressLine1"
},
{
"attribute": "configuration/entityTypes/Location/attributes/AddressLine2",
"mandatory": true,
"allValues": false,
"cleanseAttribute": "AddressLine2"
},
{
"attribute": "configuration/entityTypes/Location/attributes/City",
"mandatory": true,
"allValues": false,
"cleanseAttribute": "City"
}
]
},
"params": {
"casing": "Title"
}
}
]
}
]
}
]
}
}
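Before deploying a configuration like this one, it can help to check that every cleanser in a chain writes to its own crosswalk. The following Python sketch walks a cleanseConfig dictionary shaped like the example above and reports whether the resultingValuesSourceTypeUri values are distinct. The helper names are hypothetical, and the inline configuration is trimmed to the fields the check needs.

```python
from typing import Any, Dict, List, Optional, Tuple


def source_uris_by_function(
    cleanse_config: Dict[str, Any],
) -> List[Tuple[str, Optional[str]]]:
    """Collect (cleanseFunction, resultingValuesSourceTypeUri) per chain step."""
    pairs = []
    for info in cleanse_config.get("infos", []):
        for sequence_item in info.get("sequence", []):
            for step in sequence_item.get("chain", []):
                pairs.append(
                    (step["cleanseFunction"], step.get("resultingValuesSourceTypeUri"))
                )
    return pairs


def has_distinct_source_uris(cleanse_config: Dict[str, Any]) -> bool:
    """Return True when no two chain steps share a source type URI."""
    uris = [uri for _, uri in source_uris_by_function(cleanse_config)]
    return len(uris) == len(set(uris))


# Trimmed copy of the example configuration: only the fields used by the check.
cleanse_config = {
    "infos": [
        {
            "sequence": [
                {
                    "chain": [
                        {
                            "cleanseFunction": "Loqate",
                            "resultingValuesSourceTypeUri": "configuration/sources/ReltioCleanser",
                        },
                        {
                            "cleanseFunction": "StringFunctionCleanser",
                            "resultingValuesSourceTypeUri": "configuration/sources/StringCleanser",
                        },
                    ]
                }
            ]
        }
    ]
}

print(has_distinct_source_uris(cleanse_config))  # True
```

In practice, the dictionary would be loaded from the full configuration file with json.load rather than written inline.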
String and address cleanser parameters
The following table explains how the configuration in cleanseConfig works:
Cleanse Configuration Parameter | Description |
---|---|
"resultingValuesSourceTypeUri": "configuration/sources/StringCleanser" | The StringCleanser crosswalk is used for string cleansing. Its output is the address with special characters removed by the StringFunctionCleanser. |
"resultingValuesSourceTypeUri": "configuration/sources/ReltioCleanser" | The ReltioCleanser crosswalk uses the cleansed value from the StringCleanser crosswalk as its source. The AddressCleanser then cleanses the entire address for verification and geolocation. |
To apply the AddressCleanser before applying the StringFunctionCleanser, place the AddressCleanser first in the config sequence, followed by the StringFunctionCleanser. We recommend configuring the first cleanser in the chain with proceedOnSuccess set to true and proceedOnFailure set to false. This ensures that the chain continues only if the first cleanser succeeds, so a failure in the first step does not affect subsequent cleansing steps.
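The effect of proceedOnSuccess and proceedOnFailure can be pictured with a small simulation. The following Python sketch is an assumed model of the chain semantics described above, not the platform's implementation: each step is a plain function, a raised exception counts as failure, and the flags decide whether the next step runs. The names run_chain, failing_address_cleanser, and title_case_cleanser are hypothetical.

```python
from typing import Any, Dict, List, Tuple

Record = Dict[str, str]


def run_chain(record: Record, steps: List[Dict[str, Any]]) -> Tuple[Record, List[str]]:
    """Run the steps in order and return the record plus the names of steps that ran."""
    executed = []
    for step in steps:
        executed.append(step["name"])
        try:
            record = step["function"](record)
            succeeded = True
        except Exception:
            # Assumption: any exception models a cleanser failure.
            succeeded = False
        proceed = step["proceedOnSuccess"] if succeeded else step["proceedOnFailure"]
        if not proceed:
            break
    return record, executed


def failing_address_cleanser(record: Record) -> Record:
    """Stand-in for an address cleanser that cannot verify the address."""
    raise ValueError("address could not be verified")


def title_case_cleanser(record: Record) -> Record:
    """Stand-in for a string cleanser configured with title casing."""
    return {field: value.title() for field, value in record.items()}


steps = [
    {
        "name": "AddressCleanser",
        "function": failing_address_cleanser,
        "proceedOnSuccess": True,
        "proceedOnFailure": False,
    },
    {
        "name": "StringFunctionCleanser",
        "function": title_case_cleanser,
        "proceedOnSuccess": True,
        "proceedOnFailure": True,
    },
]

result, executed = run_chain({"City": "los angeles"}, steps)
print(executed)  # ['AddressCleanser']
```

Because the first step fails and its proceedOnFailure flag is false, the second step never runs and the record is returned unchanged.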