
Fuzzy Filtering

You can apply fuzzy search logic using Fuzzy filtering.

Fuzzy Search Background

The fuzzy match operator matches terms even when there are minor differences between the query and the indexed data. This lets you find appropriate matches even when the query contains a typographical error or a minor spelling mistake.

Fuzzy search is a type of full-text search. It looks for each term in the query across all the values of the requested attributes.

A drawback of the fuzzy match operator is that two similar terms with different meanings can produce false matches.

For example, the words bread and read are matched by the fuzzy operator because they differ by only one character, although they are semantically unrelated.

Fuzzy Matching Algorithm

The fuzzy match operator uses Damerau-Levenshtein edit distance to determine whether a search term matches a term in a document. The distance is the number of single-character changes required to make the query string match the indexed term. Changes include inserting a single character, deleting a single character, substituting a single character, and transposing two adjacent characters.

The number of changes allowed to be considered a match depends on the length of the search term:

  • Search terms one to two characters long must match exactly.
  • Search terms three to five characters long match if at most one change is required.
  • Search terms six characters or longer match if at most two changes are required.

Because the operator uses Damerau-Levenshtein edit distance, fuzzy transpositions are enabled: swapping two neighbouring letters counts as a single edit. For example, if the original value is Alexander and you search for alexnader (an swapped to na), the difference is counted as a single edit.

For example, the search term Smith is five characters long, so it matches only terms that differ from it by a single change. It matches Smth (one character, i, deleted), Smiths (one character, s, inserted at the end), and Simth (one pair of adjacent characters, m and i, transposed).
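The length-based thresholds and the single-edit transposition described above can be sketched in Python. This is only an illustration, not the product's implementation: the function names are hypothetical, the distance uses the restricted (optimal string alignment) variant of Damerau-Levenshtein, and the case-insensitive comparison is an assumption.

```python
def osa_distance(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions, substitutions, and
    swaps of two adjacent characters (optimal string alignment variant)."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all i characters of a
    for j in range(cols):
        d[0][j] = j  # insert all j characters of b
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution (or exact match)
            )
            # Two adjacent characters swapped: counted as one edit.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[len(a)][len(b)]


def allowed_edits(search_term: str) -> int:
    """Number of changes tolerated for a search term of this length."""
    n = len(search_term)
    if n <= 2:
        return 0  # one to two characters: exact match only
    if n <= 5:
        return 1  # three to five characters: one change
    return 2      # six or more characters: up to two changes


def fuzzy_term_match(search_term: str, indexed_term: str) -> bool:
    # Assumption: comparison ignores case.
    s, t = search_term.lower(), indexed_term.lower()
    return osa_distance(s, t) <= allowed_edits(s)


for candidate in ("Smth", "Smiths", "Simth"):
    print(candidate, fuzzy_term_match("Smith", candidate))  # True for each
```

With this sketch, bread and read also match, because the five-character search term tolerates one change. This is the false-match risk mentioned earlier.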

If a request includes multiple terms, fuzzy search tries to match each of the terms using fuzzy logic.

Note: The terms can all be in one attribute value or spread across multiple attribute values. If the request includes one or two terms, every term must match. If the request includes more than two terms, at least 80% of the terms must match.

Example: An entity with two values for Name attribute

Consider an entity with two values for the Name attribute.


   "Name": [
      {"value": "Atticus Constantine Dominic Gabriel"},
      {"value": "Harrison Jamison Jonathan Julian Hamilton"}
   ]

Each term is matched using fuzzy logic, even when the terms are found in different values of the Name attribute. For example, the following filter finds the entity:

...?filter=fuzzy(attributes.Name,'AttiKuZ HamElDon')

In the following filter, the entity is not found, because both terms must match:

...?filter=fuzzy(attributes.Name,'NotExistingName Hamilton')

Similarly, the entity is not found with the following filter, because only five of the eight terms match, which is fewer than 80%:

...?filter=fuzzy(attributes.Name,'AttiKuZ JamiCoM DomEniK CabrEel HamElDon NotExistingName1 AnotherName WhoIsThat')
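The multi-term rule can be sketched as follows. The function names are hypothetical, and two details are assumptions because the text does not specify them: values are split into terms on whitespace, and the 80% threshold is rounded up. To keep the sketch short, the per-term comparison defaults to a case-insensitive exact match; a fuzzy comparison would be passed in as term_matches.

```python
from typing import Callable, Iterable


def required_matches(term_count: int) -> int:
    """How many query terms must match for the entity to be returned."""
    if term_count <= 2:
        return term_count  # one or two terms: all must match
    # More than two terms: 80% must match (rounding up is an assumption).
    return (term_count * 80 + 99) // 100


def entity_matches(
    query: str,
    values: Iterable[str],
    term_matches: Callable[[str, str], bool] = lambda q, t: q.lower() == t.lower(),
) -> bool:
    # Pool the terms of every attribute value: a query term may be found
    # in any of the values, not only in a single one.
    indexed = [token for value in values for token in value.split()]
    terms = query.split()
    hits = sum(1 for q in terms if any(term_matches(q, t) for t in indexed))
    return bool(terms) and hits >= required_matches(len(terms))


names = [
    "Atticus Constantine Dominic Gabriel",
    "Harrison Jamison Jonathan Julian Hamilton",
]
print(entity_matches("Atticus Hamilton", names))          # True
print(entity_matches("NotExistingName Hamilton", names))  # False
```

With eight query terms, this sketch requires seven matches, so a request in which only five terms match is rejected.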

Search URLs

The fuzzy match operator can be specified in the search URL. It compares the specified attribute to the given value. Fuzzy search logic can be applied to searches on a name, an address, or any other string attribute.

...?filter=fuzzy(<attribute>,<value>)

The following request finds entities whose Name attribute is close to the value Smith.

{{TenantURL}}/entities?filter=fuzzy(attributes.Name,'Smith')

The following, more advanced, example shows a URL that fetches entities of type Individual with a fuzzy match on the last name Smith and a first name that starts with Jim.

{{TenantURL}}/entities?filter=equals(type,'configuration/entityTypes/Individual')%20and%20fuzzy(attributes.LastName,'Smith')%20and%20startsWith(attributes.FirstName,'Jim')&select=uri,label,type,attributes.FirstName,attributes.Address,attributes.LastName,secondaryLabel,defaultProfilePicValue&max=25&offset=0&scoreEnabled=false
Note: Currently, fuzzy search is supported only by Search API calls.
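Search URLs like the ones above can be assembled programmatically. The following Python sketch uses only the standard library; the helper names and the tenant URL are hypothetical placeholders, and it does not escape quote characters inside values. Spaces are percent-encoded as %20, as in the advanced example.

```python
from urllib.parse import quote, urlencode


def fuzzy_filter(attribute: str, value: str) -> str:
    """Build a fuzzy(<attribute>,<value>) filter expression."""
    return f"fuzzy({attribute},'{value}')"


def build_search_url(tenant_url: str, filter_expr: str, **params) -> str:
    """Append the filter and any extra query parameters to /entities."""
    query = {"filter": filter_expr, **params}
    # Keep parentheses, commas, quotes, and slashes readable;
    # everything else (including spaces) is percent-encoded.
    encoded = urlencode(query, safe="(),'/", quote_via=quote)
    return f"{tenant_url.rstrip('/')}/entities?{encoded}"


tenant = "https://tenant.example"  # placeholder for {{TenantURL}}

print(build_search_url(tenant, fuzzy_filter("attributes.Name", "Smith")))

combined = " and ".join([
    "equals(type,'configuration/entityTypes/Individual')",
    fuzzy_filter("attributes.LastName", "Smith"),
    "startsWith(attributes.FirstName,'Jim')",
])
print(build_search_url(tenant, combined, max=25, offset=0))
```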

Additional Examples

Table 1 shows additional examples of the fuzzy match operator and whether each filter matches the text alexander smith.

Table 1. Fuzzy Match Operators and Results

Filter                                Error                       Matches
fuzzy(attributes,'alexander')         Original term               Y
fuzzy(attributes,'aleksander')        ‘x’ → ‘ks’                  Y
fuzzy(attributes,'ALEKSANDER')        Upper case                  Y
fuzzy(attributes,'alexandr')          Missing ‘e’ at end          Y
fuzzy(attributes,'alexnader')         Transpose (‘an’ → ‘na’)     Y
fuzzy(attributes,'alexanrd')          Too many errors             N
fuzzy(attributes,'aleksander smith')  Original text with 2 words  Y
fuzzy(attributes,'aleksander brown')  Wrong second term           N
fuzzy(attributes,'alexandr smit')     One error in each word      Y

Fuzzy Filtering

Fuzzy search logic can be applied to searches on a name, an address, or any other String attribute. When you enter attribute values, fuzzy logic is used to find entities whose values are variations of the search keywords.

Fuzzy Search Example

For example, suppose you have records with the following corporation names:

  • National Corporation
  • China National Petroleum Corporation

When you search for National Corporation, both records are displayed as search results even if you have mistyped two letters (for example, Natonal Corpration). National Corporation is placed at the top of the list because it is the closer match.

Search Before Create API

The fuzzy search option for Search Before Create (SBC) lets you perform a search even if the search values contain mistakes. SBC can still return results despite typographical errors in the values you entered.

For example, when you use the useFuzzy option, entities are returned even if there is an error in the filter value, as shown in the following request:

Request

{{sbc-url}}/{{tenantId}}/{{profileName}}/search?searchIn=ct&sObject=Account&recordTypeId={{recordTypeId}}&filter=FirstName:Ttayana&options=useFuzzy