NPI Cleansers

Learn about the prebuilt cleansers to cleanse and standardize the source data.

The NPI data tenant has prebuilt cleansers to cleanse and standardize the source data. These cleansers detect data quality issues and correct inaccurate, incomplete, and irrelevant parts of the data. After cleansing, the data set is more consistent, enhanced, standardized, and complete.

The NPI data tenant has the following prebuilt cleansers:

Table 1. NPI Cleansers
Cleanser Name Function
hcpNameCleanser

It cleanses the legal and alternate first name, middle name, and last name of healthcare providers by applying logic such as:

  • Keep " . " (dot) and " - " (hyphen), but strip all other special and numeric characters from a name. As the example shows, a stripped character leaves a space in its place.

    For example:

    Before running the hcpNameCleanser, the HCP name is "Mariana9Arcila-Mesa".

    After running the hcpNameCleanser, the HCP name is "Mariana Arcila-Mesa".

  • Treat a space between two letters as a separator between name parts (first name, middle name, and last name), and automatically capitalize the first letter after the space.

    For example:

    Before running the hcpNameCleanser, the HCP name is "Mariana arcila-Mesa".

    After running the hcpNameCleanser, the HCP name is "Mariana Arcila-Mesa".

  • Capitalize the first letter of the name.

    For example:

    Before running the hcpNameCleanser, the HCP name is "mariana Arcila-Mesa".

    After running the hcpNameCleanser, the HCP name is "Mariana Arcila-Mesa".
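
The three rules above can be sketched in Python as follows. This is only an illustration of the described behavior, not the cleanser's actual implementation; the function name `clean_hcp_name` is hypothetical, and the sketch assumes that every stripped character is replaced by a space and that repeated spaces are collapsed.

```python
def clean_hcp_name(name: str) -> str:
    """Hypothetical sketch of the hcpNameCleanser rules described above."""
    # Keep letters, dots, hyphens, and spaces; replace anything else with a space.
    kept = "".join(ch if ch.isalpha() or ch in ".- " else " " for ch in name)
    # Treat spaces as separators between name parts (assumption: collapse repeats).
    parts = kept.split()
    # Capitalize the first character of each part; leave the rest unchanged.
    return " ".join(part[0].upper() + part[1:] for part in parts)


print(clean_hcp_name("Mariana9Arcila-Mesa"))  # Mariana Arcila-Mesa
print(clean_hcp_name("Mariana arcila-Mesa"))  # Mariana Arcila-Mesa
print(clean_hcp_name("mariana Arcila-Mesa"))  # Mariana Arcila-Mesa
```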

hcoNameCleanser

It cleanses the legal business name of healthcare organizations by applying logic such as:

  • Retain numeric characters.

  • Remove mathematical symbols such as " {}, [], and () " (brackets), " % " (percent), " + " (plus), " > " (greater than), " < " (less than), and " = " (equal to) from a name.

    For example:

    Before running the hcoNameCleanser, the HCO name is "Healthstat%Onsite%Clinic<Parkdale=Plant 23".

    After running the hcoNameCleanser, the HCO name is "Healthstat Onsite Clinic Parkdale Plant 23".

  • Retain certain special characters such as " # " (hash), " . " (dot), " - " (hyphen), and " ' " (apostrophe) in a name.

    For example:

    Before running the hcoNameCleanser, the HCO name is "Community;Medical!Services%Montana-Private LLC".

    After running the hcoNameCleanser, the HCO name is "Community Medical Services Montana-Private LLC".
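
A minimal Python sketch of these rules is shown below. It is an illustration only: the function name `clean_hco_name` is hypothetical, and the sketch assumes an allowlist (letters, digits, hash, dot, hyphen, apostrophe, and space) with every other character replaced by a space, which is consistent with the examples above but may not match the cleanser's full character set.

```python
def clean_hco_name(name: str) -> str:
    """Hypothetical sketch of the hcoNameCleanser rules described above."""
    allowed_specials = "#.-' "  # assumption: only these special characters are kept
    cleaned = "".join(
        ch if ch.isalnum() or ch in allowed_specials else " " for ch in name
    )
    # Collapse the spaces left behind by removed characters.
    return " ".join(cleaned.split())


print(clean_hco_name("Healthstat%Onsite%Clinic<Parkdale=Plant 23"))
# Healthstat Onsite Clinic Parkdale Plant 23
print(clean_hco_name("Community;Medical!Services%Montana-Private LLC"))
# Community Medical Services Montana-Private LLC
```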

toTitleCase

It cleanses the legal and alternate prefix and suffix text used with a healthcare provider's name. It also cleanses the address attributes of healthcare providers and healthcare organizations. The logic applied is:

  • Capitalize the first letter that follows any of these special characters: " - " (hyphen), " , " (comma), " ; " (semicolon), " : " (colon), " . " (dot), " & " (ampersand), " / " (forward slash), " \ " (backslash), " () " (brackets), and " "" " (double quotation marks).

    For example:

    Before running the toTitleCase, the HCP suffix is "-jr".

    After running the toTitleCase, the HCP suffix is "Jr".
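
The stated rule can be sketched in Python as below. This is a hedged illustration of the capitalization rule only; the function name `capitalize_after_delimiters` is hypothetical. The sketch does not remove the delimiter itself, so for "-jr" it yields "-Jr", whereas the example above shows "Jr"; whether the cleanser also strips a leading delimiter is not stated here.

```python
import re

# Delimiters listed in the rule: - , ; : . & / \ ( ) "
_DELIMITER_THEN_CHAR = re.compile(r'([-,;:.&/\\()"])(\w)')


def capitalize_after_delimiters(text: str) -> str:
    """Hypothetical sketch: uppercase the character right after a listed delimiter."""
    return _DELIMITER_THEN_CHAR.sub(
        lambda m: m.group(1) + m.group(2).upper(), text
    )


print(capitalize_after_delimiters("-jr"))          # -Jr
print(capitalize_after_delimiters("north/south"))  # north/South
```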

cleansedCredential

It cleanses the credentials of healthcare providers by applying logic such as:

  • Remove " . " (dot).

    For example:

    Before running the cleansedCredential, the HCP credential is "M.D.".

    After running the cleansedCredential, the HCP credential is "MD".

  • Treat the text as multiple credentials if a " - " (hyphen), " , " (comma), or space appears between credential letters. Two or more letters must follow the hyphen, comma, or space for the split to apply. Separate the resulting credentials with a " | " (pipe) symbol.

    For example:

    Before running the cleansedCredential, the HCP credential is "MD,LCSW-RN".

    After running the cleansedCredential, the HCP credentials are "MD | LCSW | RN".
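
The two credential rules can be sketched in Python as follows. This is an illustration under stated assumptions, not the cleanser's implementation: the function name `clean_credential` is hypothetical, and the sketch assumes that "letters" means ASCII letters and that a separator followed by fewer than two letters is left in place.

```python
import re

# Split at a run of hyphens, commas, or spaces only when two or more letters follow.
_SPLIT_POINT = re.compile(r"[-, ]+(?=[A-Za-z]{2,})")


def clean_credential(raw: str) -> str:
    """Hypothetical sketch of the cleansedCredential rules described above."""
    without_dots = raw.replace(".", "")
    parts = [p.strip() for p in _SPLIT_POINT.split(without_dots)]
    return " | ".join(p for p in parts if p)


print(clean_credential("M.D."))        # MD
print(clean_credential("MD,LCSW-RN"))  # MD | LCSW | RN
```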

phoneCleanser

It validates the phone number. If the number is valid, it populates additional attributes.

For example:

Before running the phoneCleanser, the phone number is "(603) 382-4972 Business".

After running the phoneCleanser, the validation status is VALID. The additional attributes that are populated are Country Code, Area Code, Local Number, Line Type, Format Mask, Digit Count, Geo Area, and Geo Country.
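
To give a rough idea of how a few of these attributes relate to the raw value, the sketch below splits a 10-digit North American number into its parts. It is a format-only illustration, not real phone validation: the function name `parse_nanp_phone`, the dictionary keys, and the "INVALID" status value are all assumptions, and attributes such as Line Type, Format Mask, Geo Area, and Geo Country are not derived here.

```python
import re


def parse_nanp_phone(raw: str) -> dict:
    """Hypothetical sketch: split a North American number into a few attributes."""
    digits = re.sub(r"\D", "", raw)  # drop punctuation and labels such as "Business"
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # drop a leading country code
    if len(digits) != 10:
        return {"validation_status": "INVALID"}  # assumed status value
    return {
        "validation_status": "VALID",
        "country_code": "1",
        "area_code": digits[:3],
        "local_number": digits[3:],
        "digit_count": len(digits),
    }


print(parse_nanp_phone("(603) 382-4972 Business"))
```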

addressCleanser