
Inspections for Useless and Redundant Match Groups

This topic describes the inspections that identify useless and redundant match groups.

Overview

The following inspections are available:
  • useless match group - If a match group does not appear in any match document according to the matchDocumentsPerMatchGroup analysis, the match group is probably not being used.
  • redundant match group - If two match groups are highly or fully correlated according to the correlation calculator of the matchDocumentMatches analysis, the two match groups are very likely duplicates.

Inspection for Useless Match Groups

This inspection checks whether each match group appears in at least one match document. The inspection applies to single match groups and is based on the frequencies calculator in the matchDocumentsPerMatchGroup analysis. If the frequency of a match group is zero, then that match group does not appear in any match document of the processed subset of entities.
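
The zero-frequency check can be sketched in Python as follows. This is only an illustration, not the product's implementation: it assumes the frequencies are available as a mapping from match group URI to the number of match documents, and the function name and example URIs are hypothetical.

```python
from typing import Dict, List


def find_useless_match_groups(frequencies: Dict[str, int]) -> List[str]:
    """Return the URIs of match groups whose frequency is zero."""
    return sorted(uri for uri, count in frequencies.items() if count == 0)


# Hypothetical frequencies, keyed by match group URI.
example_frequencies = {
    "configuration/entityTypes/Individual/matchGroups/ByName": 120,
    "configuration/entityTypes/Individual/matchGroups/ByEmail": 0,
    "configuration/entityTypes/Individual/matchGroups/ByPhone": 0,
}

useless = find_useless_match_groups(example_frequencies)

# Any zero-frequency match group yields a WARNING; otherwise the
# inspection produces an INFO output.
severity = "WARNING" if useless else "INFO"
```

In this sketch, the `useless` list plays the role of the matchGroups field in the inspection result details.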

Note: There are no parameters for this inspection.
Table 1. Inspection Result Details
Parameter Description
matchGroups The list of match group URIs that did not participate in matching.
Table 2. Inspection Summary
Parameter Description
Text One or more match groups were not able to process ANY entities from the tenant.
Formatted Text One or more match groups were not able to process ANY entities from the tenant.
Severity WARNING
Parameters Id: 0, Type: Number, Value: 2 (not more than the number of match groups in the specified entity type)
Table 3. Inspection Explanation
Parameter Description
Text {0} match groups were not able to process ANY entities. Reasons include entities that are missing attributes defined in the rule, and entities that are filtered based on 'equals' and 'notEquals' logic within the rule.
Formatted Text {0} match groups were not able to process ANY entities. Reasons include entities that are missing attributes defined in the rule, and entities that are filtered based on 'equals' and 'notEquals' logic within the rule.
Parameters Id: 0, Type: number, Value: 2 (number of match groups)
Table 4. Inspection Recommendation
Parameter Description
Text Review your match group comparison formulae against the attribution found within your entities. Data profiling can be helpful here.
Formatted Text Review your match group comparison formulae against the attribution found within your entities. Data profiling can be helpful here.

Information Output

If no errors or warnings are reported, then the inspection results in an information output.

Table 5. Summary Field
Parameter Description
Text All match groups are participating in the matching process.
Severity INFO
Table 6. Explanation Field
Parameter Description
Text Great job! All match groups are processing some or all of the entities in the tenant.
Formatted Text Great job! All match groups are processing some or all of the entities in the tenant.
Table 7. Recommendation Field
Parameter Description
Text Keep going! Your match group definitions and entity attribution are aligned well.
Formatted Text Keep going! Your match group definitions and entity attribution are aligned well.

Inspection for Redundant Match Groups

This inspection checks whether any two match groups are highly or fully correlated. The inspection is based on the correlation calculator of the matchDocumentMatches analysis. If a correlation is close to 1 (higher than the thresholds listed in Table 8), then the inspection results in an error or a warning.

Table 8. Inspection Parameters
Parameter Value Description
thresholdError 0.95 Threshold for the ERROR severity. If any correlation value is higher than this value, then the issue has an ERROR severity.
thresholdWarning 0.90 Threshold for the WARNING severity. If any correlation value is higher than this value and no correlation value is higher than thresholdError, then the issue has a WARNING severity.
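
The way the two thresholds map to a severity can be sketched as below. This is a minimal illustration under stated assumptions: the function name is hypothetical, the correlations are assumed to be plain numbers, and "higher than" is read as a strict comparison, so a correlation exactly equal to a threshold does not exceed it.

```python
from typing import Iterable


def classify_severity(
    correlations: Iterable[float],
    threshold_error: float = 0.95,
    threshold_warning: float = 0.90,
) -> str:
    """Map pairwise correlation values to ERROR, WARNING, or INFO."""
    # Only the highest correlation matters for the overall severity.
    highest = max(correlations, default=0.0)
    if highest > threshold_error:
        return "ERROR"
    if highest > threshold_warning:
        return "WARNING"
    return "INFO"


severity_high = classify_severity([0.50, 0.97])
severity_medium = classify_severity([0.92, 0.30])
severity_none = classify_severity([0.10, 0.40])
```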
Table 9. Inspection Result Details
Parameter Description
total The number of match group pairs whose correlation exceeds the threshold (the size of the array in the results field).
results Array of JSON objects having the following three fields:
  • matchGroup1 - URI of the first match group
  • matchGroup2 - URI of the second match group
  • correlation - Value of the correlation between the two match groups (number)
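
The shape of the result details can be sketched as follows. The field names total, results, matchGroup1, matchGroup2, and correlation come from the table above; everything else (the function name, the input mapping, the example URIs, and which threshold is passed in) is an assumption made for illustration.

```python
import json
from typing import Dict, Tuple


def build_result_details(
    pair_correlations: Dict[Tuple[str, str], float],
    threshold: float,
) -> dict:
    """Collect the match group pairs whose correlation exceeds the threshold."""
    results = [
        {"matchGroup1": group1, "matchGroup2": group2, "correlation": value}
        for (group1, group2), value in pair_correlations.items()
        if value > threshold
    ]
    # total is defined as the size of the results array.
    return {"total": len(results), "results": results}


# Hypothetical correlations between pairs of match group URIs.
example_pairs = {
    ("matchGroups/ByName", "matchGroups/ByNameAndCity"): 0.97,
    ("matchGroups/ByName", "matchGroups/ByEmail"): 0.42,
}

details = build_result_details(example_pairs, threshold=0.90)
details_json = json.dumps(details, indent=2)
```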
Table 10. Inspection Summary
Parameter Description
Text {0} match rules are seemingly redundant with one or more other match rules.
Formatted Text 4 match rules are seemingly redundant with one or more other match rules.
Severity ERROR (if thresholdError is exceeded), WARNING (if only thresholdWarning is exceeded)
Parameters Id: 0, Type: Number, Value: 4
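
The relationship between the Text, Parameters, and Formatted Text rows can be sketched as a positional substitution. How the product actually renders these messages is not documented here; the sketch assumes that each `{n}` placeholder is replaced by the value of the parameter with Id n, that parameter Ids are contiguous starting at 0, and it uses a hypothetical function name.

```python
from typing import Any, Dict, List


def render_formatted_text(text: str, parameters: List[Dict[str, Any]]) -> str:
    """Substitute {0}, {1}, ... with parameter values ordered by Id."""
    ordered = sorted(parameters, key=lambda parameter: parameter["id"])
    # str.format ignores positional arguments that the template does not use.
    return text.format(*(parameter["value"] for parameter in ordered))


summary_text = (
    "{0} match rules are seemingly redundant with one or more other match rules."
)
summary_parameters = [{"id": 0, "type": "Number", "value": 4}]

formatted = render_formatted_text(summary_text, summary_parameters)
```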
Table 11. Inspection Explanation
Parameter Description
Text The analysis shows that {0} match rules (rows) are producing largely the same match pairs as one or more other match rules. Rules felt to be redundant to one another are indicated by the dark red intersection of a row and column.
Formatted Text The analysis shows that {0} match rules (rows) are producing largely the same match pairs as one or more other match rules. Rules felt to be redundant to one another are indicated by the dark red intersection of a row and column.
Parameters
  • Id: 0, Type: number, Value: 4
  • Id: 1, Type: number, Value: 0.95
Table 12. Inspection Recommendation
Parameter Description
Text If you agree with the explanation, make modifications to broaden the range of tactics employed across the set of match rules. This usually increases the holistic effectiveness of the set.
Formatted Text If you agree with the explanation, make modifications to broaden the range of tactics employed across the set of match rules. This usually increases the holistic effectiveness of the set.

Information Output

If no errors or warnings are reported, then the inspection results in an information output.

Table 13. Summary Field
Parameter Description
Text Match group tactics are evenly distributed.
Severity INFO
Table 14. Explanation Field
Parameter Description
Text The analysis shows that your match groups have a nicely distributed set of tactics, each of which is evaluating candidate pairs differently than the others.
Formatted Text The analysis shows that your match groups have a nicely distributed set of tactics, each of which is evaluating candidate pairs differently than the others.
Table 15. Recommendation Field
Parameter Description
Text Keep going! It appears that your match groups employ a variety of tactics and are not redundant.