Search modes for Index rules - 7.3

Standardization

Version: 7.3
Language: English
Product: Talend Big Data, Talend Big Data Platform, Talend Data Fabric, Talend Data Integration, Talend Data Management Platform, Talend Data Services Platform, Talend ESB, Talend MDM Platform, Talend Real-Time Big Data Platform
Module: Talend Studio
Last publication date: 2024-02-21

Index rules are one type of advanced rule used by the tStandardizeRow component. They use synonym indexes as a reference to search for matching data.

Without a way to specify the type of match (exact, partial, fuzzy, and so on) to apply to the input flow, an Index rule would not standardize and output the data you expect. tStandardizeRow therefore lets you select one of the following search modes for each Index rule you define in the component:

Search mode: Description
Match all: Each word of the input string must exist in the index string, but the index string may contain other words too.
Match all fuzzy: Each word of the input string must be similar to a word of the index string.
Match any: At least one word of the input string must match a word of the index string.
Match any fuzzy: At least one word of the input string must be similar to a word of the index string.
Match exact: The whole input string must match the whole index string exactly.
Match partial: The words of the input string must exist in the index string, except that the input string may contain other words up to a given limit (1 by default). With the default limit, one word of the input string may have no match in the index string.
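The search modes above can be read as simple word-level tests between an input string and one index string. The following Python sketch is only an illustrative model of those descriptions, not the component's implementation: the function names, the whitespace tokenization, the case-insensitive comparison, and the use of `difflib` with a 0.8 threshold as the "similar" test are all assumptions made for this example.

```python
from difflib import SequenceMatcher


def words(text):
    """Lowercase and split on whitespace (assumed tokenization)."""
    return text.lower().split()


def similar(a, b, threshold=0.8):
    """Assumed stand-in for a fuzzy comparison between two words."""
    return SequenceMatcher(None, a, b).ratio() >= threshold


def match_all(inp, idx):
    # Every input word exists in the index string; extra index words are fine.
    idx_words = set(words(idx))
    inp_words = words(inp)
    return bool(inp_words) and all(w in idx_words for w in inp_words)


def match_all_fuzzy(inp, idx):
    # Every input word is similar to some word of the index string.
    idx_words = words(idx)
    inp_words = words(inp)
    return bool(inp_words) and all(
        any(similar(w, i) for i in idx_words) for w in inp_words
    )


def match_any(inp, idx):
    # At least one input word exists in the index string.
    return bool(set(words(inp)) & set(words(idx)))


def match_any_fuzzy(inp, idx):
    # At least one input word is similar to some word of the index string.
    idx_words = words(idx)
    return any(similar(w, i) for w in words(inp) for i in idx_words)


def match_exact(inp, idx):
    # The whole input string equals the whole index string.
    return words(inp) == words(idx)


def match_partial(inp, idx, limit=1):
    # Up to `limit` input words may be missing from the index string,
    # but at least one input word must still be found there.
    idx_words = set(words(idx))
    inp_words = words(inp)
    unmatched = [w for w in inp_words if w not in idx_words]
    return len(unmatched) <= limit and len(unmatched) < len(inp_words)


index_string = "Extra Deep Base"
print(match_all("deep base", index_string))            # True
print(match_all("deep blue base", index_string))       # False
print(match_all_fuzzy("extr basse", index_string))     # True
print(match_any_fuzzy("dulux extr handle", index_string))  # True
print(match_exact("Extra Deep", index_string))         # False
print(match_partial("extra deep red", index_string))   # True
print(match_partial("extra dark red", index_string))   # False
```

In this model, the non-fuzzy modes differ only in how many input words must be found in the index string, while the fuzzy modes replace word equality with a similarity test. The real component may tokenize and score words differently.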

Suppose, for example, that you have the following record in the input flow:
DULUX PAINTPOD EXTRA REACH HANDLE

Suppose also that you have created a color index that contains the string Extra Deep Base.

If you define an Index rule in tStandardizeRow and set its search mode to Match any, the component returns Extra Deep Base as the color for this record, because the word Extra in the record matches a word of the index string. If you want the component to return a match only when the exact search string is found in the index, set the search mode of the rule to Match exact; the component then returns no color for this record.
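The difference between the two modes for this record can be traced with a few lines of Python. This is a minimal sketch of the reasoning only, assuming whitespace tokenization and case-insensitive comparison; the variable and function names are hypothetical and do not come from Talend.

```python
record = "DULUX PAINTPOD EXTRA REACH HANDLE"
index_entry = "Extra Deep Base"


def tokens(text):
    """Lowercase and split on whitespace (assumed tokenization)."""
    return text.lower().split()


# Words of the record that also appear in the index entry.
shared = [w for w in tokens(record) if w in tokens(index_entry)]

# Match any: one shared word is enough to return the index entry.
any_hit = bool(shared)

# Match exact: the whole record must equal the whole index entry.
exact_hit = tokens(record) == tokens(index_entry)

print(shared)     # ['extra']
print(any_hit)    # True
print(exact_hit)  # False
```

Only the word Extra is shared, which is enough for Match any but not for Match exact.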

For a Job example, see Extracting exact match by using Index rules.