tMatchGroup - 7.0

Data matching

author
Talend Documentation Team
EnrichVersion
7.0
EnrichProdName
Talend Big Data Platform
Talend Data Fabric
Talend Data Management Platform
Talend Data Services Platform
Talend MDM Platform
Talend Real-Time Big Data Platform
task
Data Governance > Third-party systems > Data Quality components > Matching components > Data matching components
Data Quality and Preparation > Third-party systems > Data Quality components > Matching components > Data matching components
Design and Development > Third-party systems > Data Quality components > Matching components > Data matching components
EnrichPlatform
Talend Studio

Creates groups of similar data records in any source data, including large volumes of data, by using one or several match rules.

tMatchGroup compares columns in both standard input data flows and M/R input data flows by using matching methods, and groups the similar records it encounters together as duplicates.

Several tMatchGroup components can be used sequentially to match data against different blocking keys. Each component in the sequence partitions the data differently, so its blocks overlap the blocks used by the previous components, and this refines the groups it receives.
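The effect of chaining passes with different blocking keys can be illustrated with a minimal Python sketch. This is not the code tMatchGroup runs: the record fields (name, zip, city), the 0.8 similarity threshold, the use of difflib, and the union-find merging of groups across passes are all assumptions chosen for illustration.

```python
from collections import defaultdict
from difflib import SequenceMatcher


def similar(a, b, threshold=0.8):
    """Hypothetical match rule: similarity ratio on lowercased strings."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def find(parent, i):
    """Return the representative index of the group containing i."""
    while parent[i] != i:
        parent[i] = parent[parent[i]]  # path compression
        i = parent[i]
    return i


def union(parent, i, j):
    """Merge the groups of i and j, keeping the smaller index as root."""
    ri, rj = find(parent, i), find(parent, j)
    if ri != rj:
        parent[max(ri, rj)] = min(ri, rj)


def match_pass(records, blocking_key, parent):
    """One pass: partition by blocking key, compare only inside each block."""
    blocks = defaultdict(list)
    for idx, rec in enumerate(records):
        blocks[blocking_key(rec)].append(idx)
    for members in blocks.values():
        for pos, i in enumerate(members):
            for j in members[pos + 1:]:
                if similar(records[i]["name"], records[j]["name"]):
                    union(parent, i, j)


def groups(parent):
    """List the current groups as sorted lists of record indexes."""
    out = defaultdict(list)
    for i in range(len(parent)):
        out[find(parent, i)].append(i)
    return sorted(out.values())


# Illustrative data: record 2 has a mistyped zip code.
records = [
    {"name": "John Smith", "zip": "75001", "city": "Paris"},
    {"name": "Jon Smith", "zip": "75001", "city": "Paris"},
    {"name": "John Smith", "zip": "75010", "city": "Paris"},
    {"name": "Mary Jones", "zip": "69001", "city": "Lyon"},
]
parent = list(range(len(records)))

match_pass(records, lambda r: r["zip"], parent)   # first pass: block on zip
after_zip = groups(parent)
match_pass(records, lambda r: r["city"], parent)  # second pass: block on city
after_city = groups(parent)
```

In this sketch, the first pass cannot compare record 2 with records 0 and 1 because the mistyped zip code places it in a different block. The second pass blocks on city, which overlaps the earlier partition and lets the group absorb record 2.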

When groups are defined, the first processed record of each group becomes the master record of that group. The distance of every other record from the master records is computed, and each record is then assigned to the group of the master record it matches.
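The master-record logic described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the component's implementation: the distance function (based on difflib), the 0.3 distance threshold, and the choice of the nearest master when several qualify are all hypothetical.

```python
from difflib import SequenceMatcher


def distance(a, b):
    """Hypothetical distance: 0.0 for identical strings, up to 1.0."""
    return 1.0 - SequenceMatcher(None, a.lower(), b.lower()).ratio()


def group_by_master(records, max_distance=0.3):
    """Assign each record to the nearest master, or make it a new master."""
    groups = []  # each group is a list whose first element is the master
    for record in records:
        best_group, best_dist = None, None
        for group in groups:
            d = distance(record, group[0])  # compare with the master only
            if d <= max_distance and (best_dist is None or d < best_dist):
                best_group, best_dist = group, d
        if best_group is None:
            groups.append([record])  # first processed record of a new group
        else:
            best_group.append(record)
    return groups


names = ["John Smith", "Jon Smith", "Mary Jones", "John Smyth", "Marie Jones"]
result = group_by_master(names)
```

With this sample input, "John Smith" and "Mary Jones" are processed before any close match exists, so they become the masters of the two groups. Because records are compared only with masters, the outcome depends on processing order.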

For more technologies supported by Talend, see Talend components.