Matching two records - 6.5

Using tMatchGroup with the Simple VSR Matcher and T-Swoosh algorithms

author
Talend Documentation Team
EnrichVersion
6.5
EnrichPlatform
Talend Studio

You can use the tMatchGroup component to detect duplicates and define how to merge similar records to create a master record.

Creating a master record is an iterative process: each new master record can be used to find new duplicates.

You can choose between two algorithms to create master records:

  • Simple VSR Matcher

  • T-Swoosh

The main difference between the two algorithms is that T-Swoosh builds each master record as a new record that does not exist in the list of input records.
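The difference described above, and the iterative nature of master record creation, can be illustrated with a small sketch. This is not the tMatchGroup implementation or its API: the function names, the match rule (shared email or phone) and the merge rule (first non-empty value per field) are all assumptions chosen for illustration. The first function keeps an existing input record as the master of each group; the second merges matching records into a new master and compares that new master again, which is the behavior the text attributes to T-Swoosh.

```python
def matches(a, b):
    """Hypothetical match rule: records share a non-empty email or phone."""
    return any(a[k] is not None and a[k] == b[k] for k in ("email", "phone"))


def merge(a, b):
    """Hypothetical survivorship rule: keep the first non-empty value per field."""
    return {k: a[k] if a[k] is not None else b[k] for k in a}


def group_with_existing_master(records):
    """The master of each group is the first input record placed in it."""
    masters = []
    for rec in records:
        if not any(matches(m, rec) for m in masters):
            masters.append(rec)  # no group matched: rec starts a new group
    return masters


def group_with_merged_master(records):
    """Each match produces a new merged master that is compared again."""
    pending = list(records)
    masters = []
    while pending:
        current = pending.pop(0)
        for i, m in enumerate(masters):
            if matches(m, current):
                # Replace the old master with a merged record and queue it,
                # so the new master can be used to find further duplicates.
                pending.append(merge(masters.pop(i), current))
                break
        else:
            masters.append(current)
    return masters


# Illustrative input: the third record shares only a phone number with the
# second one, and nothing with the first.
records = [
    {"name": "John Smith", "email": "js@example.com", "phone": None, "city": "Paris"},
    {"name": "John Smith", "email": "js@example.com", "phone": "555-0100", "city": None},
    {"name": "J. Smith", "email": None, "phone": "555-0100", "city": None},
]

existing_masters = group_with_existing_master(records)
merged_masters = group_with_merged_master(records)
```

With this input, the first strategy ends with two masters, because the third record never matches the first one directly. The second strategy ends with a single master that combines values from all three records and is not itself one of the input records.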