Matching two records - 6.4

Author: Talend Documentation Team
Version: 6.4
Products: Talend Big Data Platform, Talend Data Fabric, Talend Data Management Platform, Talend Data Services Platform, Talend MDM Platform, Talend Real-Time Big Data Platform
Platform: Talend Studio

You can use the tMatchGroup component to detect duplicates and define how to merge similar records to create a master record.

For more technologies supported by Talend, see Talend components.

Creating a master record is an iterative process: each new master record can be used to find new duplicates.
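The iterative nature of this process can be illustrated with a minimal Python sketch. It is not the tMatchGroup implementation: the match rule (a shared email or phone value), the merge rule (union of values), and the names similar, merge, and build_masters are all assumptions made for illustration. The point it shows is that once two records are merged, the resulting master is compared again and can pick up duplicates that neither original record matched on its own.

```python
def similar(a, b):
    # Hypothetical match rule: records match if they share an email or a phone.
    return bool(a["emails"] & b["emails"]) or bool(a["phones"] & b["phones"])


def merge(a, b):
    # Hypothetical merge rule: the master keeps every value seen so far.
    return {"emails": a["emails"] | b["emails"],
            "phones": a["phones"] | b["phones"]}


def build_masters(records):
    masters = []
    pending = list(records)
    while pending:
        current = pending.pop()
        for i, master in enumerate(masters):
            if similar(current, master):
                # Merge, then put the new master back in the queue so it is
                # compared again and can find further duplicates.
                del masters[i]
                pending.append(merge(current, master))
                break
        else:
            # No existing master matched: current starts its own group.
            masters.append(current)
    return masters


records = [
    {"emails": {"a@x.com"}, "phones": {"111"}},
    {"emails": {"b@x.com"}, "phones": {"111"}},
    {"emails": {"b@x.com"}, "phones": {"222"}},
    {"emails": {"c@x.com"}, "phones": {"333"}},
]
masters = build_masters(records)
```

In this sample data, the first and third records share no value directly, yet they end up in the same group because each matches a master built from the second record.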

You can choose between two different algorithms to create master records:

  • Simple VSR Matcher

  • T-Swoosh

The main difference between the two algorithms is how the master record is produced: T-Swoosh creates, for each master record, a new record that does not exist in the list of input records.
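The following Python sketch contrasts the two ways of obtaining a master record for one group of duplicates. It assumes, as the sentence above implies but does not state, that the alternative to creating a new record is to promote one of the input records. The selection rule (most filled-in fields), the merge rule (longest value per field), and the function names are hypothetical and do not reproduce the Simple VSR Matcher or T-Swoosh algorithms themselves.

```python
group = [
    {"name": "J. Smith", "city": "Paris", "phone": ""},
    {"name": "John Smith", "city": "", "phone": "555-0100"},
]


def master_from_existing(group):
    # Promote one input record: here, the one with the most non-empty fields
    # (ties go to the first record).
    return max(group, key=lambda r: sum(1 for v in r.values() if v))


def master_from_merge(group):
    # Build a new record: for each field, keep the longest value in the group.
    return {field: max((r[field] for r in group), key=len)
            for field in group[0]}


existing = master_from_existing(group)
merged = master_from_merge(group)
```

Here existing is one of the two input records, while merged combines the full name, the city, and the phone number into a record that appears nowhere in the input.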