Matching two records - 6.5

Matching data

author: Talend Documentation Team
EnrichVersion: 6.5
EnrichProdName: Talend Big Data Platform, Talend Data Fabric, Talend Data Management Platform, Talend Data Services Platform, Talend MDM Platform, Talend Real-Time Big Data Platform
task: Data Quality and Preparation > Matching data
EnrichPlatform: Talend Studio

You can use the tMatchGroup component to detect duplicates and define how to merge similar records to create a master record.

For more technologies supported by Talend, see Talend components.

Creating a master record is an iterative process: each new master record can be used to find new duplicates.

You can choose one of two algorithms to create master records:

  • Simple VSR Matcher

  • T-Swoosh

The main difference between the two algorithms is that T-Swoosh creates, for each master record, a new record that does not exist in the list of input records.
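
The contrast can be illustrated with a minimal Python sketch. It is not the tMatchGroup implementation: the function names, the match rule (fuzzy comparison of a name field), and the merge rule (keep the longest value per field) are all hypothetical. The first function assumes, by contrast with the description of T-Swoosh above, that a master is simply one of the input records. The second builds a new merged record and compares it again with the remaining records, which reflects the iterative process described earlier.

```python
from difflib import SequenceMatcher


def similar(a, b, threshold=0.8):
    """Hypothetical match rule: fuzzy-compare the 'name' field."""
    ratio = SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()
    return ratio >= threshold


def merge(a, b):
    """Hypothetical merge rule: keep the longest value of each field."""
    return {k: max(a.get(k) or "", b.get(k) or "", key=len) for k in {**a, **b}}


def group_keep_existing(records):
    """Assumed behaviour: the master of a group is an existing input record."""
    groups = []
    for rec in records:
        for group in groups:
            if similar(group["master"], rec):
                group["members"].append(rec)
                break
        else:
            # No master matched: this record starts a new group as its master.
            groups.append({"master": rec, "members": [rec]})
    return groups


def group_merge_new(records):
    """Merge-based grouping: each match produces a new master record."""
    masters = []
    pending = list(records)
    while pending:
        rec = pending.pop(0)
        for i, master in enumerate(masters):
            if similar(master, rec):
                # The merged record is queued again so that it can be
                # compared with the other masters and find new duplicates.
                pending.insert(0, merge(masters.pop(i), rec))
                break
        else:
            masters.append(rec)
    return masters


# Illustrative data only.
records = [
    {"name": "Jon Smith", "email": "", "phone": "555-0100"},
    {"name": "John Smith", "email": "john@example.com", "phone": ""},
    {"name": "Mary Jones", "email": "mary@example.com", "phone": "555-0199"},
]

existing = group_keep_existing(records)
merged = group_merge_new(records)
print(existing[0]["master"])  # one of the input records
print(merged[0])              # a record that is not in the input list
```

With this sample data, both functions find two groups. In the first, the master of the Smith group is the first input record unchanged. In the second, the master combines the longer name, the email, and the phone number, so it matches none of the input records.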