Matching two records - 6.5

Matching data

Version: 6.5
Language: English (United States)
Product:
  • Talend Big Data Platform
  • Talend Data Fabric
  • Talend Data Management Platform
  • Talend Data Services Platform
  • Talend MDM Platform
  • Talend Real-Time Big Data Platform
Module: Talend Studio

You can use the tMatchGroup component to detect duplicates and define how to merge similar records to create a master record.

For more technologies supported by Talend, see Talend components.

Creating a master record is an iterative process: each new master record can be used to find new duplicates.
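The iterative nature of this process can be illustrated with a small sketch. It is a toy model only, not the tMatchGroup implementation: the record layout, the matching rule (two records match if they share a name or an email) and the function names is_match, merge and build_masters are all assumptions made for the example.

```python
# Toy sketch only: not Talend's implementation. All names are hypothetical.

def is_match(a, b):
    """Two records match if they share at least one name or one email."""
    return bool(a["names"] & b["names"]) or bool(a["emails"] & b["emails"])


def merge(a, b):
    """Build a master record holding the union of both records' values."""
    return {"names": a["names"] | b["names"], "emails": a["emails"] | b["emails"]}


def build_masters(records):
    """Merge matching records until no master matches any other master."""
    masters = []
    pending = list(records)
    while pending:
        current = pending.pop(0)
        for i, master in enumerate(masters):
            if is_match(current, master):
                # The merged record goes back into the queue so that it is
                # compared again: it may now match records it missed before.
                del masters[i]
                pending.append(merge(current, master))
                break
        else:
            # No existing master matched, so this record starts a new group.
            masters.append(current)
    return masters


records = [
    {"names": {"Ann Lee"}, "emails": {"ann@example.com"}},
    {"names": {"A. Lee"}, "emails": {"ann@example.com"}},
    {"names": {"A. Lee"}, "emails": set()},
]
masters = build_masters(records)
```

In this sample, the first and third records do not match each other directly. They end up under the same master only because the master built from the first two records carries the name "A. Lee", which the third record shares.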

You can choose between two different algorithms to create master records:

  • Simple VSR Matcher

  • T-Swoosh

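The main difference between the two algorithms lies in how the master record is obtained: T-Swoosh creates, for each group of duplicates, a new master record that does not exist in the list of input records, whereas with Simple VSR Matcher the master record is one of the input records.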
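As a rough illustration of this difference, the sketch below contrasts keeping one of the input records as the master with building a new record from the group. It does not reproduce either Talend algorithm: the function names pick_existing_master and build_new_master, the sample fields, and the "keep the longest value" merge rule are assumptions chosen for the example.

```python
# Toy sketch only: not Talend's implementation. All names are hypothetical.

def pick_existing_master(group):
    """Keep one of the input records as the master (here simply the first)."""
    return group[0]


def build_new_master(group):
    """Create a new master record; here each field keeps its longest value."""
    fields = group[0].keys()
    return {f: max((r[f] for r in group), key=len) for f in fields}


group = [
    {"name": "Ann Lee", "city": "NYC"},
    {"name": "A. Lee", "city": "New York"},
]

existing = pick_existing_master(group)  # one of the input records
merged = build_new_master(group)        # a record absent from the input
```

Here merged combines the name from the first record with the city from the second, so it is equal to neither input record, while existing is the first input record itself.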