Using deduplication components - 6.2

Talend MDM Platform Studio User Guide

Some data quality components let you analyze database columns and either group duplicate records or match values, using matching rules or comparison algorithms. Examples include tMatchGroup, tMatchGroupHadoop, tRecordMatching, tGenKey, tSurviveField and tRuleSurvivorship.
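
To make the idea of grouping duplicates with a comparison algorithm more concrete, the following is a minimal sketch in plain Python. It is not Talend code and does not reproduce how any of the components above work internally. The function names, the blocking key (first three letters of the last name), the similarity measure (difflib's SequenceMatcher ratio), the 0.85 threshold and the sample records are all assumptions chosen for illustration.

```python
from collections import defaultdict
from difflib import SequenceMatcher


def blocking_key(record):
    # Hypothetical key: first three letters of the last name, upper-cased.
    # Only records that share a key are compared with each other.
    return record["last_name"][:3].upper()


def full_name(record):
    return record["first_name"] + " " + record["last_name"]


def similarity(a, b):
    # Ratio in [0, 1]; 1.0 means the two strings are identical.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def group_duplicates(records, threshold=0.85):
    # Step 1: partition the records into blocks by key.
    blocks = defaultdict(list)
    for record in records:
        blocks[blocking_key(record)].append(record)

    # Step 2: inside each block, attach a record to the first group whose
    # first member (used here as the reference record) is similar enough;
    # otherwise start a new group.
    groups = []
    for block in blocks.values():
        block_groups = []
        for record in block:
            for group in block_groups:
                if similarity(full_name(record), full_name(group[0])) >= threshold:
                    group.append(record)
                    break
            else:
                block_groups.append([record])
        groups.extend(block_groups)
    return groups


# Hypothetical sample data.
records = [
    {"id": 1, "first_name": "John", "last_name": "Smith"},
    {"id": 2, "first_name": "Jon", "last_name": "Smith"},
    {"id": 3, "first_name": "Mary", "last_name": "Jones"},
    {"id": 4, "first_name": "John", "last_name": "Smyth"},
]

for group in group_duplicates(records):
    print([r["id"] for r in group])
```

In this sketch, records 1 and 2 end up in the same group, while record 4 is never compared with them because its blocking key differs. This illustrates a general trade-off of blocking: it reduces the number of comparisons, but a poorly chosen key can hide real duplicates.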

For further information about managing a survivorship rule package, see Managing a survivorship rule package.

For further information about the deduplication components and example Jobs that use them, see the data quality chapter in the Talend Components Reference Guide and Cleansing delimited files (csv files).

Note

The data quality demo project also contains ready-to-use Jobs that may use deduplication components. For further information, see Importing a data quality demo project.