Using deduplication components - Cloud - 7.3

Talend Studio User Guide

Available in: Big Data Platform, Cloud API Services Platform, Cloud Big Data Platform, Cloud Data Fabric, Cloud Data Management Platform, Data Fabric, Data Management Platform, Data Services Platform, MDM Platform, Real-Time Big Data Platform

Some data quality components enable you to analyze columns in databases and to group duplicate records or match similar values, using matching rules or comparison algorithms. Examples of such components are tMatchGroup, tRecordMatching, tGenKey, and tRuleSurvivorship.
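The general idea behind this kind of deduplication can be sketched outside Talend Studio. The Python example below is only a conceptual illustration and does not reproduce the behavior or configuration of any Talend component. It assumes a hypothetical record layout (first_name, last_name, zip, updated), a hypothetical blocking key built from the last name and postal code, the standard library's difflib.SequenceMatcher as a stand-in comparison algorithm, an arbitrary 0.85 match threshold, and a "most recently updated record wins" survivorship rule.

```python
from collections import defaultdict
from difflib import SequenceMatcher


def blocking_key(record):
    # Hypothetical key: first three letters of the last name plus postal code.
    # Only records that share a key are compared with each other.
    return record["last_name"][:3].upper() + record["zip"]


def full_name(record):
    return f'{record["first_name"]} {record["last_name"]}'


def similarity(a, b):
    # Stand-in comparison algorithm: a ratio between 0.0 and 1.0.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def group_duplicates(records, threshold=0.85):
    # Step 1: partition the records into blocks by key.
    blocks = defaultdict(list)
    for record in records:
        blocks[blocking_key(record)].append(record)

    # Step 2: inside each block, attach a record to the first group whose
    # first member is similar enough; otherwise start a new group.
    groups = []
    for block in blocks.values():
        block_groups = []
        for record in block:
            for group in block_groups:
                if similarity(full_name(record), full_name(group[0])) >= threshold:
                    group.append(record)
                    break
            else:
                block_groups.append([record])
        groups.extend(block_groups)
    return groups


def survivor(group):
    # Assumed survivorship rule: keep the most recently updated record.
    # ISO-formatted dates compare correctly as strings.
    return max(group, key=lambda record: record["updated"])


# Made-up sample data for illustration only.
records = [
    {"id": 1, "first_name": "John", "last_name": "Smith", "zip": "75001", "updated": "2020-01-10"},
    {"id": 2, "first_name": "Jon", "last_name": "Smith", "zip": "75001", "updated": "2021-03-05"},
    {"id": 3, "first_name": "Mary", "last_name": "Jones", "zip": "10001", "updated": "2019-07-22"},
]

groups = group_duplicates(records)
survivors = [survivor(group) for group in groups]
```

Comparing records only within a block keeps the number of pairwise comparisons manageable on large data sets, at the cost of missing duplicates whose keys differ. The threshold and the survivorship rule are the parts most likely to need tuning for real data.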

For further information about managing a survivorship rule package, see Managing a survivorship rule package.

For further information about the deduplication components and example Jobs that use them, see the data quality chapter in the Talend Components Reference Guide and Cleansing delimited files (csv files).

The data quality demo project also includes ready-to-use Jobs that may use deduplication components. For further information, see Importing a data quality demo project.