Surviving master records - Cloud - 8.0

Data matching with Talend tools

Version: Cloud 8.0
Language: English
Product: Talend Big Data Platform, Talend Data Fabric, Talend Data Management Platform, Talend Data Services Platform, Talend MDM Platform, Talend Real-Time Big Data Platform
Module: Talend Studio
Last publication date: 2024-02-06
You can use either the tRuleSurvivorship component or Talend Data Stewardship to produce surviving master records.

Merging records using tRuleSurvivorship

Once you have identified duplicates and possible duplicates and grouped them together, you can use the tRuleSurvivorship component to create a single representation for each group of duplicates using the best-of-breed data. This representation is called a survivor.
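
The idea of building a survivor from best-of-breed data can be illustrated with a minimal Python sketch. This is not how tRuleSurvivorship is implemented or configured: the record fields, the group identifier gid, and the two rules (longest value, most recent date) are all hypothetical and only show the general principle of choosing one value per column within each group of duplicates.

```python
from collections import defaultdict

# Hypothetical input: each record carries a group id ("gid") assigned by an
# upstream matching step. All field names here are illustrative assumptions.
records = [
    {"gid": 1, "name": "Jon Smith", "email": "", "updated": "2023-01-10"},
    {"gid": 1, "name": "Jonathan Smith", "email": "jsmith@example.com", "updated": "2022-06-01"},
    {"gid": 2, "name": "Ana Lopez", "email": "ana@example.com", "updated": "2023-03-05"},
]


def longest(values):
    """Keep the longest value, so empty strings lose to filled ones."""
    return max(values, key=len)


def most_recent(values):
    """Keep the latest ISO-8601 date string (ISO dates sort lexicographically)."""
    return max(values)


# One illustrative survivorship rule per column (assumed, not Talend's rules).
RULES = {"name": longest, "email": longest, "updated": most_recent}


def survive(records, rules):
    """Return one survivor per group, built column by column from the rules."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec["gid"]].append(rec)

    survivors = []
    for gid, members in groups.items():
        survivor = {"gid": gid}
        for column, rule in rules.items():
            # Apply the column's rule to the values of every group member.
            survivor[column] = rule([m[column] for m in members])
        survivors.append(survivor)
    return survivors


survivors = survive(records, RULES)
for row in survivors:
    print(row)
```

In this sketch the survivor for group 1 combines the name and email from one record with the date from the other, which is what distinguishes a best-of-breed survivor from simply picking one of the source records.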

For an example of how to create a clean data set from the suspect pairs labeled by tMatchPredict and the unique rows computed by tMatchPairing, see Matching with machine learning.

Using Talend Data Stewardship for clerical review and merging records

You can add merging campaigns in Talend Data Stewardship to review and modify survivorship rules, create master records and merge data.

For further information on merging campaigns in Talend Data Stewardship, see Talend Data Stewardship Examples.

In Talend Data Stewardship, data stewards are business users in charge of resolving data stewardship tasks:
  • Classifying data by assigning a label chosen among a predefined list of arbitration choices.
  • Merging several potential duplicate records into one single record.

    Merging tasks allow authorized data stewards to merge several potential duplicate source records into a single record, called the golden record. The golden record is the outcome of a merging task.

    For further information on merging tasks in Talend Data Stewardship, see Talend Data Stewardship Examples.

    Source records can come from the same source (database deduplication) or from different sources (database reconciliation).