Reconciling data coming from different sources - Cloud

Talend Cloud Data Stewardship Getting Started Guide

Version: Cloud
Language: English
Product: Talend Cloud
Module: Talend Data Stewardship
Content: Data Governance > Managing campaigns; Data Governance > Managing data models; Data Quality and Preparation > Deduplicating data; Data Quality and Preparation > Handling tasks
Last publication date: 2024-03-05

One of the solutions provided by Talend Cloud Data Stewardship is matching, cleansing and mastering data with a Merging campaign.

This use case describes how you can match and cleanse data coming from different sources in order to build master records.

Let's suppose that you are facing data quality issues and anomalies in your customer data. You have found duplicate lead information caused by a lack of synchronization between the different CRMs used in your enterprise. A Merging campaign enables you to resolve the duplicates by keeping (surviving) only the appropriate data.

However, you must consider two aspects:
  • How do you identify the match groups, that is, the sets of records that are potentially duplicates of one another? This question is resolved by using a Talend Job in Talend Studio.
  • How do you pick the best attribute values from the data sources and present the most accurate and reliable master records for consumption by users and systems? This question is resolved by using a Merging campaign in Talend Data Stewardship.
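
To make the two questions concrete, here is a minimal Python sketch of the general idea, not of how the Talend Job or the Merging campaign is implemented. The record fields, the sample values, the email-based match key and the "most recent non-empty value wins" rule are all assumptions chosen for illustration.

```python
from collections import defaultdict

# Hypothetical lead records from two CRMs; field names and values are illustrative only.
records = [
    {"source": "crm_a", "email": "Jane.Doe@example.com ", "name": "Jane Doe",
     "phone": "", "updated": "2023-05-01"},
    {"source": "crm_b", "email": "jane.doe@example.com", "name": "J. Doe",
     "phone": "555-0100", "updated": "2023-09-12"},
    {"source": "crm_a", "email": "sam@example.com", "name": "Sam Lee",
     "phone": "555-0199", "updated": "2023-01-20"},
]


def match_key(record):
    """Normalize the email so trivially different spellings share one key (assumed rule)."""
    return record["email"].strip().lower()


def build_match_groups(rows):
    """Question 1: put records with the same match key into one match group."""
    groups = defaultdict(list)
    for row in rows:
        groups[match_key(row)].append(row)
    return dict(groups)


def survive(group):
    """Question 2: per attribute, keep the most recently updated non-empty value."""
    # ISO dates (YYYY-MM-DD) sort correctly as plain strings.
    newest_first = sorted(group, key=lambda r: r["updated"], reverse=True)
    master = {}
    for field in ("email", "name", "phone"):
        master[field] = next(
            (r[field].strip() for r in newest_first if r[field].strip()), ""
        )
    return master


groups = build_match_groups(records)
masters = {key: survive(group) for key, group in groups.items()}

for key, master in masters.items():
    print(key, "->", master)
```

In this sketch, the two Jane Doe records fall into one match group and produce a single master record, while the Sam Lee record stays on its own. In the use case itself, the match groups come from the Talend Job and the surviving values are chosen in the Merging campaign.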

To replicate the example with the same client data, the campaign owner must first download the input file and the Talend Job used in this example. Both can be used to load tasks into the campaign once it is created.

Retrieve the tds_gettingstarted_source_files.zip file and configure the connection to Talend Cloud Data Stewardship in Talend Studio.