Handling merging tasks created by integrated matching - 8.0

Talend Data Stewardship Examples

Version
8.0
Language
English
Product
Talend Big Data
Talend Big Data Platform
Talend Data Fabric
Talend Data Integration
Talend Data Management Platform
Talend Data Services Platform
Talend MDM Platform
Talend Real-Time Big Data Platform
Module
Talend Data Stewardship
Content
Data Governance > Assigning tasks
Data Governance > Managing campaigns
Data Governance > Managing data models
Data Quality and Preparation > Handling tasks
Last publication date
2024-04-15

Data stewards with access rights to the Merging campaign created automatically in Talend Data Stewardship need to access the merging tasks and manually merge duplicate records into one master record. Resolving the tasks sends the merged data back to the Staging Area in Talend MDM Web UI.

In a Merging campaign, you can modify values only in the master fields. Values in the source fields cannot be modified.

Merging data values and validating your modifications transition the task to the Resolved state. You cannot validate a task as long as it contains at least one invalid value.
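
The two constraints above (only master fields are editable, and a task with an invalid value cannot be validated) can be pictured with a small Python model. This is only an illustrative sketch, not Talend Data Stewardship code: the `MergingTask` class, its fields, the state names used as strings, and the `is_valid` callback are all hypothetical names introduced here.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class MergingTask:
    """Hypothetical model of a merging task (not a Talend API)."""
    sources: List[Dict[str, str]]                          # source records, treated as read-only
    master: Dict[str, str] = field(default_factory=dict)   # master record, editable
    state: str = "New"

    def set_master_value(self, attribute: str, value: str) -> None:
        # Only the master record is changed; source records are never touched.
        self.master[attribute] = value

    def validate(self, is_valid: Callable[[str, str], bool]) -> bool:
        # Refuse validation while at least one master value is invalid.
        if not all(is_valid(attr, val) for attr, val in self.master.items()):
            return False
        self.state = "Resolved"
        return True
```

In this sketch, a call to `validate` leaves the task in its current state until every master value passes the check, which mirrors the behavior described above.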

In this example, duplicate records come from Talend MDM as a result of integrated matching processes used to validate customer records against a match rule.

Talend Data Stewardship initially determines which attributes to use to create the master record according to the survivorship rules created in Talend MDM and deployed on the server with the Merging campaign. However, you may need to manually change the survivorship rule for individual record attributes, or enter completely new values, to obtain the most accurate and reliable master records.
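
As a rough illustration of how default rules, per-attribute rule changes, and manually entered values could combine, consider the following Python sketch. It is an assumption-laden model, not the product's implementation: `build_master`, `default_rules`, `rule_overrides`, and `manual_values` are hypothetical names, and a rule is modeled simply as a function that picks one value from a list of source values.

```python
from typing import Callable, Dict, List, Optional

Record = Dict[str, str]
Rule = Callable[[List[str]], str]  # picks one value among the source values


def build_master(
    sources: List[Record],
    default_rules: Dict[str, Rule],
    rule_overrides: Optional[Dict[str, Rule]] = None,
    manual_values: Optional[Dict[str, str]] = None,
) -> Record:
    """Build a master record attribute by attribute (hypothetical helper)."""
    rule_overrides = rule_overrides or {}
    manual_values = manual_values or {}
    master: Record = {}
    for attribute, default_rule in default_rules.items():
        if attribute in manual_values:
            # A value typed in by the data steward takes precedence over any rule.
            master[attribute] = manual_values[attribute]
            continue
        # Otherwise use the steward's rule for this attribute, else the default.
        rule = rule_overrides.get(attribute, default_rule)
        master[attribute] = rule([source[attribute] for source in sources])
    return master
```

The precedence chosen here (manual value, then overriding rule, then default rule) is a design assumption made for the sketch.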

Before you begin

  • An MDM administrator with the campaign owner role has validated the customer records in Talend MDM Web UI, which deploys the duplicates that need human intervention to Talend Data Stewardship.
  • A campaign owner has granted you access to the Merging campaign.
  • A campaign owner has assigned you tasks in the campaign. Otherwise, you can assign the tasks to yourself.

Procedure

  1. Log in as a data steward.
  2. On the Tasks page, click the search icon in the top-right corner and enter tmdm to list only the Merging campaigns created by integrated matching for which you have access rights.
  3. Click the campaign name to open the list of the tasks assigned to you.
  4. Use the quality bar on top of each of the columns to filter the data on which you want to work in the Chart or Pattern views in the right-hand panel.
  5. Click the down arrow in the top-left corner to expand all tasks in the list, or click the down arrow of a specific task to expand it.
  6. Set survivorship rules to select attributes from customer records and use them to build the master records. Several approaches are possible:
    • Set a survivorship rule manually for one or several attributes of a record: point to an attribute in the master record of a task and, from the icons that appear, select the survivorship rule you want to apply.

      • Selects the first valid attribute value among the duplicates. "First" is defined by the order of the records when the task is created.

      • Selects the most common attribute value among the duplicates.

      • Selects the most recent attribute value among the duplicates.

      • Selects the most trusted attribute value among the duplicates coming from different sources.

        Icons are grayed out when a rule is not applicable to the selected attribute.

    • Set a survivorship rule manually for one attribute of multiple records.

      1. Click a column heading and in the right-hand panel browse to the Survivorship section.
      2. Click the button and from the Survivorship rule list, select the rule you want to apply to all values in the selected column.
      3. Click Submit to apply the rule. In this example, this selects the most common name values and adds them to the master records of the tasks.
    • Select the value of a given source attribute to be the value for the master record: point to a source attribute and click the up arrow to set the selected value in the master record.
  7. Repeat the previous step to merge records and create master records for all the tasks assigned to you.
    If a given column has values that need to be fixed, you can transform them in bulk by using the functions listed in the right-hand panel.
  8. Click the lock icon next to the data record you modified to mark the task as ready to be validated.
    The record is marked with a green background and the lock icon automatically moves to the next record. You can modify a record again after marking it as ready to be validated, but this puts the task back in its initial state, with a dark-grey background. You then need to click the lock icon again to mark the task as ready for validation.

    If the lock icon has a red background, you must first correct the invalid value in the task before you can mark it as ready to be validated.

  9. Click Validate in the top-right corner of the page to validate the modifications you have made to the records.
    Master records are created, and the validated records are removed from the list and marked as resolved.
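
The four survivorship rules described in step 6 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the logic Talend Data Stewardship actually runs: the function names are hypothetical, "most recent" is assumed to rely on a per-record timestamp field (`modified`), and "most trusted" is assumed to rely on a ranked list of source names (`source_rank`, most trusted first). Empty values are skipped in every rule, which is also an assumption.

```python
from collections import Counter
from typing import Callable, Dict, List, Optional

Record = Dict[str, str]


def _filled(records: List[Record], attribute: str) -> List[Record]:
    # Keep only the records that actually carry a value for the attribute.
    return [r for r in records if r.get(attribute) not in (None, "")]


def first_valid(records: List[Record], attribute: str,
                is_valid: Callable[[Optional[str]], bool]) -> Optional[str]:
    # "First" follows the order of the records in the task.
    for record in records:
        value = record.get(attribute)
        if is_valid(value):
            return value
    return None


def most_common(records: List[Record], attribute: str) -> Optional[str]:
    values = [r[attribute] for r in _filled(records, attribute)]
    return Counter(values).most_common(1)[0][0] if values else None


def most_recent(records: List[Record], attribute: str,
                timestamp_key: str = "modified") -> Optional[str]:
    # Assumes ISO-formatted timestamps, which sort correctly as strings.
    candidates = _filled(records, attribute)
    if not candidates:
        return None
    return max(candidates, key=lambda r: r[timestamp_key])[attribute]


def most_trusted(records: List[Record], attribute: str,
                 source_rank: List[str], source_key: str = "source") -> Optional[str]:
    # A lower index in source_rank means a more trusted source.
    candidates = _filled(records, attribute)
    if not candidates:
        return None
    return min(candidates, key=lambda r: source_rank.index(r[source_key]))[attribute]
```

Each function returns `None` when no source record provides a usable value, which loosely corresponds to a rule that is not applicable to the selected attribute.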

Results

Approved tasks are transitioned to the Resolved state in the workflow and marked as resolved. Rejected tasks are transitioned back to the initial step in the workflow and marked as new.

What to do next

A data steward in Talend MDM Web UI needs to run the Staging validation again to take into account the change in the record status.