Setting rules and values for master records

Talend Data Stewardship Getting Started Guide

author
Talend Documentation Team
EnrichVersion
6.4
EnrichProdName
Talend Data Fabric
Talend Big Data Platform
Talend Real-Time Big Data Platform
Talend Big Data
Talend MDM Platform
Talend Data Integration
Talend Data Services Platform
Talend Data Management Platform
Talend ESB
task
Data Quality and Preparation > Reconciliating data
Data Governance > Managing campaigns
Data Quality and Preparation > Deduplicating data
EnrichPlatform
Talend Data Stewardship

In this example, duplicate client records come from different sources. Talend Data Stewardship initially determines which attributes of the matched records to use to create the master record, according to the survivorship rules defined when the campaign was created.

Data stewards can then review the tasks and either change the survivorship rule for individual record attributes or enter completely new values, in order to obtain the most accurate and reliable master records.

Before you begin

  • A campaign owner has created the campaign and granted you access to it.

    For further information, see Defining roles in the Merging campaign.

  • A campaign owner has assigned you tasks in the campaign.

  • You have accessed Talend Data Stewardship as a data steward.

Procedure

  1. On the MY TASKS page, click the campaign name, RECONCILING CLIENT DATA in this example, to open the list of tasks assigned to you.
    The quality bar at the top of the list uses colors to give you a clear view of the quality of the data in each column. Pointing to a color displays details about the data values in that column.
  2. To filter the data you want to work on, click a color in the quality bar at the top of a column to list only the tasks that match that color:
    Option   Description
    Green    Represents valid data that matches the column type.
    White    Represents empty fields. However, an empty value for a mandatory field is marked as red, not white.
    Red      Represents invalid data that does not match the column type or the parameter set in the data model.
  3. Click the down arrow in the top-left corner of the task list to expand all the tasks, or click the down arrow of a specific task to expand only that task.
  4. Set survivorship rules to select attributes from customer records and use them to build the master records. Several approaches are possible:
    • Set a survivorship rule manually for one or several attributes of a record: point to an attribute in the master record of a task and, from the icons that are displayed, select the survivorship rule you want to apply.

      • [icon]: selects the first valid attribute value among the duplicates. "First" is defined by the order of the records when the task is created.

      • [icon]: selects the most common attribute value among the duplicates.

      • [icon]: selects the most recent attribute value among the duplicates.

      • [icon]: selects the most trusted attribute value among the duplicates.

        Survivorship icons are grayed out when the survivorship rule is not applicable to the selected record.

    • Set a survivorship rule manually for one attribute of multiple records.

      1. Click a column heading, Last_Name for example, and in the right-hand panel browse to the Survivorship section.
      2. Click the button and from the Survivorship rule list, select Most common as the survivorship rule you want to apply to the name attribute in all the tasks in the list.
      3. Click Submit to select the most common name values and add them to the master records of all the tasks.
    • Select the value of a given source attribute to be the value for the master record: point to a source attribute and click the up arrow to set the selected value in the master record.
  5. To use a value that is not present in any of the sources, double-click the value in the master record and enter a value of your choice.
  6. If the lock icon has a red background, the task contains an invalid value: correct it before you mark the task as ready to be validated.
  7. Repeat the above steps to merge records and create master records for all the tasks assigned to you.
  8. Click the lock icon next to the data record you modified to mark the task as ready to be validated.
    The first field is marked with a green background, and the percentage of your tasks that are complete is calculated and displayed in the top-right corner.

    You can still modify records that are ready to be validated, but doing so puts the task back in its initial state, with a dark-grey background. You must click the lock icon again to mark the task as ready for validation.

  9. Click VALIDATE CHOICES in the top-right corner to validate your changes and remove the validated tasks from your list.
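
The color coding used by the quality bar in step 2 can be pictured with a short Python sketch. It is only an illustration of the green/white/red logic described above, not how Talend Data Stewardship is implemented: the names Quality, classify and is_integer are hypothetical, and the column type is assumed to be an integer for the sake of the example.

```python
from enum import Enum


class Quality(Enum):
    """Colors of the quality bar, as described in step 2."""
    VALID = "green"
    EMPTY = "white"
    INVALID = "red"


def is_integer(text):
    """Hypothetical column-type check: the value must parse as an integer."""
    try:
        int(text)
        return True
    except (TypeError, ValueError):
        return False


def classify(value, is_valid, mandatory=False):
    """Return the quality-bar color for a single field value."""
    if value is None or str(value).strip() == "":
        # An empty mandatory field counts as invalid (red), not empty (white).
        return Quality.INVALID if mandatory else Quality.EMPTY
    return Quality.VALID if is_valid(value) else Quality.INVALID


# Example: classify the values of one hypothetical integer column.
column = ["42", "", "abc"]
colors = [classify(v, is_integer).value for v in column]
```

Clicking a color in the quality bar then corresponds to keeping only the tasks whose value in that column falls into the selected class.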

Results

Master records are created, and the validated records are moved to the task list of the campaign participant who has been granted the ACCOUNT VALIDATOR role in this example.
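
To make the four survivorship rules from step 4 more concrete, the following Python sketch shows one way each rule could pick a value among duplicate records. It is a minimal illustration under stated assumptions, not the product's actual algorithm: the record layout (source, updated, values), the sample data, the function names and the trust order are all hypothetical, and a value is treated as valid simply when it is non-empty.

```python
from collections import Counter
from datetime import date

# Hypothetical duplicate records for one task, in the order they were created.
records = [
    {"source": "CRM", "updated": date(2016, 3, 1),
     "values": {"Last_Name": "", "City": "Paris"}},
    {"source": "ERP", "updated": date(2017, 5, 9),
     "values": {"Last_Name": "Smith", "City": "Lyon"}},
    {"source": "WEB", "updated": date(2015, 1, 20),
     "values": {"Last_Name": "Smyth", "City": "Paris"}},
]


def _with_value(records, attr):
    """Keep only the records that have a non-empty value for attr."""
    return [r for r in records if r["values"].get(attr)]


def first_valid(records, attr):
    """First non-empty value, following the order of the records."""
    candidates = _with_value(records, attr)
    return candidates[0]["values"][attr] if candidates else None


def most_common(records, attr):
    """Value that occurs most often among the duplicates."""
    values = [r["values"][attr] for r in _with_value(records, attr)]
    return Counter(values).most_common(1)[0][0] if values else None


def most_recent(records, attr):
    """Value taken from the most recently updated record."""
    candidates = _with_value(records, attr)
    if not candidates:
        return None
    return max(candidates, key=lambda r: r["updated"])["values"][attr]


def most_trusted(records, attr, trust_order):
    """Value from the most trusted source; trust_order lists every source,
    most trusted first (an assumption of this sketch)."""
    candidates = _with_value(records, attr)
    if not candidates:
        return None
    best = min(candidates, key=lambda r: trust_order.index(r["source"]))
    return best["values"][attr]


# Build a master record by choosing one rule per attribute.
master = {
    "Last_Name": first_valid(records, "Last_Name"),
    "City": most_common(records, "City"),
}
```

Choosing a different rule for an attribute, as a data steward does in the task list, amounts to swapping the function used for that key; typing a value that exists in no source corresponds to assigning it directly in master.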