In this example, duplicate client records come from different sources. Talend Data Stewardship initially determines which attributes of the matched records to use to create the master record, according to the survivorship rules defined when the campaign was created.
Data stewards can then review the tasks and manually modify survivorship rules per record attribute or enter completely new values to reach the most accurate and reliable master records.
Before you begin
A campaign owner has created the campaign and granted you access to it.
A campaign owner has assigned you tasks in the campaign.
- Log in to Talend Data Stewardship as a data steward.
On the TASKS page, click the campaign name, Reconciling client data in this example, to open a list of the tasks assigned to you.
The quality bar at the top of the list uses colors to give you a clear view of the data quality in each column. Pointing to a color displays details about the data values in the corresponding column.
Click a color in the quality bar to filter the data on which you want to work and list the tasks which match the color indication:
- Green: represents valid data which matches the column type.
- White: represents empty fields. However, an empty value for a mandatory field is marked as red, not white.
- Red: represents invalid data which does not match the column type or the parameter set in the data model.
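The color coding above can be read as a three-way classification of each cell. The following is a minimal sketch in Python, not Talend Data Stewardship code: the function name `classify_cell`, the `mandatory` flag, and the use of a Python type to stand in for the column type are all assumptions made for illustration. It does not model the additional parameters that can be set in the data model.

```python
from typing import Any, Optional


def classify_cell(value: Optional[Any], column_type: type, mandatory: bool = False) -> str:
    """Return the quality-bar color for one cell (illustrative only)."""
    # Empty cells are white, unless the field is mandatory, in which case they are red.
    if value is None or value == "":
        return "red" if mandatory else "white"
    # A value that cannot be interpreted as the column type is invalid (red).
    try:
        column_type(value)
    except (TypeError, ValueError):
        return "red"
    # Anything else is treated as valid (green).
    return "green"


print(classify_cell("42", int))                # green
print(classify_cell("", str))                  # white
print(classify_cell("", str, mandatory=True))  # red
print(classify_cell("abc", int))               # red
```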
- Click the down arrow on the top-left corner of the task list to expand all the tasks, or click the down arrow of a specific task to expand it.
Set survivorship rules to select attributes from customer records and use them to build the master records. Several approaches are possible:
Set a survivorship rule manually for one attribute of multiple records.
- Click a column heading, Last_Name for example, and in the right-hand panel browse to the Survivorship section.
- Expand the Survivorship rule list and select Most common as the survivorship rule you want to apply to the name attribute in all the tasks in the list.
- If you want to apply the rule to all name values including null ones, clear the Avoid null values check box. Otherwise, leave it selected.
- Click Submit to select the most common name values and add them to the master records of all the tasks.
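To make the effect of the Avoid null values check box concrete, here is a minimal Python sketch of a "most common" selection over the values of one attribute. It is an illustration under stated assumptions, not the product's implementation: the function name `most_common`, the `avoid_nulls` parameter, and the tie-breaking behavior (the first value encountered wins) are hypothetical.

```python
from collections import Counter
from typing import Optional, Sequence


def most_common(values: Sequence[Optional[str]], avoid_nulls: bool = True) -> Optional[str]:
    """Pick the most frequent value among duplicate records (illustrative only)."""
    # With avoid_nulls set, null values never compete for the master record.
    candidates = [v for v in values if v is not None] if avoid_nulls else list(values)
    if not candidates:
        return None
    # Counter.most_common(1) returns [(value, count)]; ties go to the first value seen.
    return Counter(candidates).most_common(1)[0][0]


last_names = ["Smith", None, "Smith", "Smyth", None, None]
print(most_common(last_names))                     # Smith
print(most_common(last_names, avoid_nulls=False))  # None
```

In the second call, the null value wins because it occurs three times, which is the situation the check box is meant to prevent by default.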
Set a survivorship rule manually for all attributes of one or multiple golden records.
- Select the tasks for which to set the rule, and under TASK in the right-hand panel click Apply survivorship rule.
- From the Selection list, click
You can apply the rule to all tasks or only to the filtered tasks if you have defined a filter on the list.
- From the Rule list, select the rule to apply, Most trusted for example, to the group of
If you defined the sources of the duplicate data in the Merging campaign, the source names are included in the list and can be selected as the survivorship rule to apply to the column values.
- If you want to apply the rule to all values including null ones, clear the Avoid null values check box. Otherwise, leave it selected.
- Click SUBMIT to add the name values with the highest score to the selected golden records.
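The idea of applying one rule to every attribute of a golden record can be sketched as follows. This is a hedged illustration only: how Talend Data Stewardship scores trust is not described here, so the sketch assumes a simple ordered list of source names (`trust_order`), and the source names `crm` and `web_form`, the record layout, and the function `most_trusted` are hypothetical.

```python
from typing import Dict, List, Optional

Record = Dict[str, Optional[str]]


def most_trusted(records: List[Record], attribute: str,
                 trust_order: List[str], avoid_nulls: bool = True) -> Optional[str]:
    """Take the attribute value from the most trusted source (illustrative only)."""
    def rank(record: Record) -> int:
        # Lower rank means more trusted; unknown sources are ranked last.
        source = record.get("source")
        return trust_order.index(source) if source in trust_order else len(trust_order)

    for record in sorted(records, key=rank):
        value = record.get(attribute)
        # Skip null values when avoid_nulls is set, and fall back to the next source.
        if value is not None or not avoid_nulls:
            return value
    return None


records = [
    {"source": "web_form", "Last_Name": "Smyth", "City": "Paris"},
    {"source": "crm", "Last_Name": "Smith", "City": None},
]
trust_order = ["crm", "web_form"]

# Apply the same rule to every attribute of the golden record.
master = {attr: most_trusted(records, attr, trust_order) for attr in ("Last_Name", "City")}
print(master)  # {'Last_Name': 'Smith', 'City': 'Paris'}
```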
Set a survivorship rule manually for one or several attributes of a record: point to an attribute in the master record of a task and from the icons which display, select the survivorship rule you want to apply.
- First valid: selects the first valid attribute value among the duplicates. "First" is defined by the order of the records when the task is created.
- Most common: selects the most common attribute value among the duplicates.
- Most recent: selects the most recent attribute value among the duplicates.
- Most trusted: selects the most trusted attribute value among the duplicates.
Survivorship icons are grayed out when the survivorship rule is not applicable to the selected record.
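The "first valid" and "most recent" selections described above can be sketched in Python as follows. This is a rough illustration, not the product's logic: the function names, the `is_valid` callback, and the assumption that each source value carries a modification date are all hypothetical.

```python
from datetime import date
from typing import Callable, List, Optional, Sequence, Tuple


def first_valid(values: Sequence[Optional[str]],
                is_valid: Callable[[str], bool]) -> Optional[str]:
    """Return the first non-null value that passes validation, in record order."""
    for value in values:
        if value is not None and is_valid(value):
            return value
    return None


def most_recent(dated_values: List[Tuple[date, Optional[str]]]) -> Optional[str]:
    """Return the non-null value with the latest date (assumes each value is dated)."""
    non_null = [(d, v) for d, v in dated_values if v is not None]
    if not non_null:
        return None
    return max(non_null, key=lambda pair: pair[0])[1]


emails = ["not-an-email", "a.smith@example.com", "asmith@example.org"]
print(first_valid(emails, lambda v: "@" in v))  # a.smith@example.com

cities = [(date(2020, 1, 5), "Paris"), (date(2022, 3, 1), "Lyon"), (date(2023, 1, 1), None)]
print(most_recent(cities))  # Lyon
```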
- Select the value of a given source attribute to be the value for the master record: point to a source attribute and click the up arrow to set the selected value in the master record.
- Double-click the value in the master record and set a value of your choice which is not present in any of the sources.
- If the lock icon has a red background color, you must correct the invalid value in the task before you can mark it as ready to be validated.
- Repeat the above step to merge records and create master records for all the tasks assigned to you.
Click the icon next to the data record you modified to mark the task as ready to be validated.
The first field is marked with a green background, and the percentage of your tasks that are complete is calculated and displayed in the top right corner.
You can modify records that are ready to be validated again, but this puts the task back to its initial state with a dark-grey background color. You then need to click the lock icon again to mark the task as ready for validation.
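The task states described in these steps can be modeled as a small state machine. The sketch below is an assumption-laden illustration, not how Talend Data Stewardship is implemented: the `Task` class, the `State` enum, and the completion formula (ready tasks divided by all tasks) are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class State(Enum):
    INITIAL = "initial"
    READY = "ready to be validated"


@dataclass
class Task:
    has_invalid_values: bool = False
    state: State = State.INITIAL

    def mark_ready(self) -> None:
        # Mirrors the red lock icon: invalid values must be corrected first.
        if self.has_invalid_values:
            raise ValueError("correct invalid values before marking the task as ready")
        self.state = State.READY

    def modify(self, now_invalid: bool = False) -> None:
        # Any further edit returns the task to its initial state.
        self.has_invalid_values = now_invalid
        self.state = State.INITIAL


def completion_percentage(tasks: List[Task]) -> float:
    """Share of tasks marked as ready, as a percentage (assumed formula)."""
    if not tasks:
        return 0.0
    ready = sum(1 for task in tasks if task.state is State.READY)
    return 100 * ready / len(tasks)


tasks = [Task(), Task(), Task(), Task(has_invalid_values=True)]
tasks[0].mark_ready()
tasks[1].mark_ready()
print(completion_percentage(tasks))  # 50.0

tasks[1].modify()  # editing a ready task resets it
print(completion_percentage(tasks))  # 25.0
```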
- Click VALIDATE CHOICES in the top right corner to validate the changes and move the tasks out of your list.
Master records are created and the records which are validated are moved to the list of the campaign participant who is granted the ACCOUNT VALIDATOR role in this example.