In this example, duplicate client records come from different sources.
Talend Data Stewardship initially determines which attributes of the matched
records to use to create the master record, according to the survivorship
rules defined when the campaign was created.
Data stewards can then review the tasks and manually modify the survivorship rules per
record attribute, or enter completely new values, to reach the most accurate and reliable
master records.
Before you begin
- A campaign owner has created the campaign and granted you access to it.
For further information, see Defining roles in the campaign.
- A campaign owner has assigned you tasks in the campaign.
- You have accessed Talend Data Stewardship as a data steward.
Procedure
- On the MY TASKS page, click the campaign name,
RECONCILING CLIENT DATA in this example, to open a list of
the tasks assigned to you.
The quality bar at the top of the list uses colors to give you a clear view
of the quality of the data in each column. Pointing to a color gives
you details about the data values in the selected column.
- To filter the data on which you want to work, click a color in the quality bar
on top of a column to list the tasks which match the color indication:
Option | Description |
Green | Represents valid data which matches the column type. |
White | Represents empty fields. However, an empty value for a mandatory field is marked as red, not white. |
Red | Represents invalid data which does not match the column type or the parameter set in the data model. |
- Click the down arrow on the top-left corner of the task list to expand all the
tasks, or click the down arrow of a specific task to expand it.
- Set survivorship rules to select attributes from customer records and use them
to build the master records. Several approaches are possible:
- Set a survivorship rule manually for one or several attributes of a
record: point to an attribute in the master record of a task and, from
the icons that are displayed, select the survivorship rule you want to
apply.
- Most common: selects the most common attribute value among the duplicates.
- Most recent: selects the most recent attribute value among the duplicates.
- Most trusted: selects the most trusted attribute value among the duplicates.
Survivorship icons are grayed out when the survivorship rule is
not applicable to the selected record.
- Set a survivorship rule manually for one attribute of multiple records.
- Click a column heading, Last_Name for example, to display the
Survivorship panel to the right of the task list.
- In the panel to the right, scroll down to the Apply survivorship rule
menu and click the button. From the Rule list, select Most common as the
survivorship rule you want to apply to the name attribute in all the
tasks in the list.
- Click Submit to select the most common name
values and add them to the master records of all the tasks.
- Select the value of a given source attribute to be the value for the
master record: point to a source attribute and click the up arrow to set the
selected value in the master record.
- Double-click the value in the master record and set a value of your choice
which is not present in any of the sources.
- If the lock icon has a red background, correct the invalid value in the
task. Only then can you mark the task as ready to be validated.
- Repeat the above step to merge records and create master records for all the
tasks assigned to you.
- Click the lock icon next to the data record you modified to
mark the task as ready to be validated.
The first field is marked with a green background, and the percentage of
tasks you have completed is calculated and displayed in the top right
corner.
You can modify records that are ready to be validated again, but this puts the
task back to its initial state with a dark-grey background color. You then need to
click the lock icon again to mark the task as ready for validation.
- Click VALIDATE CHOICES in the top right corner to
validate the changes and move the task out of your list.
Results
Master records are created, and the validated records are moved to the list
of the campaign participant who is granted the ACCOUNT VALIDATOR role in
this example.
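
Example
The three survivorship rules used in the procedure (most common, most recent, most trusted) can be illustrated with a small Python sketch. This is not Talend Data Stewardship code or its API: the SourceValue class, the function names, the sample values and the trust scores are all hypothetical, chosen only to show how each rule could pick one value for a master record attribute from a group of duplicates.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date
from typing import Dict, List


@dataclass
class SourceValue:
    """One duplicate's value for a single attribute (hypothetical structure)."""
    value: str       # attribute value, for example a last name
    source: str      # name of the system the record came from
    modified: date   # last modification date of the source record


def most_common(values: List[SourceValue]) -> str:
    """Pick the value that occurs most often among the duplicates."""
    counts = Counter(v.value for v in values)
    return counts.most_common(1)[0][0]


def most_recent(values: List[SourceValue]) -> str:
    """Pick the value from the most recently modified duplicate."""
    return max(values, key=lambda v: v.modified).value


def most_trusted(values: List[SourceValue], trust: Dict[str, int]) -> str:
    """Pick the value from the source with the highest trust score.

    Sources missing from the trust mapping are assumed to score 0.
    """
    return max(values, key=lambda v: trust.get(v.source, 0)).value


# Hypothetical duplicates for the Last_Name attribute of one task.
duplicates = [
    SourceValue("Smith", "crm", date(2020, 1, 5)),
    SourceValue("Smith", "web", date(2021, 3, 1)),
    SourceValue("Smyth", "erp", date(2022, 6, 9)),
]
# Hypothetical trust scores: a higher number means a more trusted source.
trust = {"crm": 3, "erp": 2, "web": 1}

print(most_common(duplicates))          # Smith
print(most_recent(duplicates))          # Smyth
print(most_trusted(duplicates, trust))  # Smith
```

In this sketch the rules disagree on the same group of duplicates, which is the kind of situation where a data steward would review the task and pick a rule, or a value, per attribute.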