Data stewards with access rights to the Merging campaign created automatically in Talend Data Stewardship need to access the merging tasks and manually merge duplicate
records into one master record. This sends resolved data back to the Staging Area in
Talend MDM Web UI.
In a Merging campaign, you can only modify values in the master
fields; values in the source fields cannot be modified.
Merging data values and validating your modifications transitions the task to the Resolved
state. You cannot validate a task as long as it contains at least one invalid
value.
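The validation rule above can be pictured with a minimal sketch. This is not Talend code: the MergingTask class, its fields and the State enum are hypothetical names introduced only to illustrate that a task moves to the Resolved state on validation, and that validation is refused while any master value is invalid.

```python
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    NEW = "New"
    RESOLVED = "Resolved"


@dataclass
class MergingTask:
    """Hypothetical model of one merging task (illustration only)."""
    master: dict                                   # editable master-record values
    invalid_fields: set = field(default_factory=set)
    state: State = State.NEW

    def set_master_value(self, name, value, valid=True):
        # Only master values are editable; source values are never touched here.
        self.master[name] = value
        if valid:
            self.invalid_fields.discard(name)
        else:
            self.invalid_fields.add(name)

    def validate(self):
        # A task with at least one invalid value cannot be validated.
        if self.invalid_fields:
            raise ValueError(f"invalid values in: {sorted(self.invalid_fields)}")
        self.state = State.RESOLVED


task = MergingTask(master={"name": "Jon Smith"})
task.set_master_value("email", "not-an-email", valid=False)
try:
    task.validate()
except ValueError as err:
    print("Validation refused:", err)

task.set_master_value("email", "jon.smith@example.com")
task.validate()
print(task.state.value)
```

In this sketch the validity of a value is passed in by the caller; in the product it is determined by the campaign's data model, which is not reproduced here.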
In this example, duplicate records come from Talend MDM as a result of integrated
matching processes used to validate customer records against a match rule.
Talend Data Stewardship initially determines which attributes to use
to create the master record according to the survivorship rules created in
Talend MDM and deployed on the server with the Merging
campaign. However, you may need to manually change the survivorship rule for
individual record attributes, or enter completely new values, to obtain the
most accurate and reliable master records.
Procedure
-
Log in as a data steward.
-
On the Tasks page, click
the search icon in the top-right corner and enter tmdm to list only the Merging campaigns that were created by integrated matching and for which you
have access rights.
-
Click the campaign name to open the list of the tasks assigned to you.
-
Use the quality bar on top of each of the columns to filter the data on which
you want to work in the Chart or
Pattern views in the right-hand panel.
-
Click the down arrow in the top-left corner to expand all tasks in the list, or
click the down arrow of a specific task to expand it.
-
Set survivorship rules to select attributes from customer
records and use them to build the master records. Several approaches are
possible:
-
Set a survivorship rule manually for one or several
attributes of a record: point to an attribute in the master record of a
task and, from the icons that are displayed, select the survivorship rule you
want to apply.
-
Selects the first valid
attribute value among the duplicates. "First" is defined by the
order of the records when the task is created.
-
Selects the most common
attribute value among the duplicates.
-
Selects the most recent
attribute value among the duplicates.
-
Selects the most trusted
attribute value among the duplicates coming from different
sources.
Icons are grayed out when a rule is not
applicable to the selected attribute.
-
Set a survivorship rule manually for one attribute of
multiple records.
- Click a column heading and in the right-hand panel
browse to the Survivorship
section.
- Click the button and from the
Survivorship rule list,
select the rule you want to apply to all values in the selected
column.
- Click Submit
to apply the rule. In this example, the most common name values are
selected and added to the master records of the tasks.
- Select the value of a given source attribute to be the
value for the master record: point to a source attribute and click the up
arrow to set the selected value in the master record.
-
Repeat the above step to merge records and create master records for all the
tasks assigned to you.
If a given column has values that need to be fixed, you can transform them
in bulk by using the functions listed in the right-hand panel.
-
Click the lock icon next to the data record you modified to mark the task as ready to be
validated.
The record is marked with a green background and the lock icon automatically
moves to the next record. You can modify a record again after it is marked as
ready to be validated, but this puts the task back to its initial state with a
dark-grey background color. You then need to click the lock icon again to mark
the task as ready for validation.
If the lock icon has a red background color, you must first
correct the invalid value in the task before you can mark it as ready to
be validated.
-
Click Validate in the top-right corner of the
page to validate the modifications you have made to the records.
Master records are created, and the validated records are removed
from the list and marked as resolved.
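The four survivorship rules described in the procedure above can be sketched as follows. This is only an illustration of the selection logic, not the Talend implementation: the SourceValue class, the function names, the sample data and the trust_order list are all hypothetical, and "valid" is assumed here to mean simply "not empty".

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class SourceValue:
    """One attribute value taken from a duplicate record (hypothetical model)."""
    value: Optional[str]
    source: str      # system the record came from
    modified: date   # last modification date of the record


def is_valid(v):
    # Assumption: a value is valid when it is not empty.
    return v.value is not None and v.value.strip() != ""


def first_valid(values):
    # "First" follows the order of the records in the task.
    return next((v.value for v in values if is_valid(v)), None)


def most_common(values):
    counts = Counter(v.value for v in values if is_valid(v))
    return counts.most_common(1)[0][0] if counts else None


def most_recent(values):
    valid = [v for v in values if is_valid(v)]
    return max(valid, key=lambda v: v.modified).value if valid else None


def most_trusted(values, trust_order):
    # trust_order lists sources from most trusted to least trusted.
    rank = {source: i for i, source in enumerate(trust_order)}
    valid = [v for v in values if is_valid(v) and v.source in rank]
    return min(valid, key=lambda v: rank[v.source]).value if valid else None


# Sample "name" attribute from four duplicate customer records.
duplicates = [
    SourceValue("", "web", date(2023, 1, 5)),
    SourceValue("Jon Smith", "crm", date(2023, 3, 1)),
    SourceValue("John Smith", "erp", date(2023, 6, 10)),
    SourceValue("Jon Smith", "web", date(2022, 11, 20)),
]

print(first_valid(duplicates))                           # Jon Smith
print(most_common(duplicates))                           # Jon Smith
print(most_recent(duplicates))                           # John Smith
print(most_trusted(duplicates, ["erp", "crm", "web"]))   # John Smith
```

The sample shows why different rules can yield different master values for the same attribute, which is the situation in which a data steward may want to override the rule applied initially.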
Results
Approved tasks are transitioned to the Resolved state in the workflow and marked as
resolved. Rejected tasks are transitioned back to the initial step in the workflow
and marked as new.
What to do next
A data steward in Talend MDM Web UI needs
to run the Staging validation again to take into account the change in the record
status.