Performing integrated matching tasks in the Staging Area - 6.5

Talend MDM Web User Interface User Guide

Version: 6.5
Products: Talend Data Fabric, Talend MDM Platform, Talend Open Studio for MDM
Task: Data Governance
Platform: Talend MDM Web UI

As part of the validation process for records in the Staging Area, you can also perform a series of integrated matching tasks. These tasks group similar records together in order to create a "golden" record, that is, a consolidated version of all the records in the group.

Using Talend Data Stewardship Console during the integrated matching process

The Talend Data Stewardship Console has been deprecated since Talend 6.4. Consider migrating to Talend Data Stewardship.

If Talend Data Stewardship Console is used to handle the tasks generated during integrated matching, the process can be broken down into the following sub-tasks:

  1. Identify similar records and decide which values survive (a process known as "match and survivorship").

  2. Build golden records.

  3. Create or update a task in Talend Data Stewardship Console for each golden record.

  4. Process any changes to the record following the Talend Data Stewardship Console task.

  5. Insert or update records in the master database.

MDM performs each of these tasks automatically, based on criteria you have set. Human intervention is only required to act on the task created in Talend Data Stewardship Console for each golden record, and only for match groups below the confident match threshold.

Note

If a group has only one staging record, the Talend Data Stewardship Console task will not be created.
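The rule for when a steward task is needed can be sketched as follows. This is only an illustration of the conditions described above, not MDM's actual implementation: the `MatchGroup` type, the 0.0 to 1.0 confidence scale, and the threshold value are all assumptions made for the example (the real threshold comes from the Match Rule).

```python
from dataclasses import dataclass
from typing import List

# Hypothetical value; the real threshold is defined in the Match Rule.
CONFIDENT_MATCH_THRESHOLD = 0.9


@dataclass
class MatchGroup:
    """Hypothetical stand-in for a group of matching staging records."""
    group_id: str
    record_ids: List[str]
    confidence: float  # assumed scale: 0.0 (no confidence) to 1.0 (certain)


def needs_steward_task(group: MatchGroup,
                       threshold: float = CONFIDENT_MATCH_THRESHOLD) -> bool:
    """Return True when a data steward would have to review the group."""
    if len(group.record_ids) <= 1:
        # A group with a single staging record never produces a task.
        return False
    # Only groups below the confident match threshold need human review.
    return group.confidence < threshold
```

For example, a two-record group with a confidence of 0.5 would need a task, while a single-record group would not, whatever its confidence.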

Using Talend Data Stewardship during the integrated matching process

If Talend Data Stewardship is used to handle the tasks generated during integrated matching, the process can be broken down into the following sub-tasks:

  1. Identify similar records and decide which values survive (a process known as "match and survivorship").

  2. Build golden records.

  3. Create or update a task in Talend Data Stewardship for each golden record.

  4. Process any changes to the record following the Talend Data Stewardship task.

  5. Insert or update records in the master database.

MDM performs each of these tasks automatically, based on criteria you have set. Human intervention is only required to act on the task created in Talend Data Stewardship for each golden record, and only for match groups below the confident match threshold.

If a master data record originates from a golden record, any update made to the master data record is automatically synchronized to the golden record in the Staging Area, provided that the current status of the golden record is still 205. The associated task remains unchanged.

Note

If the status of a golden record is 205 after the match and survivorship process, no Talend Data Stewardship task will be created.
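The synchronization condition for master record updates can be illustrated with a short sketch. The dictionary layout (`status`, `fields`, `task`) and the function name are hypothetical and chosen only for this example; they do not reflect MDM's internal data structures.

```python
VALIDATED = "205"  # record passed validation and exists in the master database


def sync_master_update(golden: dict, master_changes: dict) -> bool:
    """Copy master-record changes to the golden record only if its status is 205.

    Returns True when the changes were applied. The associated task,
    if any, is deliberately left untouched.
    """
    if golden["status"] != VALIDATED:
        return False
    golden["fields"].update(master_changes)
    return True


# Usage: a golden record at status 205 receives the update, its task is unchanged.
golden = {"status": "205", "fields": {"name": "Acme"}, "task": "task-1"}
applied = sync_master_update(golden, {"name": "Acme Corp"})
```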

Viewing the results of the match and survivorship process in the Staging Data browser

If the records which are pending validation in the Staging Area come from a data model that has a Match Rule attached to it, MDM applies this Match Rule to the records when you launch the validation process, in order to check for matches and decide which values to use for the golden record. This is the match and survivorship process.

During the match and survivorship process, the records pending validation are first matched against the golden record of an existing match group. Records that do not match the golden record are then matched against the source record(s) in that match group.
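The matching order described above can be sketched as follows. The `is_match` function is a deliberately simple stand-in (case-insensitive equality) for whatever comparison the Match Rule actually defines, and all names here are assumptions made for illustration.

```python
from typing import Callable, List, Optional


def is_match(a: str, b: str) -> bool:
    """Hypothetical stand-in for the Match Rule: case-insensitive equality."""
    return a.strip().lower() == b.strip().lower()


def match_against_group(candidate: str,
                        golden: str,
                        sources: List[str],
                        matcher: Callable[[str, str], bool] = is_match
                        ) -> Optional[str]:
    """Return which part of the group the candidate matched, if any."""
    # Step 1: compare with the golden record of the existing group first.
    if matcher(candidate, golden):
        return "golden"
    # Step 2: only if that fails, compare with the group's source records.
    if any(matcher(candidate, source) for source in sources):
        return "source"
    return None
```

A candidate that differs from the golden record but equals one of the source records still joins the group through the second step.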

For more information on creating Match Rules and attaching them to a data model, see the Talend Studio User Guide.

The results of the match and survivorship process can be seen in the Staging Data Browser.

The Match group column shows the identifier assigned to each group of matches by MDM.

The Status column shows the results of the process.

Status  Description
000     New or modified record
202     Record successfully identified as being part of a group of matching records
203     Record successfully identified as being part of a group of matching records, but the automatic survivorship could not make a trusted golden record because the confidence in the golden record is too low
204     Record successfully identified as being part of a group of matching records, and this record is the unique (golden) record that is submitted to the master database
205     Record successfully passed the MDM validation phase and also exists in the master database
206     Record was deleted
403     Record failed to pass the MDM validation phase, due to a validation issue against the user data model
404     Record failed to pass the MDM validation phase, due to a constraint issue
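When processing staging records in a script, the status codes listed above can be kept in a lookup table. The following is a minimal sketch: the descriptions are shortened from the table, and the helper names are hypothetical rather than part of any Talend API.

```python
# Staging record status codes, with descriptions shortened from the table above.
STAGING_STATUS = {
    "000": "New or modified record",
    "202": "Part of a group of matching records",
    "203": "Matched, but confidence in the golden record is too low",
    "204": "Golden record submitted to the master database",
    "205": "Passed validation and exists in the master database",
    "206": "Record was deleted",
    "403": "Failed validation against the user data model",
    "404": "Failed validation due to a constraint issue",
}

FAILURE_CODES = {"403", "404"}


def describe_status(code: str) -> str:
    """Return the description for a status code, or a fallback if unknown."""
    return STAGING_STATUS.get(code, "Unknown status")


def is_failure(code: str) -> bool:
    """Return True for the codes that indicate a failed validation."""
    return code in FAILURE_CODES
```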

Depending on the criteria defined in the Match Rule, some records may be identified as potential matches that require manual intervention from a data steward to decide if they are true matches or not.

The next time the Staging validation is run, the process checks how the associated tasks were handled and updates the status of each record accordingly. For instance, if a task is resolved, the status of the record changes from 203 to 204 and the record is immediately submitted to the master storage, ending with a status of 205, 403, or 404.
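The status progression for a record whose task has been resolved can be sketched as two small steps. This is an illustrative model only: the function names and the `failure` labels ("model", "constraint") are assumptions, not MDM identifiers.

```python
from typing import Optional


def status_after_task_review(status: str, task_resolved: bool) -> str:
    """A resolved task moves a 203 record to 204; anything else is unchanged."""
    if status == "203" and task_resolved:
        return "204"
    return status


def status_after_submission(status: str, failure: Optional[str] = None) -> str:
    """Final status of a 204 record once it is submitted to the master storage.

    failure is a hypothetical label: None (success), "model" (validation
    issue against the user data model) or "constraint" (constraint issue).
    """
    if status != "204":
        return status  # only golden records at 204 are submitted
    if failure == "model":
        return "403"
    if failure == "constraint":
        return "404"
    return "205"
```

For example, a 203 record with a resolved task passes through 204 and ends at 205 when the submission succeeds.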

If a record is deleted (status 206), it is removed from the match group to which it belongs and the survivorship process is run again.
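The effect of a deletion on a match group can be illustrated as follows. The survivorship rule used here (keep the first non-empty value per field) is a placeholder assumption, since the actual rule is defined in the Match Rule, and the record layout is hypothetical.

```python
from typing import Dict, List

DELETED = "206"


def drop_deleted(group: List[dict]) -> List[dict]:
    """Remove records with status 206 from the match group."""
    return [record for record in group if record["status"] != DELETED]


def rebuild_golden(records: List[dict]) -> Dict[str, str]:
    """Placeholder survivorship: keep the first non-empty value per field."""
    golden: Dict[str, str] = {}
    for record in records:
        for field, value in record["fields"].items():
            if value and field not in golden:
                golden[field] = value
    return golden


# Usage: record r1 is deleted, so survivorship is rerun on r2 and r3 only.
group = [
    {"id": "r1", "status": "206", "fields": {"name": "Acme", "city": "Paris"}},
    {"id": "r2", "status": "202", "fields": {"name": "Acme Corp", "city": ""}},
    {"id": "r3", "status": "202", "fields": {"name": "ACME", "city": "Lyon"}},
]
remaining = drop_deleted(group)
golden = rebuild_golden(remaining)
```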

After you resolve one or more tasks, you must run the Staging validation again so that the record status changes are taken into account.

Note

If Talend Data Stewardship is used to handle the tasks generated during the integrated matching process, any update that changes the status of a golden record from 205 to 203 (for example, adding a new source record to a match group, or updating an element of a source record) reopens the associated Talend Data Stewardship task, and the updates are synchronized to the task as well.
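The reopening behavior can be sketched as a simple status-change handler. The task dictionary, its `state` values, and the function name are hypothetical and serve only to illustrate the rule; they are not the Talend Data Stewardship API.

```python
def on_golden_status_change(task: dict, old_status: str, new_status: str,
                            updates: dict) -> bool:
    """Reopen the task and copy the updates when a golden record goes 205 -> 203.

    Returns True if the task was reopened.
    """
    if old_status == "205" and new_status == "203":
        task["state"] = "open"        # hypothetical state label
        task["data"].update(updates)  # synchronize the updates to the task
        return True
    return False


# Usage: a new source record lowers confidence, moving the golden record to 203.
task = {"state": "resolved", "data": {"name": "Acme"}}
reopened = on_golden_status_change(task, "205", "203", {"name": "Acme Corp"})
```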