How to create tasks automatically - 6.5

Talend Data Stewardship Console User Guide


The tStewardshipTaskOutput component allows you to create either data resolution tasks or data tasks and list them in the stewardship console database.

When you use this component in a data matching Job, all data resolution tasks corresponding to the existing data conflicts are listed in the stewardship console waiting to be resolved. An authorized data steward can then intervene in order to merge/resolve this data coming from heterogeneous sources.

When you use this component in a data integration Job, data tasks are also listed in the stewardship console, waiting for a steward's intervention to ensure that the data is consistent and complete.

Below is an example of a possible lifecycle of the data resolution tasks listed in Talend Data Stewardship Console:

When multiple sources exist for the same data records and those sources conflict, you must either choose one source or merge data from the different sources to reach the valid records. These records must therefore be matched. Where a match is not found, the problematic data is listed in Talend Data Stewardship Console, where a manual composite matching through human intervention is required. This composite matching resolves the conflicts and produces the final set of data, which is placed in the MDM hub if it is master data, or in any other database, application or file.

This overall flow is translated into one or more Talend Jobs that gather data from multiple sources and use the tMatchGroup component to match the data. Where a match is not found, a tStewardshipTaskOutput component lists the data in Talend Data Stewardship Console. An authorized steward can then do the composite matching on the listed data to reach the final set of data that will be written in the stewardship console database.

Consider, as an example, a Talend Job that uses the tMatchGroup component and has been designed and executed in the Integration perspective in Talend Studio to match customer records coming from three different sources: an SAP system, an Oracle database and an Excel file. Exact and possible matches have been found in the data.

In this Job:

  • a tUnite component merges the data coming from the three sources,

  • a tGenKey component generates a functional key for each input column,

  • a tMatchGroup component compares the columns in the input flow by using a defined matching method and groups similar encountered duplicates together,

  • a tMap component filters the matching results and sends the unique records to a tMDMOutput component to write them in the MDM hub,

  • a tSurviveFields component receives the flow from tMap and merges it based on one or more columns to produce the golden records, which are sent to the MDM hub via a tMDMOutput component,

  • another tMDMOutput component writes the unique records directly in the MDM hub,

  • a tStewardshipTaskOutput component creates the data resolution tasks, which detail all the match and possible match record values that could not be resolved automatically, and lists them in Talend Data Stewardship Console.
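
The routing logic of such a Job can be illustrated outside Talend with a minimal Python sketch. It is not Talend code and does not use any Talend API: the sample records, the functional_key and similarity helpers, the blocking rule and the "all scores equal 1.0" auto-merge rule are all assumptions made for illustration, and the real components are configured graphically in Talend Studio.

```python
from collections import defaultdict
from difflib import SequenceMatcher

# Hypothetical sample records standing in for the flow merged from the three sources.
RECORDS = [
    {"id": 1, "source": "SAP", "name": "John Smith", "city": "Paris"},
    {"id": 2, "source": "Oracle", "name": "John Smith", "city": "Paris"},
    {"id": 3, "source": "Excel", "name": "Jon Smyth", "city": "Paris"},
    {"id": 4, "source": "SAP", "name": "Mary Jones", "city": "Lyon"},
    {"id": 5, "source": "Oracle", "name": "Ann Lee", "city": "Nice"},
    {"id": 6, "source": "Excel", "name": "Ann Lee", "city": "Nice"},
]


def functional_key(record):
    """Illustrative key: first letter of the name plus first three letters of the city."""
    return (record["name"][:1] + record["city"][:3]).upper()


def similarity(a, b):
    """Similarity of two records' names, between 0.0 and 1.0."""
    return SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()


def route(records, auto_merge_threshold=1.0):
    """Split records into unique records, golden records and stewardship tasks."""
    groups = defaultdict(list)
    for record in records:
        groups[functional_key(record)].append(record)

    unique, golden, tasks = [], [], []
    for group in groups.values():
        if len(group) == 1:
            unique.append(group[0])   # no duplicate: goes straight to the hub
            continue
        master = group[0]
        scores = [similarity(master, other) for other in group[1:]]
        if all(score >= auto_merge_threshold for score in scores):
            golden.append(master)     # simplistic survivorship: keep the first record
        else:
            tasks.append(group)       # conflict: left for a data steward
    return unique, golden, tasks


unique, golden, tasks = route(RECORDS)
print("unique:", [r["id"] for r in unique])
print("golden:", [r["id"] for r in golden])
print("tasks: ", [[r["id"] for r in t] for t in tasks])
```

In this simplified model, a group is merged automatically only when every record matches the first one exactly; any group containing a merely possible match becomes one task for manual resolution.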

An authorized steward then intervenes to manually track, merge and resolve the tasks in the task list in order to compose the valid data records and store them in the stewardship console database. Another Talend Job, which uses a tStewardshipTaskInput component and an output component, writes the data in the MDM hub or in any MDM target application. The tStewardshipTaskInput component reads the resolved tasks from the stewardship console database and sends the records stored in those tasks to the output component, which populates the MDM hub or any other MDM target application with these data records.
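
The second Job can be pictured the same way. The sketch below is a plain Python illustration under stated assumptions, not the behavior of the actual components: the task structure, the "resolved" and "new" status values and the dictionary used as a stand-in for the MDM hub are all hypothetical.

```python
# Hypothetical contents of the stewardship console database.
TASKS = [
    {"task_id": "t1", "status": "resolved",
     "record": {"id": 1, "name": "John Smith", "city": "Paris"}},
    {"task_id": "t2", "status": "new", "record": None},
]


def read_resolved(tasks):
    """Return the records of resolved tasks only (role played by the input component)."""
    return [task["record"] for task in tasks if task["status"] == "resolved"]


def write_to_hub(records, hub):
    """Upsert records into the target, keyed by id (role played by the output component)."""
    for record in records:
        hub[record["id"]] = record
    return hub


hub = write_to_hub(read_resolved(TASKS), {})
print(hub)
```

Tasks that a steward has not yet resolved are skipped, so the target only ever receives validated records.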

For further information about the Talend components necessary to design such a Job, see the Talend Components Reference Guide, especially the data quality and MDM component chapters.

For an example scenario, see the scenario of tStewardshipTaskOutput in the Talend Components Reference Guide.