tStewardshipTaskInput - 6.1

Talend Components Reference Guide


Warning

This component is available in the Palette of Talend Studio, but you can use it only if you have subscribed to the relevant Talend Platform product.

tStewardshipTaskInput properties

Component family

Talend MDM

 

Function

tStewardshipTaskInput reads data sets stored in the stewardship console database in the form of tasks. This component can retrieve tasks according to certain search criteria or without any search criteria, on the condition that the output schema is the same for all the retrieved tasks.

Note

To better understand the purpose of this component, see the Talend Data Stewardship Console User Guide.

Purpose

This component reads data from the data stewardship console so that you can process it and then use any Talend output component to write the retrieved data into the target application or database.

Basic Settings

Schema and Edit Schema

A schema is a row description. It defines the number of fields that will be processed and passed on to the next component. The schema is either built-in or stored remotely in the Repository.

Click Edit schema to make changes to the schema. If the current schema is of the Repository type, three options are available:

  • View schema: choose this option to view the schema only.

  • Change to built-in property: choose this option to change the schema to Built-in for local changes.

  • Update repository connection: choose this option to change the schema stored in the repository and decide whether to propagate the changes to all the Jobs upon completion. If you just want to propagate the changes to the current Job, you can select No upon completion and choose this schema metadata again in the [Repository Content] window.

 

 

Built-in: The schema will be created and stored for this component only. Related Topic: see Talend Studio User Guide.

 

 

Repository: The schema already exists and is stored in the repository. You can reuse it in various projects and jobs. Related Topic: see Talend Studio User Guide.

 

Url

Enter the URL to access the Talend Data Stewardship Console application.

For more information about the URL settings, see How to set the URL to access Talend Data Stewardship Console.

 

User name and Password

Enter the authentication information for the MDM server.

To enter the password, click the [...] button next to the password field, and then in the pop-up dialog box enter the password between double quotes and click OK to save the settings.

 

Type

If required, select the type of the tasks you want to read:

Resolution: data resolution tasks represent the results of the data matching processes done on data across heterogeneous sources.

Data: data integrity tasks are the results of the data integrity processes done on data.

For further information on task types and task management, see Talend Data Stewardship Console User Guide.

 

Owner

If required, type in the name of the task owner. This will filter the tasks according to the task owner.

 

Tag

If required, type in the name of the tag category associated with the tasks you want to read.

For further information, see Talend Data Stewardship Console User Guide.

 

Start Date/End Date

If required, set the task creation date range within which you want to read the tasks. Use the following format: yyyy-mm-dd hh:mm:ss.
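As an illustration of the expected date layout, the following minimal Java sketch builds two such strings. It assumes that the documented layout means year-month-day followed by a 24-hour time, which in java.time pattern letters is written yyyy-MM-dd HH:mm:ss. The class name TaskDateRange and the sample dates are hypothetical and are not part of the component.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

final class TaskDateRange {
    // Assumption: the documented "yyyy-mm-dd hh:mm:ss" layout is year-month-day
    // plus a 24-hour time, written "yyyy-MM-dd HH:mm:ss" in java.time letters.
    static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Formats a date-time as a string suitable for the Start Date/End Date fields.
    static String format(LocalDateTime dateTime) {
        return dateTime.format(FORMAT);
    }

    public static void main(String[] args) {
        // Hypothetical range covering the year 2015.
        System.out.println(format(LocalDateTime.of(2015, 1, 1, 0, 0, 0)));      // 2015-01-01 00:00:00
        System.out.println(format(LocalDateTime.of(2015, 12, 31, 23, 59, 59))); // 2015-12-31 23:59:59
    }
}
```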

 

Status

If required, select from the list the task status according to which you want to filter the retrieved tasks.

 

Star ranking

If required, select the number of stars, 0 through 5, assigned to the tasks as a numerical rating to highlight importance. This will filter the tasks according to their importance.

 

Limit

If required, set the maximum number of tasks to be retrieved. If Limit = 0, no task is read.

 

Target record only

(Selected by default)

When this check box is selected, the component retrieves only the target record from each task. When it is cleared, the component retrieves the source record(s) in addition to the target record.

Advanced settings

tStatCatcher Statistics

Select this check box to gather the processing metadata at the Job level as well as at each component level.

Global Variables

ERROR_MESSAGE: the error message generated by the component when an error occurs. This is an After variable and it returns a string. This variable functions only if the Die on error check box is cleared, if the component has this check box.

NB_LINE: the number of rows processed. This is an After variable and it returns an integer.

A Flow variable functions during the execution of a component while an After variable functions after the execution of the component.

To fill up a field or expression with a variable, press Ctrl + Space to access the variable list and choose the variable to use from it.

For further information about variables, see Talend Studio User Guide.
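As a rough illustration of how these variables are read, the sketch below looks them up in a map keyed by component name and variable name. It assumes the usual Talend convention in which a Job exposes a globalMap and the key has the form componentName_VARIABLE; the component name tStewardshipTaskInput_1 is an assumed label, and the map here is only a stand-in for the one a generated Job provides.

```java
import java.util.HashMap;
import java.util.Map;

final class GlobalVariableSketch {
    // Reads the NB_LINE After variable; returns 0 when it has not been set yet.
    static int rowCount(Map<String, Object> globalMap, String componentName) {
        Object value = globalMap.get(componentName + "_NB_LINE");
        return value == null ? 0 : (Integer) value;
    }

    // Reads the ERROR_MESSAGE After variable; returns null when no error was recorded.
    static String errorMessage(Map<String, Object> globalMap, String componentName) {
        return (String) globalMap.get(componentName + "_ERROR_MESSAGE");
    }

    public static void main(String[] args) {
        // Stand-in for the globalMap of a running Job (assumption).
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("tStewardshipTaskInput_1_NB_LINE", 3);

        System.out.println(rowCount(globalMap, "tStewardshipTaskInput_1"));
        System.out.println(errorMessage(globalMap, "tStewardshipTaskInput_1"));
    }
}
```

Because both variables are After variables, reading them in a component that runs in the same subjob flow as tStewardshipTaskInput may return no value; reading them in a later subjob is the safer choice.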

Usage

Use this component as a start component. It needs an output flow.

Log4j

If you are using a subscription-based version of the Studio, the activity of this component can be logged using the log4j feature. For more information on this feature, see Talend Studio User Guide.

For more information on the log4j logging levels, see the Apache documentation at http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/Level.html.

 

Scenario: Reading data in the stewardship console database

This scenario describes a two-component Job that reads data sets (data records) from the tasks stored in the database of Talend Data Stewardship Console according to the criteria you define in the Basic settings view of the tStewardshipTaskInput component.

Then you can use any Talend output component to write the data retrieved from the stewardship database into the target application or database.

In this example, the filtered data is fetched and displayed in the log console.

  • Drop the tStewardshipTaskInput and tLogRow components from the Palette onto the design workspace.

  • Connect the two components together using a Row Main link.

  • Double-click tStewardshipTaskInput to open the Basic settings view and define the component properties.

  • In the Schema list, select Built-in and click the [...] button next to Edit schema to open a dialog box.

    Here you can define the structure of the data you want to read from the Talend Data Stewardship Console database.

Note

The default schema columns in the schema dialog box vary according to whether the Target record only check box is selected or not.

If the Target record only check box is selected, the schema contains only the default columns.

If the Target record only check box is not selected, the default schema has two extra columns: TARGET and SOURCE.

The TARGET column indicates which data set is the target record in each of the tasks in the database. The SOURCE column indicates the name of the source application for each source record in the tasks.

In this scenario, data is collected from three user-defined input columns, Firstname, Lastname and DOB, in addition to all the default columns.

  • Click OK to close the dialog box and proceed to the next step.

  • In the Url field, enter the URL for connecting to the stewardship console database.

  • In the User name and Password fields, enter your user name and password to connect to the MDM server.

  • From the Type list, select the type of the tasks from which you want to retrieve data records: Resolution or Data. In this example, you want to retrieve data only from resolution tasks.

    For further information on task type, see Talend Data Stewardship Console User Guide.

  • In the Owner field, enter between double quotes the name of the task owner, that is, the user to whom the task is assigned. In this example, the owner is Administrator.

Note

Tasks can be assigned to a specific user either from the Basic settings view of the tStewardshipTaskOutput component, or directly from the stewardship console by an administrator. For further information, see tStewardshipTaskOutput.

  • In the Tag field, enter between double quotes the name of the tag category associated with the tasks you want to read. This field is not used in this example.

    For further information, see Talend Data Stewardship Console User Guide.

  • In the Start Date and End Date fields, enter between double quotes the task creation date range within which you want to read the tasks. These fields are not used in this example.

  • In the Status field, select the task status to decide from what tasks you want to retrieve data. In this example, you want to retrieve data only from resolved tasks.

  • In the Star ranking field, select from the list the number of stars, 0 through 5, assigned to the tasks in the stewardship console. This enables you to filter the tasks from which you want to retrieve data by their star rating.

    In this example, select 2 from the list. Data will be retrieved from all tasks that have been assigned a star rating of 2 or lower.

Note

If you select All from the list, you will retrieve data from all tasks regardless of the star ranking assigned to each of them.

  • In the Limit field, enter a number to limit the tasks to retrieve from the stewardship database.

  • Leave the Target record only check box selected in order to retrieve from the tasks only the target record.

  • In the design workspace, double-click the tLogRow component to display its Basic settings view and set the component properties.

  • Click Edit Schema to open the schema dialog box and ensure that the schema has been collected from the previous component. If not, click Sync Columns.

  • Save the Job and press F6 to execute it.

The tStewardshipTaskInput component has retrieved from the stewardship console database the target data records from all resolved tasks that have been assigned a star rating of 0 through 2. The output schema is the same for all the retrieved records.
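The combined effect of the search criteria set in this scenario can be sketched in plain Java as follows. This is only an in-memory illustration of the filtering logic, not the component's implementation: the Task class, its fields and the sample values are hypothetical, and the real component queries the stewardship console rather than a list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class TaskFilterSketch {
    // Hypothetical in-memory stand-in for a stewardship task.
    static final class Task {
        final String type;
        final String owner;
        final String status;
        final int stars;

        Task(String type, String owner, String status, int stars) {
            this.type = type;
            this.owner = owner;
            this.status = status;
            this.stars = stars;
        }
    }

    // Keeps resolved resolution tasks owned by Administrator whose star rating
    // is at most maxStars, and stops once limit tasks have been selected.
    // A limit of 0 therefore selects nothing, as described for the Limit field.
    static List<Task> select(List<Task> tasks, int maxStars, int limit) {
        List<Task> selected = new ArrayList<>();
        for (Task task : tasks) {
            if (selected.size() >= limit) {
                break;
            }
            boolean matches = task.type.equals("Resolution")
                    && task.owner.equals("Administrator")
                    && task.status.equals("Resolved")
                    && task.stars <= maxStars;
            if (matches) {
                selected.add(task);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        // Hypothetical sample tasks.
        List<Task> tasks = Arrays.asList(
                new Task("Resolution", "Administrator", "Resolved", 0),
                new Task("Resolution", "Administrator", "Resolved", 4),
                new Task("Data", "Administrator", "Resolved", 1),
                new Task("Resolution", "Administrator", "Resolved", 2));

        System.out.println(select(tasks, 2, 10).size()); // 2
        System.out.println(select(tasks, 2, 0).size());  // 0
    }
}
```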

  • Clear the Target record only check box in order to retrieve from the tasks all source and target records.

    This will retrieve more than one row for each task.

  • In the Basic settings view of the tLogRow component, click Sync columns to synchronize the schema between the input link (which now has two extra columns) and the tLogRow component.

  • Save the Job and press F6 to execute it.

    The tStewardshipTaskInput component retrieves from the stewardship console database both the target and source data records from all resolved tasks that have been assigned a star rating of 0 through 2.

The capture shows an example of the data retrieved from one of the tasks in the stewardship console database. Three rows are output for this task: the target record, where TARGET = true, and two source records, where TARGET = false and SOURCE = CRM.
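The difference between the two settings of the Target record only check box can be sketched as follows. This is an illustration under stated assumptions rather than the component's actual output: the record values are invented sample data, rows are shown as pipe-separated strings, the TARGET and SOURCE columns are placed at the end of each row, and the SOURCE value of the target row is left empty.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class TargetRecordSketch {
    // Builds the rows output for one task.
    // targetOnly = true:  one row holding the target record.
    // targetOnly = false: the target row plus one row per source record,
    //                     each with TARGET and SOURCE columns appended
    //                     (column order and empty SOURCE are assumptions).
    static List<String> rows(String target, List<String> sources,
                             String sourceApp, boolean targetOnly) {
        List<String> rows = new ArrayList<>();
        if (targetOnly) {
            rows.add(target);
            return rows;
        }
        rows.add(target + "|true|");
        for (String source : sources) {
            rows.add(source + "|false|" + sourceApp);
        }
        return rows;
    }

    public static void main(String[] args) {
        // Hypothetical Firstname|Lastname|DOB values.
        String target = "John|Doe|1980-01-01";
        List<String> sources = Arrays.asList("Jon|Doe|1980-01-01", "John|Do|1980-01-01");

        System.out.println(rows(target, sources, "CRM", true).size());  // 1
        for (String row : rows(target, sources, "CRM", false)) {
            System.out.println(row);
        }
    }
}
```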