tStewardshipTaskOutput properties - 6.3

Talend Components Reference Guide


Component family

Talend MDM

 

Basic settings

Schema and Edit Schema

A schema is a row description. It defines the number of fields to be processed and passed on to the next component. The schema is either Built-in or stored remotely in the Repository.

Click Edit schema to make changes to the schema. If the current schema is of the Repository type, three options are available:

  • View schema: choose this option to view the schema only.

  • Change to built-in property: choose this option to change the schema to Built-in for local changes.

  • Update repository connection: choose this option to change the schema stored in the repository and decide whether to propagate the changes to all the Jobs upon completion. If you just want to propagate the changes to the current Job, you can select No upon completion and choose this schema metadata again in the [Repository Content] window.

 

 

Built-in: You create the schema and store it locally for this component only. Related topic: see Talend Studio User Guide.

 

 

Repository: You have already created the schema and stored it in the Repository. You can reuse it in various projects and Job designs. Related topic: see Talend Studio User Guide.

 

Url

Enter the appropriate URL to access the Talend Data Stewardship Console application.

For more information about the URL settings, see How to set the URL to access Talend Data Stewardship Console.

 

Username and Password

Type in the user authentication data for the stewardship console database.

To enter the password, click the [...] button next to the password field, and then in the pop-up dialog box enter the password between double quotes and click OK to save the settings.

 

Task name

Type in a name for the task you want to list in the Talend Data Stewardship Console.

 

Type

Select the type of the tasks you want to write:

Resolution: data resolution tasks represent the results of the data matching processes performed on data across heterogeneous sources.

Data: data integrity tasks represent the results of the data integrity processes performed on data.

For further information on task types and task management, see Talend Data Stewardship Console User Guide.

 

Created by

Type in the name of the task creator.

Note

The task creators correspond to the users of Talend MDM Web User Interface. For further information, see Talend MDM Web User Interface User Guide.

 

Owner

Type in the name of the task owner.

Note

The task owners correspond to the users of Talend MDM Web User Interface. For further information, see Talend MDM Web User Interface User Guide.

 

Star

Type in a number from 0 to 5 to assign to the tasks as a star rating that highlights their importance.

 

Tag

Type in the name of the tag category you want to associate with the tasks you want to write.

Warning

The tag categories must have been created in the stewardship console beforehand. For further information about how to create tag categories, see Talend Data Stewardship Console User Guide.

Note

Available for resolution tasks only.

Looping column

Select a column in the input schema on which to base the loop. Whenever the looping column value changes, the component will close the previous element (task) and open a new one (new task).

Note

The looping column is typically the group id generated by the tMatchGroup component. For further information, see tMatchGroup.
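The looping behavior described above can be sketched in plain Java. This is only an illustration, not the component's actual implementation: the `Row` class, its `groupId` and `name` fields, and the `splitIntoTasks` method are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

class LoopingColumnSketch {

    /** Hypothetical input row: groupId plays the role of the looping column. */
    static class Row {
        final String groupId;
        final String name;

        Row(String groupId, String name) {
            this.groupId = groupId;
            this.name = name;
        }
    }

    /** Starts a new task each time the looping column value changes. */
    static List<List<Row>> splitIntoTasks(List<Row> rows) {
        List<List<Row>> tasks = new ArrayList<>();
        List<Row> current = null;
        String previousKey = null;
        for (Row row : rows) {
            if (current == null || !Objects.equals(row.groupId, previousKey)) {
                current = new ArrayList<>(); // close the previous task, open a new one
                tasks.add(current);
                previousKey = row.groupId;
            }
            current.add(row);
        }
        return tasks;
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
                new Row("G1", "Ann Smith"),
                new Row("G1", "Anne Smith"),
                new Row("G2", "Bob Jones"));
        System.out.println(splitIntoTasks(rows).size() + " tasks");
    }
}
```

Under this reading, rows that share a looping value need to arrive next to each other in the input flow. Otherwise the same group would be split across several tasks.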

 

Source/Target selector

Select the column in the input schema that determines whether each task record, grouped according to the looping column, is a target record or a source record.

 

Source

Select a column in the input schema.

Note

Available for resolution tasks only.

Score

Select the matching score column in the input schema.

Note

Available for resolution tasks only.

Weights

Select the column that defines the matching distance for each column in the input schema.

 

Extra info

If required, use the plus button to add one or more rows for any extra information you want to add to any of the source records.

In the Title column, enter the information key.

In the Message column, enter the information you want to add. In the Column column, click in the added row and select the source column to which you want to add the extra information.

The data steward can see this added information whenever they place the pointer on the source record column in the Talend Data Stewardship Console. This information helps them make a more informed decision when resolving the task.

 

Record column

Use the plus button to add as many rows as needed and then click in each of the rows and select the columns in the input schema that will form the target record.

 

Max tasks per commit

Define the maximum number of tasks to write per commit.
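As a rough illustration of what a per-commit limit means, the sketch below splits a list of tasks into batches that never exceed a given size. How the component actually commits tasks is not documented here, so the `toCommits` method and its parameter names are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CommitBatchingSketch {

    /** Splits tasks into consecutive batches of at most maxTasksPerCommit items. */
    static <T> List<List<T>> toCommits(List<T> tasks, int maxTasksPerCommit) {
        if (maxTasksPerCommit <= 0) {
            throw new IllegalArgumentException("maxTasksPerCommit must be positive");
        }
        List<List<T>> commits = new ArrayList<>();
        for (int start = 0; start < tasks.size(); start += maxTasksPerCommit) {
            int end = Math.min(start + maxTasksPerCommit, tasks.size());
            commits.add(new ArrayList<>(tasks.subList(start, end)));
        }
        return commits;
    }

    public static void main(String[] args) {
        List<String> tasks = Arrays.asList("t1", "t2", "t3", "t4", "t5");
        // With a limit of 2, the last batch holds the single remaining task.
        for (List<String> commit : toCommits(tasks, 2)) {
            System.out.println("commit " + commit);
        }
    }
}
```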

Advanced settings

tStatCatcher Statistics

Select this check box to gather the processing metadata at the Job level as well as at each component level.

Global Variables

ERROR_MESSAGE: the error message generated by the component when an error occurs. This is an After variable and it returns a string. If the component has a Die on error check box, this variable functions only when that check box is cleared.

NB_LINE: the number of rows processed. This is an After variable and it returns an integer.

A Flow variable functions during the execution of a component while an After variable functions after the execution of the component.

To fill in a field or expression with a variable, press Ctrl + Space to access the variable list and choose the variable to use from it.

For further information about variables, see Talend Studio User Guide.
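In Talend Jobs, component variables are typically read from the Job's globalMap under a key made of the component label and the variable name, for example ((Integer)globalMap.get("tStewardshipTaskOutput_1_NB_LINE")). The self-contained sketch below simulates that lookup with an ordinary map. The component label tStewardshipTaskOutput_1 and the helper method names are assumptions made for this example.

```java
import java.util.HashMap;
import java.util.Map;

class GlobalVariableSketch {

    /** Reads the NB_LINE After variable; returns 0 if it has not been set. */
    static int rowsProcessed(Map<String, Object> globalMap, String componentName) {
        Object value = globalMap.get(componentName + "_NB_LINE");
        return value == null ? 0 : (Integer) value;
    }

    /** Reads the ERROR_MESSAGE After variable; returns null if no error was recorded. */
    static String errorMessage(Map<String, Object> globalMap, String componentName) {
        return (String) globalMap.get(componentName + "_ERROR_MESSAGE");
    }

    public static void main(String[] args) {
        // Stand-in for the globalMap that a Job fills in after the component has run.
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("tStewardshipTaskOutput_1_NB_LINE", 42);

        System.out.println("rows: " + rowsProcessed(globalMap, "tStewardshipTaskOutput_1"));
        System.out.println("error: " + errorMessage(globalMap, "tStewardshipTaskOutput_1"));
    }
}
```

Because both are After variables, a lookup of this kind is only meaningful in a component or subjob that runs after this component has finished.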

Usage

Use this component to write data records held in tasks. This component must have an input flow.

If a Job has too many tasks for the Talend Data Stewardship Console application to handle within the default timeouts, increase the timeout values before executing the Job.

You can customize two timeout values as needed:

  • -Dtaskload_connect_timeout which specifies the timeout value for connecting to the Talend Data Stewardship Console application;

  • -Dtaskload_read_timeout which specifies the timeout value for reading from the Talend Data Stewardship Console application.

By default, both values are 50,000 milliseconds.

To increase the timeout values, do the following:

  1. In the Run view, select the Advanced settings tab.

  2. In the JVM Settings area of the tab view, select the Use specific JVM arguments check box to activate the Argument table.

  3. Next to the Argument table, click the New... button to open the [Set the VM Argument] dialog box.

  4. In the dialog box, enter the JVM argument that sets the timeout value in milliseconds. For example, -Dtaskload_connect_timeout=60000.

  5. Click OK to close the dialog box.

  6. Repeat the steps above to set another timeout value in milliseconds. For example, -Dtaskload_read_timeout=60000.

    For further information about how to apply the JVM argument for all of the Job executions, see Talend Studio User Guide.
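A -D argument passed to the JVM becomes a Java system property with the same name, minus the -D prefix. The sketch below shows one way such a property could be read with a 50,000 millisecond fallback. How the component itself reads these values is not documented here, so the `timeoutMillis` helper is an assumption made for this example.

```java
class TimeoutPropertySketch {

    /** Default taken from the documentation above: 50,000 milliseconds. */
    static final int DEFAULT_TIMEOUT_MS = 50_000;

    /** Returns the system property as an int, or the default if it is unset or not a number. */
    static int timeoutMillis(String propertyName) {
        return Integer.getInteger(propertyName, DEFAULT_TIMEOUT_MS);
    }

    public static void main(String[] args) {
        // Run with, for example:
        //   java -Dtaskload_connect_timeout=60000 TimeoutPropertySketch
        System.out.println("connect timeout: " + timeoutMillis("taskload_connect_timeout") + " ms");
        System.out.println("read timeout: " + timeoutMillis("taskload_read_timeout") + " ms");
    }
}
```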

Log4j

If you are using a subscription-based version of the Studio, the activity of this component can be logged using the log4j feature. For more information on this feature, see Talend Studio User Guide.

For more information on the log4j logging levels, see the Apache documentation at http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/Level.html.