tMDMInput - 6.1

Talend Components Reference Guide

tMDMInput properties

Component family

Talend MDM

 

Function

tMDMInput reads master data in the MDM Hub.

Purpose

This component reads master data in an MDM Hub and thus makes it possible to process this data.

Basic Settings

Property Type

Either Built in or Repository.

Since version 5.6, both the Built-In mode and the Repository mode are available in any of the Talend solutions.

 

 

Built-in: No property data is stored centrally.

 

 

Repository: Select the repository file where properties are stored. The fields that follow are completed automatically using the fetched data.

 

Schema and Edit Schema

A schema is a row description: it defines the number of fields that will be processed and passed on to the next component. The schema is either built-in or stored remotely in the Repository.

Since version 5.6, both the Built-In mode and the Repository mode are available in any of the Talend solutions.

Click Edit schema to make changes to the schema. If the current schema is of the Repository type, three options are available:

  • View schema: choose this option to view the schema only.

  • Change to built-in property: choose this option to change the schema to Built-in for local changes.

  • Update repository connection: choose this option to change the schema stored in the repository and decide whether to propagate the changes to all the Jobs upon completion. If you just want to propagate the changes to the current Job, you can select No upon completion and choose this schema metadata again in the [Repository Content] window.

 

 

Built-in: The schema will be created and stored for this component only. Related Topic: see Talend Studio User Guide.

 

 

Repository: The schema already exists and is stored in the repository. You can reuse it in various projects and Jobs. Related Topic: see Talend Studio User Guide.

Use an existing connection

Select this check box if you want to use a configured tMDMConnection component.
 

MDM version

By default, Server 6.0 is selected. Although migrating existing Jobs to this new version is recommended, the Server 5.6 option is available to ease migration: it lets those Jobs keep working without modification against a 6.0 server. For this to work, an option must be enabled on the server so that it accepts and translates requests from such Jobs.

 

URL

Type in the URL to access the MDM server.

 

Username and Password

Type in user authentication data for the MDM server.

To enter the password, click the [...] button next to the password field, and then in the pop-up dialog box enter the password between double quotes and click OK to save the settings.

 

Entity

Type in the name of the business entity that holds the master data you want to read.

 

Data Container

Type in the name of the data container that holds the master data you want to read.

Type

Select Master or Staging to specify the database on which the action should be performed.

 

Use multiple conditions

Select this check box to filter the master data using certain conditions.

XPath: Enter between quotes the path and the XML node to which you want to apply the condition.

Function: Select the condition to be used from the list. Depending on the type of field pointed to by the XPath, only certain operators may apply; for instance, if the field is a boolean only the Equal or Not Equal operators are appropriate.

The following operators are available:

  • Contains: Returns a result which contains the word or words entered.

  • Joins With: This operator is reserved for future use.

  • Starts With: Returns a result which begins with the string entered.

  • Strict Contains: Returns a result which contains the exact regular expression entered. Applies only to XML databases.

  • Equal: Returns a result which matches the boolean entered; that is, True or False.

  • Not Equal: Returns a result of any value other than the boolean entered; that is, True or False.

  • is greater than: Returns a result which is greater than the numerical value entered. Applies to number fields only.

  • is greater or equal: Returns a result which is greater than or equal to the numerical value entered. Applies to number fields only.

  • is lower than: Returns a result which is less than the numerical value entered. Applies to number fields only.

  • is lower or equal: Returns a result which is less than or equal to the numerical value entered. Applies to number fields only.

  • whole content contains: Performs a plain text search in all the fields of the entity. For SQL databases, a "Starts with" search is performed; for XML databases, a "Contains" search is performed.

  • contains a word like: Performs a fuzzy search to return a similar word to the word entered.

  • is empty or null: Returns a result where the field is empty or returns a null value.

Value: Enter between inverted commas the value you want to use. Note that if the value contains XML special characters such as /, you must also enter the value in single quotes ("'ABC/XYZ'") or the value will be considered as an XPath.

Predicate: Select a predicate if you use more than one condition.

The following predicates are available:

  • Default: Interpreted as an and.

  • or: One of the conditions applies.

  • and: Both or all of the conditions apply.

The other predicates are reserved for future use and may be subject to unpredictable behavior.

If you clear this check box, you have the option of selecting particular IDs to be displayed in the ID value column of the IDS table.

Note

If you clear the Use multiple conditions check box, the Batch Size option in the Advanced Settings tab will no longer be available.
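The quoting rule for the Value column described above can be sketched in Java. This is only an illustration: ConditionValue and forValueColumn are hypothetical names, not part of Talend, and the check covers only the / character that the text gives as an example.

```java
// Hypothetical helper, not part of Talend: prepares the text typed in the Value column.
final class ConditionValue {

    // Wraps the value in single quotes when it contains '/', so that it is
    // treated as a literal value rather than as an XPath.
    // Assumption: only '/' is checked, since it is the one example the text gives.
    static String forValueColumn(String raw) {
        return raw.indexOf('/') >= 0 ? "'" + raw + "'" : raw;
    }

    public static void main(String[] args) {
        System.out.println(forValueColumn("ABC/XYZ")); // 'ABC/XYZ'
        System.out.println(forValueColumn("France"));  // France
    }
}
```

In the Studio, the resulting text is then entered between double quotes, which gives "'ABC/XYZ'" for the first value.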

 

Skip Rows

Enter the number of lines to be ignored.

 

Max Rows

Maximum number of rows to be processed. If this value is set to 0, no row is read or processed.
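As a rough illustration of how Skip Rows and Max Rows interact, the following Java sketch applies both settings to an in-memory list. It assumes that rows are skipped first and the maximum is applied afterwards, which the text above does not state explicitly. RowWindow and its method are hypothetical names, not Talend code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical illustration of the Skip Rows and Max Rows settings.
final class RowWindow {

    // Assumption: skipRows is applied before maxRows.
    // With maxRows == 0 the result is empty, matching "no row is read or processed".
    static <T> List<T> apply(List<T> rows, int skipRows, int maxRows) {
        return rows.stream()
                .skip(skipRows)
                .limit(maxRows)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> codes = Arrays.asList("FR", "DE", "IT", "ES");
        System.out.println(apply(codes, 1, 2)); // [DE, IT]
        System.out.println(apply(codes, 0, 0)); // []
    }
}
```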

 

Die on error

Select this check box to stop the execution of the Job when an error occurs. Clear the check box to skip the row in error and complete the process for error-free rows. If needed, you can retrieve the rows in error via a Row > Rejects link.

Advanced settings

Batch Size

Number of lines in each processed batch.

Note

This option is not displayed if you have cleared the Use multiple conditions check box in the Basic settings view.

 

Loop XPath query

The XML structure node on which the loop is based.

 

Mapping

Column: reflects the schema as defined in the Edit schema editor.

XPath query: Type in the name of the fields to extract from the input XML structure.

Get Nodes: Select this check box to retrieve the XML node together with the data.
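To make the relationship between the Loop XPath query and the Mapping table more concrete, here is a minimal Java sketch that uses the standard javax.xml.xpath API. It is not how the component is implemented. The sample XML, the Country element name and the LoopXPathSketch class are assumptions made for illustration; the column names ISO2Code, Name and Currency are borrowed from the scenario later on this page.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical illustration: one output row per node matched by the loop XPath,
// one column per relative XPath query, as in the Mapping table.
final class LoopXPathSketch {

    // Made-up sample data; real records come from the MDM server.
    static final String SAMPLE_XML =
            "<results>"
          + "<Country><ISO2Code>FR</ISO2Code><Name>France</Name><Currency>EUR</Currency></Country>"
          + "<Country><ISO2Code>JP</ISO2Code><Name>Japan</Name><Currency>JPY</Currency></Country>"
          + "</results>";

    static List<String[]> extract(String xml, String loopXPath, String... fieldXPaths) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            // The loop XPath selects the nodes to iterate over.
            NodeList nodes = (NodeList) xpath.evaluate(
                    loopXPath, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            List<String[]> rows = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Node node = nodes.item(i);
                String[] row = new String[fieldXPaths.length];
                for (int c = 0; c < fieldXPaths.length; c++) {
                    // Each field query is evaluated relative to the current loop node.
                    row[c] = xpath.evaluate(fieldXPaths[c], node);
                }
                rows.add(row);
            }
            return rows;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Invalid XPath or XML input", e);
        }
    }

    public static void main(String[] args) {
        for (String[] row : extract(SAMPLE_XML, "/results/Country", "ISO2Code", "Name", "Currency")) {
            System.out.println(String.join("|", row));
        }
    }
}
```

With the sample data, the loop node Country produces two rows, and each XPath query fills the column of the same name.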

 

tStatCatcher Statistics

Select this check box to gather the processing metadata at the Job level as well as at each component level.

Global Variables

ERROR_MESSAGE: the error message generated by the component when an error occurs. This is an After variable and it returns a string. This variable functions only if the Die on error check box is cleared, if the component has this check box.

NB_LINE: the number of rows processed. This is an After variable and it returns an integer.

A Flow variable functions during the execution of a component while an After variable functions after the execution of the component.

To fill in a field or expression with a variable, press Ctrl + Space to access the variable list and choose the variable to use from it.

For further information about variables, see Talend Studio User Guide.
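As an illustration, After variables are typically read from the Job's globalMap once the component has finished, for example in a tJava component placed after the subjob. The sketch below simulates globalMap with a plain HashMap so that it runs on its own. The component label tMDMInput_1 and the key pattern <component>_<VARIABLE> are assumptions based on the usual Studio naming and should be checked against the variable list shown by Ctrl + Space.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-alone simulation; in a real Job, globalMap is provided by the generated code.
final class GlobalVariablesSketch {

    // Builds a short status line from the After variables of one component.
    // Assumption: keys follow the pattern <component>_<VARIABLE>.
    static String summary(Map<String, Object> globalMap, String componentName) {
        Integer nbLine = (Integer) globalMap.get(componentName + "_NB_LINE");
        String error = (String) globalMap.get(componentName + "_ERROR_MESSAGE");
        return error == null ? nbLine + " rows read" : "failed: " + error;
    }

    public static void main(String[] args) {
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("tMDMInput_1_NB_LINE", 3); // hypothetical value
        System.out.println(summary(globalMap, "tMDMInput_1"));
    }
}
```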

Usage

Use this component as a start component. It needs an output flow.

Log4j

If you are using a subscription-based version of the Studio, the activity of this component can be logged using the log4j feature. For more information on this feature, see Talend Studio User Guide.

For more information on the log4j logging levels, see the Apache documentation at http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/Level.html.

Scenario: Reading master data in an MDM hub

This scenario describes a two-component Job that reads master data on an MDM server. The master data is fetched and displayed in the log console.

  • From the Palette, drop tMDMInput and tLogRow onto the design workspace.

  • Connect the two components together using a Row Main link.

  • Double-click tMDMInput to open the Basic settings view and define the component properties.

  • In the Property Type list, select Built-In to complete the fields manually. If you have stored the MDM connection information in the repository metadata, select Repository from the list and the fields will be completed automatically.

  • In the Schema list, select Built-In and click the three-dot button next to Edit schema to open a dialog box. Here you can define the structure of the master data you want to read on the MDM server.

  • The master data is collected in a three-column schema of the type String: ISO2Code, Name and Currency. Click OK to close the dialog box and proceed to the next step.

  • In the URL field, enter between inverted commas the URL of the MDM server.

  • In the Username and Password fields, enter your login and password to connect to the MDM server.

  • In the Entity field, enter between inverted commas the name of the business entity that holds the master data you want to read.

  • In the Data Container field, enter between inverted commas the name of the data container that holds the master data you want to read.

  • In the Component view, click Advanced settings to set the advanced parameters.

  • In the Loop XPath query field, enter between inverted commas the structure and the name of the XML node on which the loop is to be carried out.

  • In the Mapping table and in the XPath query column, enter between inverted commas the name of the XML tag in which you want to collect the master data, next to the corresponding output column name.

  • In the design workspace, click on the tLogRow component to display the Basic settings in the Component view and set the properties.

  • Click on Edit Schema and ensure that the schema has been collected from the previous component. If not, click Sync Columns to fetch the schema from the previous component.

  • Save the Job and press F6 to run it.

The list of different countries along with their codes and currencies is displayed on the console of the Run view.