tMap MapReduce properties - 6.5

tMap

author
Talend Documentation Team
EnrichVersion
6.5
EnrichProdName
Talend Big Data
Talend Big Data Platform
Talend Data Fabric
Talend Data Integration
Talend Data Management Platform
Talend Data Services Platform
Talend ESB
Talend MDM Platform
Talend Open Studio for Big Data
Talend Open Studio for Data Integration
Talend Open Studio for ESB
Talend Open Studio for MDM
Talend Real-Time Big Data Platform
EnrichPlatform
Talend Studio

These properties are used to configure tMap running in the MapReduce Job framework.

The MapReduce tMap component belongs to the Processing family.

The component in this framework is available in all subscription-based Talend products with Big Data and Talend Data Fabric.

Basic settings

Map editor

It allows you to define the tMap routing and transformation properties.

Note: If you do not want to handle execution errors, click the Property Settings button at the top of the input area and make sure that the Die on error check box (selected by default) is selected in the [Property Settings] dialog box. With this check box selected, the Job is killed as soon as an error occurs.
Note: To maximize the data transformation performance in a Job that handles multiple lookup input flows with large amounts of data, you can select the Lookup in parallel check box in the [Property Settings] dialog box.

However, in a Map/Reduce Job, only one expression key is allowed per mapping component. If you need to use multiple expression keys to join different input tables, use multiple tMap components one after another.
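The chaining described above can be pictured as two successive single-key joins. The following is only a minimal plain-Java sketch of that idea, not the code that the Studio generates; the class name, the joinOnKey helper and the sample data are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ChainedJoinSketch {

    // One join stage with a single key (hypothetical helper): the key is read from
    // column keyIndex of each row and the matching lookup value is appended as a
    // new last column. Rows without a match are dropped (inner join).
    static List<String[]> joinOnKey(List<String[]> rows, int keyIndex, Map<String, String> lookup) {
        List<String[]> out = new ArrayList<>();
        for (String[] row : rows) {
            String match = lookup.get(row[keyIndex]);
            if (match == null) {
                continue;
            }
            String[] joined = Arrays.copyOf(row, row.length + 1);
            joined[row.length] = match;
            out.add(joined);
        }
        return out;
    }

    public static void main(String[] args) {
        // Sample data, invented for illustration: orderId, customerId, countryCode.
        List<String[]> orders = new ArrayList<>();
        orders.add(new String[] {"o1", "c1", "FR"});
        orders.add(new String[] {"o2", "c2", "DE"});

        Map<String, String> customers = new HashMap<>();
        customers.put("c1", "Alice");
        customers.put("c2", "Bob");

        Map<String, String> countries = new HashMap<>();
        countries.put("FR", "France");
        countries.put("DE", "Germany");

        // The first stage joins on customerId only, the second on countryCode only.
        List<String[]> stage1 = joinOnKey(orders, 1, customers);
        List<String[]> stage2 = joinOnKey(stage1, 2, countries);

        for (String[] row : stage2) {
            System.out.println(String.join(",", row));
        }
    }
}
```

Each call to joinOnKey plays the role of one mapping component with one expression key; the output of the first stage becomes the main input of the second.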

Mapping links display as

Auto: the default setting, which displays the mapping links as curves.

Curves: the mapping links display as curves.

Lines: the mapping links display as straight lines. This option slightly improves performance.

Temp data directory path

Enter the path where you want to store the temporary data generated for lookup loading. For more information on this folder, see Talend Studio User Guide.

Preview

The preview is a snapshot of the Mapper data. It becomes available once the Mapper properties have been filled in with data. The preview is synchronized only after you save your changes.

Use replicated join

Select this check box to perform a replicated join between the input flows. By replicating each lookup table into memory, this type of join doesn't require an additional shuffle-and-sort step, thus speeding up the whole process.

You need to ensure that every lookup table fits entirely in memory.
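The principle of a replicated join can be sketched in plain Java: the lookup table is held in an in-memory hash map, and each row of the main flow is matched with a single probe. This is an illustration of the general technique under assumed names (ReplicatedJoinSketch, join) and invented sample data, not the MapReduce code that the component generates.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ReplicatedJoinSketch {

    // Joins main rows (key, value) against a lookup table held entirely in memory.
    // Each main row needs only one hash probe, so the main flow does not have to
    // be shuffled and sorted by key before the join.
    static List<String> join(List<String[]> mainRows, Map<String, String> lookup) {
        List<String> out = new ArrayList<>();
        for (String[] row : mainRows) {
            String match = lookup.get(row[0]);
            if (match != null) { // inner join: rows without a lookup match are dropped
                out.add(row[0] + "," + row[1] + "," + match);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // The whole lookup table must fit in this map.
        Map<String, String> lookup = new HashMap<>();
        lookup.put("FR", "France");
        lookup.put("DE", "Germany");

        List<String[]> mainRows = new ArrayList<>();
        mainRows.add(new String[] {"FR", "42"});
        mainRows.add(new String[] {"US", "7"});
        mainRows.add(new String[] {"DE", "13"});

        for (String line : join(mainRows, lookup)) {
            System.out.println(line);
        }
    }
}
```

The trade-off is visible in the sketch: the join is fast because the lookup map is local to every task, but that same map is the part that has to fit in memory.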

Advanced settings

Max buffer size (nb of rows)

Type in the size of physical memory, in number of rows, you want to allocate to processed data.

Ignore trailing zeros for BigDecimal

Select this check box to ignore trailing zeros for BigDecimal data.
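Trailing zeros matter because Java's BigDecimal.equals compares the scale as well as the numeric value. The short sketch below shows that standard Java behavior; the normalize helper is hypothetical, and the sketch does not claim to reproduce how the component applies the option internally.

```java
import java.math.BigDecimal;

public class TrailingZerosSketch {

    // Hypothetical helper: removes trailing zeros so that 1.50 and 1.5
    // end up with the same scale.
    static BigDecimal normalize(BigDecimal value) {
        return value.stripTrailingZeros();
    }

    public static void main(String[] args) {
        BigDecimal a = new BigDecimal("1.50");
        BigDecimal b = new BigDecimal("1.5");

        System.out.println(a.equals(b));                       // false: the scales differ (2 and 1)
        System.out.println(a.compareTo(b) == 0);               // true: same numeric value
        System.out.println(normalize(a).equals(normalize(b))); // true once trailing zeros are stripped
    }
}
```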

Global Variables

ERROR_MESSAGE: the error message generated by the component when an error occurs. This is an After variable and it returns a string. This variable functions only if the Die on error check box is cleared, if the component has this check box.

A Flow variable functions during the execution of a component while an After variable functions after the execution of the component.

To fill in a field or expression with a variable, press Ctrl + Space to open the variable list and choose the variable to use.
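As a rough illustration, reading an After variable in Java code could look like the sketch below. It assumes that the variable is stored in a map named globalMap under a key built from the component label followed by _ERROR_MESSAGE; the label tMap_1, the errorMessageOf helper, the fallback text and the sample message are all hypothetical, and the plain HashMap only stands in for whatever the Job provides at run time.

```java
import java.util.HashMap;
import java.util.Map;

public class ErrorMessageSketch {

    // Hypothetical helper: reads the After variable of a component and returns
    // a fallback text when no error message was recorded for it.
    static String errorMessageOf(Map<String, Object> globalMap, String componentName) {
        Object message = globalMap.get(componentName + "_ERROR_MESSAGE");
        return message != null ? (String) message : "no error recorded";
    }

    public static void main(String[] args) {
        // Stand-in for the Job's global variable map (assumption).
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("tMap_1_ERROR_MESSAGE", "sample error text"); // invented value

        System.out.println(errorMessageOf(globalMap, "tMap_1"));
        System.out.println(errorMessageOf(globalMap, "tMap_2"));
    }
}
```

Because ERROR_MESSAGE is an After variable, such a lookup is only meaningful in a part of the Job that runs after the component has finished.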

For further information about variables, see Talend Studio User Guide.

Usage

Usage rule

In a Talend Map/Reduce Job, this component is used as an intermediate step and other components used along with it must be Map/Reduce components, too. They generate native Map/Reduce code that can be executed directly in Hadoop.

As explained earlier, if you need to use multiple expression keys to join different input tables, use multiple tMap components one after another.

For further information about a Talend Map/Reduce Job, see the sections describing how to create, convert and configure a Talend Map/Reduce Job in the Talend Open Studio for Big Data Getting Started Guide.

Note that in this documentation, unless otherwise explicitly stated, a scenario presents only Standard Jobs, that is to say traditional Talend data integration Jobs, and not Map/Reduce Jobs.