tHMapRecord properties for Apache Spark Streaming - Cloud - 8.0

Data mapping

Version: Cloud 8.0
Language: English
Product: Talend Big Data Platform, Talend Data Fabric, Talend Data Management Platform, Talend Data Services Platform, Talend MDM Platform, Talend Real-Time Big Data Platform
Module: Talend Studio
Last publication date: 2024-02-29

These properties are used to configure tHMapRecord running in the Spark Streaming Job framework.

The Spark Streaming tHMapRecord component belongs to the Processing family.

The component in this framework is available in Talend Real-Time Big Data Platform and in Talend Data Fabric.

Basic settings

Storage

To connect to an HDFS installation, select the Define a storage configuration component check box and then select the name of the component to use from those available in the drop-down list.

This option requires you to have previously configured the connection to the HDFS installation to be used, as described in the documentation for the tHDFSConfiguration component.

If you leave the Define a storage configuration component check box unselected, you can only convert files locally.

Open Map Editor

Click the [...] button to open the Structure Generate/Select wizard.

You can first select the type of map to create:
  • Standard Map: A map that performs mappings using functions based on XQuery.
  • DSQL Map: A map that performs mappings using Data Shaping Query Language.
You can select the Don't ask me again check box to save this preference. For more information about these map types, see Working with maps.
Note: This option is available only if you have installed the R2023-10 Studio monthly update or a later one delivered by Talend. For more information, check with your administrator.

Then you can either have the hierarchical mapper structure generated automatically based on the schema, or select an existing hierarchical mapper structure. You must do this for both the input and output sides of your map. The options for the output structure are as follows:

  • Generate hierarchical mapper structure based on the schema option: When you connect multiple output connections to the tHMap, the page displays a confirmation message that informs you that the mapper structures are generated based on the output connections.
  • Select an existing hierarchical mapper structure option: You can connect multiple outputs that are payload-based connections to the tHMap. If there is a single payload-type connection, you can select the Allow support for multiple output connections check box. The generated output map inherits from the existing payload structure.

If Talend Studio detects multiple output connections, the window displays both output structure options, without the Allow support for multiple output connections check boxes.

If neither an input nor an output connection exists, the Structure Selection page is displayed.

Synchronize map with schema connections

Select this check box if you want to automatically regenerate your map's input and output structures after one of the following changes:
  • Connection metadata change
  • Input or output connection added
  • Input or output connection removed
No changes are detected when a connection is activated or deactivated.
If this check box is selected, the map is automatically synchronized when opened from the component after a change. If not, a dialog appears to ask whether you want to synchronize.
Note: For structures with multiple connections, the map can only be synchronized if the structures have the same form as the ones generated by the component configuration wizard. For example, flattening maps with multiple outputs cannot be synchronized automatically.
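
The synchronization rules above can be summarized in a short sketch. It is an illustrative model only, not Talend Studio code: the ChangeKind enum, the DETECTED_CHANGES set, and the on_map_open function are hypothetical names introduced for this example, and the limitation for structures with multiple connections is not modeled.

```python
from enum import Enum, auto


class ChangeKind(Enum):
    """Hypothetical labels for the connection changes described above."""
    METADATA_CHANGED = auto()
    CONNECTION_ADDED = auto()
    CONNECTION_REMOVED = auto()
    CONNECTION_ACTIVATED = auto()
    CONNECTION_DEACTIVATED = auto()


# Changes listed in the text as triggering a regeneration of the structures.
DETECTED_CHANGES = {
    ChangeKind.METADATA_CHANGED,
    ChangeKind.CONNECTION_ADDED,
    ChangeKind.CONNECTION_REMOVED,
}


def on_map_open(change, synchronize_checked):
    """Return what happens when the map is opened after `change`.

    "none"        -> the change is not detected (activation or deactivation)
    "synchronize" -> the structures are regenerated without asking
    "ask"         -> a dialog asks whether to synchronize
    """
    if change not in DETECTED_CHANGES:
        return "none"
    return "synchronize" if synchronize_checked else "ask"
```

For example, under these assumptions on_map_open(ChangeKind.CONNECTION_ADDED, True) yields "synchronize", while the same change with the check box cleared yields "ask".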

Map Path

Click [...] to select an existing map. The window displays a wizard that allows you to select a map from the Hierarchical Mapper view of Talend Data Mapper.

Die on error

Select the check box to stop the execution of the Job when an error occurs.

Clear the check box to skip any error and continue the Job execution process.
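
As a rough illustration of the two behaviors, the following sketch applies a mapping function that may fail to a list of records. It is not the code that Talend Studio generates: apply_map, to_upper_name, and the die_on_error parameter are hypothetical names, plain Python lists stand in for the streaming records, and the sketch assumes that skipping an error means dropping the record that caused it.

```python
def apply_map(records, map_fn, die_on_error):
    """Apply map_fn to each record.

    die_on_error=True  -> the first failure is re-raised and processing stops
    die_on_error=False -> the failing record is skipped and processing continues
    """
    mapped = []
    for record in records:
        try:
            mapped.append(map_fn(record))
        except Exception:
            if die_on_error:
                raise
            # Check box cleared (assumed behavior): skip the failing record.
            continue
    return mapped


def to_upper_name(record):
    # Hypothetical mapping: raises KeyError when "name" is missing.
    return {"name": record["name"].upper()}


rows = [{"name": "ada"}, {"id": 2}, {"name": "grace"}]
print(apply_map(rows, to_upper_name, die_on_error=False))
```

With die_on_error=True, the same call raises the KeyError from the second record instead of returning a result.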

Usage

Usage rule

This component is used with a tHDFSConfiguration component, which defines the connection to the HDFS storage, or as a standalone component for mapping local files only.

Drag-and-drop feature

If you have an existing tHMapRecord map in the Data Mapping view, you can drag and drop it:
  • When you drag and drop the tHMapRecord map into the design workspace, the tHMapRecord component is automatically created.
  • When you drag and drop the tHMapRecord map onto an existing tHMapRecord component, the label and the map reference of the component are automatically updated.

Usage with Talend Runtime

If you want to deploy a Job or Route containing a data mapping component with Talend Runtime, you first need to install the Talend Data Mapper feature. For more information, see Using Talend Data Mapper with Talend Runtime.