What's new in R2021-08

Big Data: new features

Feature

Description

Available in

Support of Azure Synapse distribution with Spark 3.0

You can now use Azure Synapse Analytics with Apache Spark pools as a new distribution for your Spark Batch and Spark Streaming Jobs with Spark 3.0, in YARN cluster mode only. You can configure it in the Spark Configuration view of your Spark Jobs. For more information, see Defining the Azure Synapse Analytics connection parameters.

Azure Synapse Analytics lets you process your data with various analytics engines. Apache Spark pools provide compute capabilities (such as speed and efficiency) and compatibility with ADLS Gen2 storage.

Important: This is a beta feature and is not suitable for production environments.

All Talend products with Big Data

Advanced expressions for tMap join conditions
You can now use the Filter expression area in the Map Editor of the tMap component to enter an advanced expression for your join conditions in Spark Batch Jobs that use the Dataset API. This feature allows you to filter your data in the join condition with the following expressions:
  • simple expression with <, >, <=, >= or == operators
  • complex expression combining several simple expressions with || or && operators
For example, you can use a complex expression on dates:

All Talend products with Big Data

Data Integration: new features

Feature

Description

Available in

Support of removing old versions of multiple project items

A Cleanup view has been added under the General > Version Management node in the Project Settings dialog box, which allows you to:

  • remove all old versions and keep only the latest one for multiple project items, or
  • remove all old versions lower than a specific version for multiple project items.
Warning: There is no dependency check when removing old versions of Jobs, Joblets, Routes, and Routelets. We recommend that you perform the removal and validate the cleanup on a branch.

All Talend products with Talend Studio

tCouchbaseInput: support for N1QL for Analytics statement

The tCouchbaseInput component now provides the N1QL for Analytics option in the Query type drop-down list. This option allows you to query semistructured data using N1QL for Analytics statements.

All Talend products with Talend Studio

tSQSOutput: retrieving and passing message IDs to the subsequent components

The tSQSOutput component can now retrieve the IDs of the received messages and pass the message IDs to the subsequent components. This is achieved by adding the MessageID column in the output schema. This feature is available only when the Use batch mode option is cleared.

All Talend products with Talend Studio

tAzureStoragePut: the use of the + character

The tAzureStoragePut component now provides the Allow to escape the '+' sign in filemask option, which, when selected, treats the + character as a normal character in the Files field. When this option is not selected, the + character is treated as a regular expression operator.
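
The difference between the two behaviours can be illustrated with a generic regular expression sketch. This is not the component's implementation: the helper matches_filemask, its escape_plus parameter, and the file names are hypothetical, and the sketch assumes the filemask is matched as a plain regular expression.

```python
import re


def matches_filemask(filemask: str, filename: str, escape_plus: bool) -> bool:
    """Return True if filename fully matches filemask interpreted as a regex.

    Hypothetical helper: escape_plus=True mimics treating '+' as a normal
    character; escape_plus=False leaves '+' as the "one or more" quantifier.
    """
    pattern = filemask.replace("+", r"\+") if escape_plus else filemask
    return re.fullmatch(pattern, filename) is not None


mask = "sales+eu.csv"  # hypothetical filemask containing a '+'

# '+' as a regex operator: "s+" means one or more 's', so the literal
# name "sales+eu.csv" does not match, but "salesseu.csv" does.
plus_as_operator = (
    matches_filemask(mask, "sales+eu.csv", escape_plus=False),
    matches_filemask(mask, "salesseu.csv", escape_plus=False),
)

# '+' as a normal character: only the literal name matches.
plus_as_literal = (
    matches_filemask(mask, "sales+eu.csv", escape_plus=True),
    matches_filemask(mask, "salesseu.csv", escape_plus=True),
)
```

Under these assumptions, a file whose name contains a literal + is only picked up when the + is treated as a normal character.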

All Talend products with Talend Studio

tSAPTableInput: NUMC data mapped as string

The tSAPTableInput component now provides the Read NUMC data as string in the dynamic column option, which, when selected, treats data of the NUMC type in the dynamic column as strings. When this option is not selected, data of the NUMC type is treated as integers.
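
As a rough illustration of why the mapping can matter, the following sketch shows what happens to a zero-padded numeric text value when it is read as an integer rather than as a string. It is not the component's code, and the sample value is hypothetical; it assumes the NUMC value carries leading zeros.

```python
# Hypothetical NUMC value: digits only, padded with leading zeros.
numc_value = "0000012345"

# Read as an integer: the numeric value is kept, the padding is lost.
as_integer = int(numc_value)

# Read as a string: the value is kept exactly as stored.
as_string = numc_value

print(as_integer)  # 12345
print(as_string)   # 0000012345
```

If downstream components compare or join on the padded form, reading the value as a string keeps it unchanged.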

All Talend products with Talend Studio

MongoDB components: support for the SCRAM-SHA-256 SASL authentication mechanism for MongoDB 4.4.x and later versions

The MongoDB components now support the SCRAM-SHA-256 SASL authentication mechanism when MongoDB 4.4.x or a later version is used. See Authentication mechanisms for related information.

All Talend products with Big Data

Specifying the SAP connection to be used

The tSAPTableInput and tELTSAPMap components now provide the Connection id field, through which you can specify the ID of the SAP connection to be used. The SAP connection ID is the name of the SAP connection configuration file.

This field requires that the SAP RFC server patch Patch_20210820_TDI-45536_v1-7.3.1 be applied. The patch allows you to establish multiple SAP connections using an SAP RFC server. For information about configuring SAP RFC server and SAP connections, see Configuring the RFC server.

All Talend products with Talend Studio

tSAPDataSourceReceiver and tSAPIDocReceiver: the capability of processing partner host information

The tSAPDataSourceReceiver and tSAPIDocReceiver components now provide the capability of processing partner host information. The information identifies the SAP connection from which a message is received.

  • The tSAPDataSourceReceiver component provides a new pre-defined column named partnerHost in its schema.
  • The tSAPIDocReceiver component inserts partner host information into the headers of the messages it extracts.

This feature requires that the SAP RFC server patch Patch_20210820_TDI-45536_v1-7.3.1 be applied. The patch allows you to establish multiple SAP connections using an SAP RFC server. For information about configuring SAP RFC server and SAP connections, see Configuring the RFC server.

All Talend products with Talend Studio

New components

This release provides the following new components: tFileInputParquet and tFileOutputParquet.

All Talend products with Talend Studio
