
What's new in R2021-03

Big Data: new features

Feature

Description

Available in

Support of Databricks 7.3 LTS with Spark 3.0 components
You can now run Spark Batch and Spark Streaming Jobs with Spark 3.0 on the Databricks 7.3 LTS distribution, on both AWS and Azure, for interactive and transient clusters. The following components are supported:
  • tAvroInput and tAvroOutput
  • tAzureFSConfiguration
  • tFileInputDelimited and tFileOutputDelimited
  • tFileInputJSON and tFileOutputJSON
  • tFileInputParquet and tFileOutputParquet
  • tFileInputXML and tFileOutputXML
  • tFixedFlowInput
  • tLogRow
  • tS3Configuration

This feature is now generally available.

All Talend products with Big Data

Support of CDP Public Cloud Data Hub on AWS
You can now configure a CDP Public Cloud Data Hub instance on AWS in Cloudera Management Console to run your Job on a remote JobServer in Talend Studio. This lets you configure your cluster by directly choosing a Data Hub cluster definition that suits your needs (for example, Data Engineering for AWS or Data Discovery and Exploration for AWS). You then only have to import the configuration files from that cluster into Talend Studio to launch your Job.

This feature allows you to take advantage of the elasticity of a cloud cluster directly in CDP Public Cloud.

All Talend products with Big Data

Support of Service Account and OAuth2 Access Token authentication for Google Cloud Platform distribution in Spark Batch Jobs
You can now authenticate to Google Cloud Platform with either a Service Account or an OAuth2 Access Token in Spark Batch Jobs that use Dataproc 1.4. These authentication methods are available in the Spark configuration view of your Spark Batch Job.

All Talend products with Big Data

Update of tCollectAndCheck component in Spark Jobs
You can now directly check data input with tCollectAndCheck in Spark Jobs. The following types of input are supported:
  • Text
  • Parquet
  • MySQL
  • Hive
  • Delta
  • Snowflake
  • Redshift
  • JDBC
  • HBase
For Spark Batch Jobs, the component is now connected as follows:

In Spark Batch Jobs, the component checks that the input contains exactly the expected number of rows and that the row values are correct.

For Spark Streaming Jobs, the component checks the data at the end of the Job execution, after a timeout, as follows:

In Spark Streaming Jobs, the component checks that the values are correct. The input is allowed to be null, and the rows do not have to be unique.
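The difference between the two checks can be sketched in plain Java. This is only one possible reading of the description above, not the component's implementation: the class `CollectAndCheckSketch`, its method names, the use of strings as rows, and the decision to ignore row order in the batch check are all assumptions.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;

// Hypothetical illustration; not part of Talend Studio.
final class CollectAndCheckSketch {

    /** Batch-style check: exactly the expected number of rows, with the same values.
     *  Assumption: row order is ignored and rows are non-null strings. */
    static boolean checkBatch(List<String> expected, List<String> actual) {
        if (actual == null || expected.size() != actual.size()) {
            return false;
        }
        List<String> sortedExpected = new ArrayList<>(expected);
        List<String> sortedActual = new ArrayList<>(actual);
        Collections.sort(sortedExpected);
        Collections.sort(sortedActual);
        return sortedExpected.equals(sortedActual);
    }

    /** Streaming-style check: a null input is accepted and duplicate rows are accepted,
     *  but every row that does arrive must be one of the expected values. */
    static boolean checkStreaming(List<String> expected, List<String> actual) {
        if (actual == null) {
            return true;
        }
        return new HashSet<>(expected).containsAll(actual);
    }
}
```

Under these assumptions, a micro-batch that delivers the same row twice passes the streaming check but fails the batch check, because the row count no longer matches.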

All Talend products with Big Data

Data Integration: new features

Feature

Description

Available in

Enhancement of code dependency management
Talend Studio now allows you to create custom routine jars, package multiple user routines into a custom routine jar, and set up custom routine jar dependencies on Jobs and Joblets.

Setting up custom routine jar dependencies on Jobs and Joblets makes their code dependencies more explicit, which can help reduce dependency conflicts.

Note: By default, user routines migrated from any previous version of Talend Studio are all saved under the new Code > Global Routines node.
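For readers unfamiliar with user routines, a routine is a Java class that exposes static helper methods for use in Jobs. The sketch below shows the kind of class that could be packaged into a custom routine jar. The class name `OrderRoutines` and both methods are hypothetical, and the package declaration and `public` modifier that a routine has in Studio are left out so that the snippet compiles on its own.

```java
import java.util.Locale;

// Hypothetical user routine; in Studio the class would be public and carry a package declaration.
final class OrderRoutines {

    /** Trims and upper-cases a code, for example "  ab-12 " becomes "AB-12"; null stays null. */
    public static String normalizeCode(String raw) {
        if (raw == null) {
            return null;
        }
        return raw.trim().toUpperCase(Locale.ROOT);
    }

    /** Returns the fallback when the value is null or contains only whitespace. */
    public static String defaultIfBlank(String value, String fallback) {
        return (value == null || value.trim().isEmpty()) ? fallback : value;
    }
}
```

A Job that declares a dependency on the jar containing this class could then call, for example, `OrderRoutines.normalizeCode(row1.code)` in a component expression, where `row1.code` is an illustrative column reference.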

All Talend products with Talend Studio

tELTOracleMap enhancement
The ELT Map editor of the tELTOracleMap component now provides a new Property Settings dialog box, which contains two options:
  • Delimited identifiers: With this check box selected, double quotes are added around all output column names to support delimited identifiers.
  • Automatic alias: With this check box selected, if a schema column has a different name than its database column, an alias is automatically created for that column in the SQL query.
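
To make the effect of the two options concrete, here is a minimal Java sketch that builds a SELECT statement in the way the option descriptions suggest. It is an illustration only, not the component's actual SQL generator: the class `EltSelectSketch`, the `Column` type, and the exact layout of the generated SQL are assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; not the code generated by tELTOracleMap.
final class EltSelectSketch {

    /** A schema column and the database column it maps to (hypothetical type). */
    static final class Column {
        final String schemaName;
        final String dbName;

        Column(String schemaName, String dbName) {
            this.schemaName = schemaName;
            this.dbName = dbName;
        }
    }

    /** Wraps the identifier in double quotes when delimited identifiers are enabled. */
    static String quote(String identifier, boolean delimited) {
        return delimited ? "\"" + identifier + "\"" : identifier;
    }

    /** Builds a SELECT statement, adding an alias only when the two names differ. */
    static String buildSelect(String table, List<Column> columns,
                              boolean delimitedIdentifiers, boolean automaticAlias) {
        List<String> items = new ArrayList<>();
        for (Column column : columns) {
            String item = table + "." + quote(column.dbName, delimitedIdentifiers);
            if (automaticAlias && !column.schemaName.equals(column.dbName)) {
                item += " AS " + quote(column.schemaName, delimitedIdentifiers);
            }
            items.add(item);
        }
        return "SELECT " + String.join(", ", items) + " FROM " + table;
    }

    public static void main(String[] args) {
        List<Column> columns = Arrays.asList(
                new Column("ID", "ID"),
                new Column("fullName", "FULL_NAME"));
        // Prints: SELECT CUSTOMERS."ID", CUSTOMERS."FULL_NAME" AS "fullName" FROM CUSTOMERS
        System.out.println(buildSelect("CUSTOMERS", columns, true, true));
    }
}
```

With both options disabled, the same call yields `SELECT CUSTOMERS.ID, CUSTOMERS.FULL_NAME FROM CUSTOMERS`. The table and column names are made up for the example.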

All Talend products with Talend Studio

Enhancement in Git conflict resolution for tELTMap
Talend Studio now supports comparing conflicts for tELTMap in the Job compare editor.

All Talend products with Talend Studio

Google Drive components: new Read Timeout option
The Read Timeout option has been added to the Advanced settings view.
This option is available for tGoogleDriveConnection, tGoogleDriveCopy, tGoogleDriveCreate, tGoogleDriveDelete, tGoogleDriveGet, tGoogleDriveList and tGoogleDrivePut.

All Talend products except Talend ESB

Support of MongoDB 4.4 API for MongoDB and CosmosDB components
You can now connect MongoDB and CosmosDB components to MongoDB 4.4.

All Talend products with Big Data

Performance enhancement for MongoDB and CosmosDB components in Standard Jobs
MongoDB and CosmosDB components now provide the following options in Standard Jobs:
  • For input components, the Specify fields to return option lets you define the set of document fields to be returned from the database.
  • For output components, the Delete all documents option lets you delete all the documents in the target collection before performing the action on data.

All Talend products with Big Data

New SingleStore components

The following three new SingleStore components are now available. They provide better performance for loading data to database tables.

  • tSingleStoreBulkExec
  • tSingleStoreOutputBulk
  • tSingleStoreOutputBulkExec

All Talend products with Talend Studio

New option for granting AWS-predefined permissions to S3 resources

A new option, Canned Access Control, is now provided in the following components for you to grant AWS-predefined permissions to S3 resources:
  • tS3BucketCreate
  • tS3Copy
  • tS3Put

All Talend products with Talend Studio

Data Mapper: new features

Feature

Description

Available in

EDIFACT importer
You can now import UN/EDIFACT specifications as ZIP files to create structures. Talend Data Mapper supports specifications starting from release D.96A.

All Talend Platform and Data Fabric products

XPath functions in Distinct Child Element property
In the SimpleLoop function, if you select Element XPath in the Distinct Option field, you can now use an XPath function to define distinct values.
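
As a rough illustration of what defining distinct values with an XPath function means, the Java sketch below evaluates an XPath expression against each looped element and collects the distinct results in encounter order. It uses the standard `javax.xml.xpath` API rather than Talend Data Mapper itself; the class name, the sample XML, and the expression `substring(@code, 1, 2)` are assumptions chosen for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical illustration; not Talend Data Mapper code.
final class DistinctByXPathSketch {

    /** Evaluates keyExpression relative to each node matched by loopPath and
     *  returns the distinct results in the order they were first seen. */
    static List<String> distinctKeys(String xml, String loopPath, String keyExpression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList items = (NodeList) xpath.evaluate(
                    loopPath, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            Set<String> seen = new LinkedHashSet<>();
            for (int i = 0; i < items.getLength(); i++) {
                // The XPath function is applied to the current element of the loop.
                seen.add(xpath.evaluate(keyExpression, items.item(i)));
            }
            return new ArrayList<>(seen);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Invalid XPath expression", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<items><item code=\"FR-01\"/><item code=\"FR-02\"/><item code=\"DE-01\"/></items>";
        // Prints: [FR, DE]
        System.out.println(distinctKeys(xml, "/items/item", "substring(@code, 1, 2)"));
    }
}
```

Here the three `item` elements collapse to two distinct values because the function keeps only the first two characters of each `code` attribute.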

All Talend Platform and Data Fabric products

Application Integration: new features

Feature

Description

Available in

Enhancement of code dependency management
Talend Studio now allows you to create custom Bean jars and custom routine jars, bundle multiple Beans or user routines in a custom Bean jar or routine jar, and set up custom Bean jar or routine jar dependencies on Routes and Routelets.
Setting up custom Bean jar or routine jar dependencies on Routes and Routelets makes their code dependencies more explicit, which can help reduce dependency conflicts.

All Talend products with ESB
