What's new in R2021-02 - 7.3

Talend Big Data products Release Notes

Version: 7.3
Language: English (United States)
Product: Talend Big Data, Talend Big Data Platform, Talend Open Studio for Big Data, Talend Real-Time Big Data Platform
Content: Installation and Upgrade, Release Notes

Big Data: new features

Feature

Description

Product

Support of Spark 3.0 in local mode for Spark Jobs

Talend now supports Spark 3.0 in local mode when running Spark Jobs in Talend Studio.
Note: The following elements do not support Spark 3.0 in local mode:
  • ADLS Gen2
  • tCassandraInput and tCassandraOutput
  • tElasticSearchInput and tElasticSearchOutput

Talend Big Data

Talend Big Data Platform

Talend Real-Time Big Data Platform

Support of Databricks 7.3 LTS with Spark 3.0 components (technical preview)
As a technical preview, you can now run Spark Batch and Spark Streaming Jobs with Spark 3.0 on the Databricks 7.3 LTS distribution, on both AWS and Azure, for interactive and transient clusters. The following components are supported:
  • tAvroInput and tAvroOutput
  • tAzureFSConfiguration
  • tFileInputDelimited and tFileOutputDelimited
  • tFileInputJSON and tFileOutputJSON
  • tFileInputParquet and tFileOutputParquet
  • tFileInputXML and tFileOutputXML
  • tFixedFlowInput
  • tLogRow
  • tS3Configuration
Important: This feature is a technical preview only and is not suitable for production environments.

Talend Big Data

Talend Big Data Platform

Talend Real-Time Big Data Platform

New options available for transient Databricks clusters

You can now fine-tune your configuration when you create a transient Databricks cluster from the Spark configuration view of your Spark Job. The following properties are now available:
  • Enable credentials passthrough
  • Spot with fall back to On-demand
  • Availability zone
  • Max spot price
  • EBS volume type
  • Custom tags
  • Init scripts
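
To give a sense of what these properties control, the sketch below builds a cluster specification in the shape used by the Databricks Clusters REST API on AWS. The mapping from each Studio property to an API field is an assumption made for illustration, as are the helper name `build_transient_cluster_spec` and every sample value; these release notes do not describe how Talend Studio passes the settings to Databricks.

```python
import json


def build_transient_cluster_spec():
    """Hypothetical helper: a sample cluster spec with the new options set."""
    return {
        "spark_version": "7.3.x-scala2.12",  # Databricks 7.3 LTS runtime
        "node_type_id": "i3.xlarge",         # sample value
        "num_workers": 2,                    # sample value
        # Assumed counterpart of "Enable credentials passthrough"
        "spark_conf": {"spark.databricks.passthrough.enabled": "true"},
        "aws_attributes": {
            # Assumed counterpart of "Spot with fall back to On-demand"
            "availability": "SPOT_WITH_FALLBACK",
            # Assumed counterpart of "Availability zone"
            "zone_id": "us-west-2a",
            # Assumed counterpart of "Max spot price" (percent of on-demand price)
            "spot_bid_price_percent": 100,
            # Assumed counterpart of "EBS volume type"
            "ebs_volume_type": "GENERAL_PURPOSE_SSD",
            "ebs_volume_count": 1,
            "ebs_volume_size": 100,
        },
        # Assumed counterpart of "Custom tags"
        "custom_tags": {"team": "data-eng"},
        # Assumed counterpart of "Init scripts"
        "init_scripts": [{"dbfs": {"destination": "dbfs:/init/install-libs.sh"}}],
    }


spec = build_transient_cluster_spec()
print(json.dumps(spec, indent=2))
```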

Talend Big Data

Talend Big Data Platform

Talend Real-Time Big Data Platform

Inherit credentials from AWS role option available for DynamoDB components in Spark Batch Jobs

The following DynamoDB components can now obtain AWS security credentials from Amazon EC2 instance metadata through the new Inherit credentials from AWS role option:
  • tDynamoDBInput
  • tDynamoDBOutput
  • tDynamoDBConfiguration

With this option, you do not need to specify an access key or secret key in Talend Studio.

Talend Big Data

Talend Big Data Platform

Talend Real-Time Big Data Platform

Data Integration: new features

Feature

Description

Product

Further enhancement to library sharing

Talend Studio now lets you configure whether component libraries are shared to the local libraries repository at startup, using the Share libraries to artifact repository at startup check box in the Talend > Artifact Repository > Libraries view of the Preferences dialog box.

Talend Big Data

Talend Big Data Platform

Talend Real-Time Big Data Platform

Support for Databricks Delta Lake mapping

Support for Databricks Delta Lake mapping is provided by the following components:

  • tELTInput, tELTOutput, tELTMap
  • tSQLTemplate, tSQLTemplateMerge, tSQLTemplateAggregate, tSQLTemplateCommit, tSQLTemplateRollback, tSQLTemplateFilterRows, tSQLTemplateFilterColumns

Talend Big Data

Talend Big Data Platform

Talend Real-Time Big Data Platform

New options for Update and Delete operations provided

The Use WHERE conditions table option and the Where conditions table field are now available in the Basic settings view for Update and Delete operations. This change improves productivity. Components involved:

  • tELTGreenplumOutput, tELTMSSqlOutput, tELTMysqlOutput, tELTNetezzaOutput, tELTOracleOutput, tELTOutput, tELTPostgresqlOutput, tELTSybaseOutput, tELTTeradataOutput, tELTVerticaOutput
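
As a loose illustration of the idea behind a WHERE conditions table, the sketch below turns rows of (column, operator, value) into a parameterized DELETE statement and runs it against an in-memory SQLite database. The row layout, the helper name `build_delete`, and the sample table are all assumptions; the release notes do not describe the columns of the Where conditions table or the SQL that the components generate.

```python
import sqlite3


def build_delete(table, conditions):
    """Hypothetical helper: build a parameterized DELETE from condition rows.

    Each row is (column, operator, value). Rows are combined with AND.
    """
    allowed_ops = {"=", "<>", "<", "<=", ">", ">="}
    if not conditions:
        raise ValueError("at least one WHERE condition is required")
    clauses, params = [], []
    for column, op, value in conditions:
        if op not in allowed_ops:
            raise ValueError(f"unsupported operator: {op}")
        clauses.append(f"{column} {op} ?")
        params.append(value)
    return f"DELETE FROM {table} WHERE " + " AND ".join(clauses), params


# Sample data, invented for this example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, status TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "open", 10.0), (2, "cancelled", 5.0), (3, "cancelled", 50.0)],
)

sql, params = build_delete(
    "orders", [("status", "=", "cancelled"), ("amount", "<", 20)]
)
conn.execute(sql, params)
remaining = [row[0] for row in conn.execute("SELECT id FROM orders ORDER BY id")]
print(sql)
print(remaining)
```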

Talend Big Data

Talend Big Data Platform

Talend Real-Time Big Data Platform

tRedshiftBulkExec: new file type supported

The tRedshiftBulkExec component can load data stored in Apache Parquet files.
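
For context, Amazon Redshift loads Parquet data through its COPY command with the FORMAT AS PARQUET clause. The sketch below only assembles such a statement as a string. The helper name `build_parquet_copy` and the table, bucket, and role values are placeholders, and whether tRedshiftBulkExec issues exactly this statement is an assumption.

```python
def build_parquet_copy(table, s3_path, iam_role_arn):
    """Hypothetical helper: build a Redshift COPY statement for Parquet files."""
    return (
        f"COPY {table}\n"
        f"FROM '{s3_path}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        f"FORMAT AS PARQUET;"
    )


# Placeholder values, not taken from the release notes.
statement = build_parquet_copy(
    "sales",
    "s3://example-bucket/sales/",
    "arn:aws:iam::123456789012:role/ExampleRedshiftRole",
)
print(statement)
```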

Talend Big Data

Talend Big Data Platform

Talend Real-Time Big Data Platform

tFileOutputExcel: new option provided for Excel2007 files

The tFileOutputExcel component now provides the Truncate characters exceeding max cell length option, which prevents failures that occur when a string written to an Excel2007 cell exceeds the maximum length allowed (32,767 characters).
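
The behavior can be pictured with a minimal sketch that cuts any string longer than the 32,767-character limit before it is written. The function name `truncate_for_excel` is hypothetical, and this is not the component's actual implementation.

```python
EXCEL_MAX_CELL_CHARS = 32767  # limit stated in the description above


def truncate_for_excel(value, limit=EXCEL_MAX_CELL_CHARS):
    """Return value unchanged unless it is a string longer than limit."""
    if isinstance(value, str) and len(value) > limit:
        return value[:limit]
    return value


long_text = "x" * 40000
print(len(truncate_for_excel(long_text)))
```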

Talend Big Data

Talend Big Data Platform

Talend Real-Time Big Data Platform

tChangeFileEncoding: buffer size customizable

The tChangeFileEncoding component provides the Buffer Size field, allowing you to specify the buffer size for changing the file encoding.
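
To show what a buffer size means in this context, the sketch below re-encodes a file by reading and writing it in chunks of a configurable size. The function name `change_file_encoding`, its parameters, and the default of 8192 are assumptions for illustration and do not reflect the component's generated code. Here the size is counted in characters, which may differ from the unit the component uses.

```python
def change_file_encoding(src, dst, src_encoding, dst_encoding, buffer_size=8192):
    """Copy src to dst, converting the encoding buffer_size characters at a time."""
    # newline="" keeps line endings exactly as they appear in the source file.
    with open(src, "r", encoding=src_encoding, newline="") as reader, \
         open(dst, "w", encoding=dst_encoding, newline="") as writer:
        while True:
            chunk = reader.read(buffer_size)
            if not chunk:
                break
            writer.write(chunk)
```

A larger buffer means fewer read and write calls at the cost of more memory per call; a smaller one does the opposite.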

Talend Big Data

Talend Big Data Platform

Talend Real-Time Big Data Platform

Safety Switch option available for tSalesforceBulkExec and tSalesforceOutputBulkExec

The Safety Switch option is now provided for the tSalesforceBulkExec and tSalesforceOutputBulkExec components to prevent excessive memory usage. When the database contains columns that are longer than 100000 characters, do not use this option.

Talend Big Data

Talend Big Data Platform

Talend Real-Time Big Data Platform

Data Mapper: new features

Feature

Description

Product

New options for decimal elements

In the CSV, Flat, JSON, Map and XML representation properties, two new options have been added to handle decimal elements and to fix an issue related to implied decimals:
  • The Enforce zero scale on output decimals? option allows you to remove fractional digits when the Decimal Places property is set to 0.
  • The Decimal sign is implied on output option allows you to remove the decimal sign in the output.
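
The effect of the two options can be sketched as follows. This is an interpretation of the descriptions above and not Data Mapper's actual logic: the function name `format_decimal`, its parameters, and the choice of half-up rounding are all assumptions.

```python
from decimal import Decimal, ROUND_HALF_UP


def format_decimal(value, decimal_places, enforce_zero_scale=False,
                   implied_sign=False):
    """Hypothetical formatter mimicking the two output options."""
    value = Decimal(value)
    if decimal_places == 0:
        if enforce_zero_scale:
            # Drop fractional digits (rounding mode is an assumption).
            value = value.quantize(Decimal("1"), rounding=ROUND_HALF_UP)
        return format(value, "f")
    # Fix the number of fractional digits, e.g. 2 places -> quantum 0.01.
    quantum = Decimal(1).scaleb(-decimal_places)
    text = format(value.quantize(quantum, rounding=ROUND_HALF_UP), "f")
    # With an implied decimal sign, the separator is omitted from the output.
    return text.replace(".", "") if implied_sign else text


print(format_decimal("42.00", 0, enforce_zero_scale=True))  # 42
print(format_decimal("123.4", 2, implied_sign=True))        # 12340
```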

Talend Big Data Platform

Talend Real-Time Big Data Platform

Data Quality: new features

Feature

Description

Product

Support of Spark 3.0 in local mode

Spark components support Apache Spark 3.0 in local mode, except for tMatchIndex, tMatchIndexPredict, tNLPModel, tNLPPredict and tNLPPreprocessing.

Talend Big Data Platform

Talend Real-Time Big Data Platform