Big Data: new features
Feature | Description | Available in |
---|---|---|
Lightweight dependencies for CDH 6.x | When you run a Job on a CDH 6.x distribution, you can reduce the time the Job takes to launch by selecting the Use lightweight dependencies check box on the Spark configuration tab of the Run view. This option reduces the libraries to the Talend libraries only, which can prevent dependency issues such as a missing signature, a wrong JAR version, or a missing JAR. With this option, you can also use a classpath other than the Cloudera default one by selecting the Use custom classpath check box and entering the JARs you want to use in regex syntax, separated by commas. | All subscription-based Talend products with Big Data (Big Data, Big Data Platform, Cloud Big Data, Cloud Big Data Platform, Cloud Data Fabric, Data Fabric, Real-Time Big Data Platform) |
Customizing precision in the schema for output components | You can now select a precision different from the standard one for the BigDecimal type when you update the output schema for the following components: | All subscription-based Talend products with Big Data (Big Data, Big Data Platform, Cloud Big Data, Cloud Big Data Platform, Cloud Data Fabric, Data Fabric, Real-Time Big Data Platform) |
tS3Configuration: setting the name of the DynamoDB table in EMRFS | When you use the EMRFS consistent view option, you can enter the name of the metadata DynamoDB table to be used. | All subscription-based Talend products with Big Data (Big Data, Big Data Platform, Cloud Big Data, Cloud Big Data Platform, Cloud Data Fabric, Data Fabric, Real-Time Big Data Platform) |
tDeltaLakeInput and tDeltaLakeOutput: new path available to store the data | You can specify an external path on a filesystem other than DBFS (ADLS Gen2 or S3) to store the data. | All subscription-based Talend products with Big Data (Big Data, Big Data Platform, Cloud Big Data, Cloud Big Data Platform, Cloud Data Fabric, Data Fabric, Real-Time Big Data Platform) |
tDeltaLakeOutput: new operations provided in the Action property | You can drop a table, which deletes and recreates the table. You can also truncate a table, which deletes the data while the schema remains. | All subscription-based Talend products with Big Data (Big Data, Big Data Platform, Cloud Big Data, Cloud Big Data Platform, Cloud Data Fabric, Data Fabric, Real-Time Big Data Platform) |
tDeltaLakeOutput: new optimize property provided in the Basic settings view | You can optimize the layout of Delta Lake data on Databricks. | All subscription-based Talend products with Big Data (Big Data, Big Data Platform, Cloud Big Data, Cloud Big Data Platform, Cloud Data Fabric, Data Fabric, Real-Time Big Data Platform) |
Using a Hadoop configuration file with Spark Batch and Spark Streaming Jobs | You can connect Spark Batch and Spark Streaming Jobs to a Hadoop cluster in the Repository using a configuration JAR file. You specify the path to this file either in the Spark Configuration of the Job or in the Hadoop cluster configuration. This option is available for both Yarn cluster and Yarn client on non-Cloud distributions. Optionally, you can contextualize this connection parameter to automatically connect to the right cluster based on the environment in which you run the Job. | All subscription-based Talend products with Big Data (Big Data, Big Data Platform, Cloud Big Data, Cloud Big Data Platform, Cloud Data Fabric, Data Fabric, Real-Time Big Data Platform) |
Support for High Availability for EMR 5.23 or later | High availability is supported when you run Talend Jobs with the Amazon EMR distribution in version 5.23 or later. You can now have multiple master nodes in your cluster. | All subscription-based Talend products with Big Data (Big Data, Big Data Platform, Cloud Big Data, Cloud Big Data Platform, Cloud Data Fabric, Data Fabric, Real-Time Big Data Platform) |
Data Integration: new features
Feature | Description | Available in |
---|---|---|
tDataprepRun enhancement | The tDataprepRun component now shows an error message when you create a new preparation with a dynamic schema. | All subscription-based Talend products except Talend ESB (Big Data, Big Data Platform, Cloud API Services Platform, Cloud Big Data, Cloud Big Data Platform, Cloud Data Fabric, Cloud Data Integration, Cloud Data Management Platform, Data Fabric, Data Integration, Data Management Platform, Data Services Platform, MDM Platform, Real-Time Big Data Platform) |
tELTMap enhancement | In the ELT Map Editor of the tELTMap component, you can now enter a multi-line expression for an output column in a new pop-up dialog box. In this dialog box, pressing Ctrl+Space gives you access to proposals that include input columns, output columns, and context variables. | All subscription-based Talend products with Talend Studio (Big Data, Big Data Platform, Cloud API Services Platform, Cloud Big Data, Cloud Big Data Platform, Cloud Data Fabric, Cloud Data Integration, Cloud Data Management Platform, Data Fabric, Data Integration, Data Management Platform, Data Services Platform, ESB, MDM Platform, Real-Time Big Data Platform) |
POM generation enhancement | A new option, Exclude deleted items, is available for generating the POM file for a Maven build. With this option selected, the modules for the deleted items are excluded from the POM file of the current project, and the source for the deleted test cases is not generated. Note: You need to resynchronize the POM file to apply the new settings of this option. | All subscription-based Talend products with Talend Studio (Big Data, Big Data Platform, Cloud API Services Platform, Cloud Big Data, Cloud Big Data Platform, Cloud Data Fabric, Cloud Data Integration, Cloud Data Management Platform, Data Fabric, Data Integration, Data Management Platform, Data Services Platform, ESB, MDM Platform, Real-Time Big Data Platform) |
Talend type mapping enhancement | You can now set the default pattern for each date type in the Talend type mapping file. This allows the date pattern of date type columns to be set automatically when you retrieve or guess the schema from a table. | All subscription-based Talend products with Talend Studio (Big Data, Big Data Platform, Cloud API Services Platform, Cloud Big Data, Cloud Big Data Platform, Cloud Data Fabric, Cloud Data Integration, Cloud Data Management Platform, Data Fabric, Data Integration, Data Management Platform, Data Services Platform, ESB, MDM Platform, Real-Time Big Data Platform) |
tSSH library upgraded | The Ganymed library is now deprecated, and the tSSH component now supports a new library: Apache mina-sshd. | All subscription-based Talend products with Talend Studio (Big Data, Big Data Platform, Cloud API Services Platform, Cloud Big Data, Cloud Big Data Platform, Cloud Data Fabric, Cloud Data Integration, Cloud Data Management Platform, Data Fabric, Data Integration, Data Management Platform, Data Services Platform, ESB, MDM Platform, Real-Time Big Data Platform) |
tSalesforceInput: new query mode provided | The tSalesforceInput component provides the BulkV2 query mode, which allows you to query larger amounts of data. The component also provides the Split query results into small sets option for the BulkV2 mode, allowing you to split query results into sets of a specific size. | All subscription-based Talend products with Talend Studio (Big Data, Big Data Platform, Cloud API Services Platform, Cloud Big Data, Cloud Big Data Platform, Cloud Data Fabric, Cloud Data Integration, Cloud Data Management Platform, Data Fabric, Data Integration, Data Management Platform, Data Services Platform, ESB, MDM Platform, Real-Time Big Data Platform) |
Formatting dates using the date pattern defined in the schema | The tSnowflakeOutput and tSnowflakeOutputBulkExec components provide the Use schema date pattern option, allowing you to format dates using the date pattern defined in the schema. | All subscription-based Talend products with Talend Studio (Big Data, Big Data Platform, Cloud API Services Platform, Cloud Big Data, Cloud Big Data Platform, Cloud Data Fabric, Cloud Data Integration, Cloud Data Management Platform, Data Fabric, Data Integration, Data Management Platform, Data Services Platform, ESB, MDM Platform, Real-Time Big Data Platform) |
tSnowflakeInput: option renamed and improved | The Allow snowflake to convert columns and tables to uppercase option is renamed to Use unquoted object identifiers, and its behavior is improved. | All subscription-based Talend products with Talend Studio (Big Data, Big Data Platform, Cloud API Services Platform, Cloud Big Data, Cloud Big Data Platform, Cloud Data Fabric, Cloud Data Integration, Cloud Data Management Platform, Data Fabric, Data Integration, Data Management Platform, Data Services Platform, ESB, MDM Platform, Real-Time Big Data Platform) |
tFTPRename: table column name fixed | The tFTPRename component supports only filenames in the Files field, and the Filemask column is renamed to Filename. | All subscription-based Talend products with Talend Studio (Big Data, Big Data Platform, Cloud API Services Platform, Cloud Big Data, Cloud Big Data Platform, Cloud Data Fabric, Cloud Data Integration, Cloud Data Management Platform, Data Fabric, Data Integration, Data Management Platform, Data Services Platform, ESB, MDM Platform, Real-Time Big Data Platform) |
tS3Connection: path-style access supported | The tS3Connection component provides support for path-style access. | All subscription-based Talend products with Talend Studio (Big Data, Big Data Platform, Cloud API Services Platform, Cloud Big Data, Cloud Big Data Platform, Cloud Data Fabric, Cloud Data Integration, Cloud Data Management Platform, Data Fabric, Data Integration, Data Management Platform, Data Services Platform, ESB, MDM Platform, Real-Time Big Data Platform) |
tMongoDBOutput: action on data fields customizable | For the Upsert with set action, you can specify whether a field can be updated or inserted. | All subscription-based Talend products with Talend Studio (Big Data, Big Data Platform, Cloud API Services Platform, Cloud Big Data, Cloud Big Data Platform, Cloud Data Fabric, Cloud Data Integration, Cloud Data Management Platform, Data Fabric, Data Integration, Data Management Platform, Data Services Platform, ESB, MDM Platform, Real-Time Big Data Platform) |
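
For readers unfamiliar with the path-style access listed for tS3Connection in the table above, the sketch below contrasts the two standard Amazon S3 URL forms for the same object. It only illustrates the addressing difference; how the component builds its requests is not described in these notes, and the bucket, region, and key are hypothetical values.

```shell
#!/bin/sh
# Build the two standard Amazon S3 URL forms for the same object.
# Arguments for both functions: bucket, region, object key.

# Virtual-hosted style: the bucket name is part of the host name.
virtual_hosted_url() {
  printf 'https://%s.s3.%s.amazonaws.com/%s\n' "$1" "$2" "$3"
}

# Path style: the bucket name is the first segment of the path.
path_style_url() {
  printf 'https://s3.%s.amazonaws.com/%s/%s\n' "$2" "$1" "$3"
}

# Hypothetical bucket, region, and key, used only for illustration.
virtual_hosted_url example-bucket us-east-1 data/input.csv
path_style_url example-bucket us-east-1 data/input.csv
```

Path-style addressing is typically what S3-compatible storage services that do not resolve per-bucket host names expect.
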
Data Quality: new features
Feature | Description | Available in |
---|---|---|
Phone number standardization | Phone numbers can now be validated for a given region: The Google libphonenumber library has also been updated to the most recent version. | All Talend Platform and Data Fabric products (Big Data Platform, Cloud API Services Platform, Cloud Big Data Platform, Cloud Data Fabric, Cloud Data Management Platform, Data Fabric, Data Management Platform, Data Services Platform, MDM Platform, Real-Time Big Data Platform) |
Application Integration: new features
Feature | Description | Available in |
---|---|---|
Microservices | Camel metrics are now exposed to Prometheus in Microservices to monitor the execution of Routes, JVM memory, CPU consumption, and so on. | All subscription-based Talend products with ESB (Cloud API Services Platform, Cloud Data Fabric, Data Fabric, Data Services Platform, ESB, MDM Platform, Real-Time Big Data Platform) |
Continuous Integration: new features
Feature | Description | Available in |
---|---|---|
POM file generation - new parameter | The mvn org.talend.ci:builder-maven-plugin:7.3.3:generateAllPoms command allows you to regenerate all .pom files of a project before building it. It is also useful when you want to test a new version of the product before migrating to it. | All subscription-based Talend products with Talend Studio (Big Data, Big Data Platform, Cloud API Services Platform, Cloud Big Data, Cloud Big Data Platform, Cloud Data Fabric, Cloud Data Integration, Cloud Data Management Platform, Data Fabric, Data Integration, Data Management Platform, Data Services Platform, ESB, MDM Platform, Real-Time Big Data Platform) |
Custom script - new parameter | The mvn org.talend.ci:builder-maven-plugin:7.3.3:executeScript command allows you to write your own script with CommandLine commands and execute it at build time. | All subscription-based Talend products with Talend Studio (Big Data, Big Data Platform, Cloud API Services Platform, Cloud Big Data, Cloud Big Data Platform, Cloud Data Fabric, Cloud Data Integration, Cloud Data Management Platform, Data Fabric, Data Integration, Data Management Platform, Data Services Platform, ESB, MDM Platform, Real-Time Big Data Platform) |
Camel metrics exposure to Prometheus - new parameter | You can now use the -Dstudio.prometheus.metrics=true parameter when publishing ESB artifacts to Docker to expose Camel metrics to Prometheus and get more details about the deployed Routes. | All subscription-based Talend products with ESB (Cloud API Services Platform, Cloud Data Fabric, Data Fabric, Data Services Platform, ESB, MDM Platform, Real-Time Big Data Platform) |
Debug mode - new parameter | You can now use the -Dstudio.talendDebug=true parameter to get additional logs. This parameter can be useful when you debug build issues with the support team. | All subscription-based Talend products with Talend Studio (Big Data, Big Data Platform, Cloud API Services Platform, Cloud Big Data, Cloud Big Data Platform, Cloud Data Fabric, Cloud Data Integration, Cloud Data Management Platform, Data Fabric, Data Integration, Data Management Platform, Data Services Platform, ESB, MDM Platform, Real-Time Big Data Platform) |
Build improvement | Depending on your Talend Studio project settings, all project items that are located in the recycle bin can now be excluded from the Continuous Integration build. | All subscription-based Talend products with Talend Studio (Big Data, Big Data Platform, Cloud API Services Platform, Cloud Big Data, Cloud Big Data Platform, Cloud Data Fabric, Cloud Data Integration, Cloud Data Management Platform, Data Fabric, Data Integration, Data Management Platform, Data Services Platform, ESB, MDM Platform, Real-Time Big Data Platform) |
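
As a rough illustration of the Continuous Integration parameters in the table above, the sketch below assembles the generateAllPoms command, with and without the debug parameter, and prints it instead of running it. Combining -Dstudio.talendDebug=true with generateAllPoms is an assumption made for this example, and the other parameters a real build needs for your environment are deliberately left out.

```shell
#!/bin/sh
# Dry-run sketch: prints the Maven commands rather than executing them.
# Plugin coordinates as given in the release notes.
PLUGIN="org.talend.ci:builder-maven-plugin:7.3.3"

# Print the command that regenerates all .pom files of a project.
generate_all_poms_cmd() {
  printf 'mvn %s:generateAllPoms\n' "$PLUGIN"
}

# Print the same command with the debug parameter for additional logs.
# Assumption: the parameter is appended to the Maven command line.
generate_all_poms_debug_cmd() {
  printf 'mvn %s:generateAllPoms -Dstudio.talendDebug=true\n' "$PLUGIN"
}

generate_all_poms_cmd
generate_all_poms_debug_cmd
```

Printing the command first makes it easy to review the exact invocation in a CI log before letting the pipeline execute it.
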