Feature | Description | Available in |
---|---|---|
Support of Spark Universal | You can now run your Spark Jobs using Spark Universal with Spark 2.4.x or Spark 3.0.x, in either Local or Yarn cluster mode. Spark Universal is a mechanism that makes Talend Studio compatible with every big data distribution available for a given Spark version, using only a Hadoop configuration JAR file that contains all the information needed to connect to the cluster in Yarn cluster mode. Spark Universal gives you more agility by letting you switch between Spark modes, distributions, or environments. You can configure your Spark Universal connection either in the Spark configuration view of your Job or in the Hadoop Cluster Connection metadata wizard from the Repository tree view. | All subscription-based Talend products with Big Data (Big Data, Big Data Platform, Cloud Big Data, Cloud Big Data Platform, Cloud Data Fabric, Data Fabric, Real-Time Big Data Platform) |
Support of Kubernetes with Spark Universal 3.1.x | You can now run your Spark Jobs using Spark Universal with Spark 3.1.x in Kubernetes mode. You can configure your Spark Universal connection with Kubernetes either in the Spark configuration view of your Job or in the Hadoop Cluster Connection metadata wizard from the Repository tree view. | All subscription-based Talend products with Big Data |
Support of Dynamic Schema in Spark Batch components | You can now use Dynamic Schema in your Spark Jobs with the following components: | All subscription-based Talend products with Big Data |
Support of new distributions (delivered in 7.3 monthly releases) | You can use the following distributions for your Spark Jobs: | All subscription-based Talend products with Big Data |
Support of Spark 3.0 in local mode for Spark Jobs (delivered in 7.3 R2021-02 monthly release) | Talend now supports Spark 3.0 in local mode when running Spark Jobs in Talend Studio. Note: The following elements do not support Spark 3.0 in local mode: | All subscription-based Talend products with Big Data |
Support of Knox for CDP Public Cloud Data Hub on AWS (delivered in 7.3 R2021-06 monthly release) | When you use a CDP Public Cloud Data Hub instance on AWS with CDP 7.1 and onwards in YARN cluster and HDFS modes, you can now authenticate using Knox, either in the Spark configuration view of your Spark Jobs or in the Hadoop Cluster Connection metadata wizard from the Repository tree view. Knox provides a single point of authentication through SSO. | All subscription-based Talend products with Big Data |
Support of Hive Warehouse Connector with Cloudera CDP 7.1.x (delivered in 7.3 R2021-10 monthly release) | You can now use the Hive Warehouse Connector to read data from and write data to Hive transactional managed tables in Spark Batch Jobs with the following new components: With Hive Warehouse Connector, Talend Studio supports Hive transactional managed tables, which gives you better transaction control over your data. | All subscription-based Talend products with Big Data |