About Databricks clusters - 7.3

The information in this section applies only to users of Talend Data Fabric or of any subscription-based Talend product with Big Data. It does not apply to Talend Open Studio for Big Data users. It also applies only to users who run their Spark Jobs on Databricks distributions, on either Azure or AWS.
A Databricks cluster is a set of computation resources and configurations on which you can run your Spark Streaming and Spark Batch Jobs. In Talend Studio, you can run your Spark Job on either an interactive cluster or a transient cluster.
Note: By default, Spark Jobs run on an interactive cluster. You can change the cluster type in the Spark configuration tab in the Run view of your Spark Job. For more information, see Defining the Azure Databricks connection parameters for Spark Jobs.

When you run a Job on an interactive cluster in Talend Studio, you can run virtually any workload. Interactive clusters are created for an indefinite duration, but you can manually terminate and restart them if needed. Multiple users can share such a cluster for collaborative and interactive analytics.

When you run a Job on a transient cluster in Talend Studio, the Job is processed faster and the cluster automatically shuts down when processing is finished, which lowers the cost of usage. Transient clusters are created according to your Spark configuration, and you cannot restart them once they have shut down.
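
The difference between the two cluster types can also be seen in how a run is submitted to the Databricks Jobs API (version 2.0, `runs/submit`). A run on an interactive cluster refers to an existing cluster by ID, while a run on a transient cluster carries a `new_cluster` specification that Databricks creates for the run and terminates afterwards. The following Python sketch only builds the two request bodies and sends nothing. It is an illustration under stated assumptions, not a description of what Talend Studio generates internally: the helper name `build_run_payload`, the cluster ID, the JAR path, the main class, and the node type are all hypothetical placeholders.

```python
def build_run_payload(run_name, jar_uri, main_class,
                      cluster_id=None, new_cluster=None):
    """Build a request body for the Databricks Jobs API 2.0 `runs/submit` call.

    Hypothetical helper for illustration only.
    Pass `cluster_id` for an interactive (existing) cluster, or
    `new_cluster` for a transient cluster created just for this run.
    """
    # Exactly one of the two cluster options must be given.
    if (cluster_id is None) == (new_cluster is None):
        raise ValueError("Specify exactly one of cluster_id or new_cluster")

    payload = {
        "run_name": run_name,
        "libraries": [{"jar": jar_uri}],
        "spark_jar_task": {"main_class_name": main_class},
    }
    if cluster_id is not None:
        # Interactive cluster: reuse a cluster that already exists.
        payload["existing_cluster_id"] = cluster_id
    else:
        # Transient cluster: created for the run, terminated when it ends.
        payload["new_cluster"] = new_cluster
    return payload


# Placeholder values, assumed for the example only.
interactive = build_run_payload(
    "demo-interactive", "dbfs:/jobs/demo.jar", "demo.Main",
    cluster_id="0000-000000-example0",
)

transient = build_run_payload(
    "demo-transient", "dbfs:/jobs/demo.jar", "demo.Main",
    new_cluster={
        "spark_version": "7.3.x-scala2.12",
        "node_type_id": "Standard_DS3_v2",  # an Azure node type; differs on AWS
        "num_workers": 2,
    },
)
```

Keeping the two options mutually exclusive in the helper mirrors the choice described above: a Job runs either on a shared, long-lived cluster or on one created for that Job alone.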