Complete the Databricks connection configuration in the Spark
configuration tab of the Run view of your Job.
This configuration is effective on a per-Job basis.
Before you begin
Ensure that only one Job runs on a given Databricks cluster at a
time, and do not submit another Job before the current Job finishes
running. Because each run automatically restarts the cluster, Jobs
launched in parallel interrupt each other and cause execution
failures.
Enter the basic connection information for Databricks.
In the Endpoint
field, enter the URL address of your Azure Databricks workspace.
You can find this URL in the Overview blade
of your Databricks workspace page in the Azure portal. For example,
this URL could look like https://westeurope.azuredatabricks.net.
In the Cluster ID
field, enter the ID of the Databricks cluster to be used. This ID is
the value of the spark.databricks.clusterUsageTags.clusterId
property of your Spark cluster. You can find this property in the
properties list on the Environment tab in the
Spark UI view of your cluster.
You can also easily find this ID in
the URL of your Databricks cluster: it appears immediately after
cluster/ in that URL.
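If you want to automate this lookup outside the Studio, the following minimal Python sketch (an illustration, not part of Talend Studio) extracts the cluster ID from a cluster URL, relying on the fact above that the ID follows cluster/; the URL in the example is made up.

# Minimal sketch (illustration only): extract the Databricks cluster ID
# from a cluster URL, using the fact that the ID follows "cluster/".
def cluster_id_from_url(url):
    after = url.split("cluster/", 1)[1]  # text following "cluster/"
    return after.split("/", 1)[0]        # the ID ends at the next slash, if any

# Hypothetical example URL:
print(cluster_id_from_url(
    "https://westeurope.azuredatabricks.net/#/setting/clusters/cluster/0123-456789-example"
))  # prints 0123-456789-example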
Click the [...] button
next to the Token field to enter the
authentication token generated for your Databricks user account. You
can generate or find this token on the User
settings page of your Databricks workspace. For
further information, see Token management in the Azure Databricks
documentation.
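Once you have the endpoint, cluster ID, and token, you can sanity-check them together before running the Job. The sketch below is one way to do so, assuming the Databricks REST API 2.0 clusters/get endpoint and the Python requests library; all placeholder values are hypothetical.

# Minimal sketch, assuming the Databricks REST API 2.0 and the "requests"
# library: verify that the endpoint, cluster ID, and token work together.
# All placeholder values below are hypothetical.
import requests

ENDPOINT = "https://westeurope.azuredatabricks.net"  # your workspace URL
CLUSTER_ID = "0123-456789-example"                   # your cluster ID
TOKEN = "dapiXXXXXXXXXXXXXXXX"                       # your personal access token

resp = requests.get(
    f"{ENDPOINT}/api/2.0/clusters/get",
    headers={"Authorization": f"Bearer {TOKEN}"},
    params={"cluster_id": CLUSTER_ID},
)
resp.raise_for_status()
print(resp.json().get("state"))  # e.g. RUNNING or TERMINATED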
In the DBFS dependencies
folder field, enter the directory used to
store your Job-related dependencies on the Databricks Filesystem at
runtime, ending this directory with a slash (/). For
example, enter /jars/ to store the dependencies
in a folder named jars. This folder is created
on the fly if it does not already exist.
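The Studio creates this folder on the fly, but if you prefer to create or inspect it yourself beforehand, a minimal sketch along these lines works, again assuming the Databricks REST API 2.0 DBFS endpoints and the requests library; the placeholders are hypothetical.

# Minimal sketch, assuming the Databricks REST API 2.0 DBFS endpoints:
# create the dependencies folder ahead of time and list its contents.
# ENDPOINT and TOKEN are hypothetical placeholders.
import requests

ENDPOINT = "https://westeurope.azuredatabricks.net"
TOKEN = "dapiXXXXXXXXXXXXXXXX"
headers = {"Authorization": f"Bearer {TOKEN}"}

# Create /jars (mkdirs succeeds even if the folder already exists).
requests.post(f"{ENDPOINT}/api/2.0/dbfs/mkdirs",
              headers=headers, json={"path": "/jars"}).raise_for_status()

# List the folder to confirm where the dependencies land at runtime.
resp = requests.get(f"{ENDPOINT}/api/2.0/dbfs/list",
                    headers=headers, params={"path": "/jars"})
resp.raise_for_status()
for f in resp.json().get("files", []):
    print(f["path"])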
If you need the Job to be resilient to failure, select the Activate checkpointing check box to enable the
Spark checkpointing operation. In the field that is displayed, enter the
directory in which Spark stores, in the file system of the cluster, the context
data of the computations, such as the metadata and the generated RDDs of this
computation.
For further information about the Spark checkpointing operation, see http://spark.apache.org/docs/latest/streaming-programming-guide.html#checkpointing.
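As a rough illustration of the mechanism this check box enables, the following standalone PySpark sketch sets a checkpoint directory and checkpoints an RDD; the directory name is a hypothetical example, not the value you enter in the Studio field.

# Minimal PySpark sketch of Spark checkpointing: set a checkpoint
# directory, then checkpoint an RDD so its data and lineage metadata are
# persisted there. The directory name is a hypothetical example.
from pyspark import SparkContext

sc = SparkContext(appName="checkpointing-sketch")
sc.setCheckpointDir("/tmp/spark-checkpoints")  # directory Spark writes to

rdd = sc.parallelize(range(1000)).map(lambda x: x * x)
rdd.checkpoint()             # mark the RDD for checkpointing
print(rdd.count())           # an action triggers the actual checkpoint
print(rdd.isCheckpointed())  # True once the checkpoint is written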