Creating a connection to an ADLS Databricks cluster - Cloud - 8.0

Talend Studio User Guide

Version: Cloud 8.0
Language: English
Product: Talend Big Data, Talend Big Data Platform, Talend Cloud, Talend Data Fabric, Talend Data Integration, Talend Data Management Platform, Talend Data Services Platform, Talend ESB, Talend MDM Platform, Talend Real-Time Big Data Platform
Module: Talend Studio
Content: Design and Development
Last publication date: 2024-02-29
Available in...

Big Data Platform

Cloud API Services Platform

Cloud Big Data Platform

Cloud Data Fabric

Cloud Data Management Platform

Data Fabric

Data Management Platform

Data Services Platform

MDM Platform

Real-Time Big Data Platform

Before you begin

About this task

To connect to a Databricks cluster on Amazon S3, follow the procedure Adding S3 specific properties to access the S3 system from Databricks instead.

Procedure

  1. In the DQ Repository tree view, expand Metadata and right-click DB Connections.
  2. Click Create DB connection.
    The Database Connection wizard is displayed.
  3. Enter a name and click Next. The other fields are optional.
  4. Select JDBC as the DB Type.
  5. In the JDBC URL field, enter the URL of your ADLS Databricks cluster. To get the URL:
    1. Go to Azure Databricks.
    2. In the clusters list, click the cluster to which you want to connect.
    3. Expand the Advanced Options section and select the JDBC/ODBC tab.
    4. Copy the contents of the JDBC URL field. The URL format is jdbc:spark://<server-hostname>:<port>/default;transportMode=http;ssl=1;httpPath=<http-path>;AuthMech=3.
      Note: For safer handling of the token, it is recommended to enter the UID and PWD parameters in the Database Connection wizard of Talend Studio rather than leaving them in the URL. A connectivity check using this URL format is sketched after the procedure.
  6. Go back to the Database Connection wizard.
  7. Paste the JDBC URL.
  8. Add the JDBC driver to the Drivers list:
    1. Click the [+] button. A new line is added to the list.
    2. Click the […] button next to the new line. The Module dialog box is displayed.
    3. In the Platform list, select the JDBC driver and click OK. You return to the Database Connection wizard.
  9. Click Select class name next to the Driver Class field and select com.simba.spark.jdbc4.Driver.
  10. Enter the User Id and Password.
  11. In the Mapping file list, select Mapping Hive.
  12. Click Test Connection.
    • If the test is successful, click Finish to close the wizard.
    • If the test fails, verify the configuration.
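
The following is a minimal, hypothetical Java sketch of how the same connection could be verified outside the wizard with the Simba Spark JDBC driver selected in step 9. The server hostname, port, HTTP path, and token are placeholders, and the assumption that UID is set to token for token-based authentication is not stated on this page; adapt the values to your own cluster.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.Properties;

public class DatabricksJdbcCheck {
    public static void main(String[] args) throws Exception {
        // Driver class selected in step 9 of the procedure.
        Class.forName("com.simba.spark.jdbc4.Driver");

        // URL format copied from the cluster's JDBC/ODBC tab (step 5).
        // <server-hostname>, <port>, and <http-path> are placeholders.
        String url = "jdbc:spark://<server-hostname>:<port>/default;"
                + "transportMode=http;ssl=1;httpPath=<http-path>;AuthMech=3";

        // As recommended in the note of step 5, pass UID and PWD separately
        // instead of embedding them in the URL. With token-based
        // authentication, UID is typically "token" and PWD is the personal
        // access token (assumption, not stated on this page).
        Properties props = new Properties();
        props.put("UID", "token");
        props.put("PWD", "<personal-access-token>");

        try (Connection conn = DriverManager.getConnection(url, props);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT 1")) {
            if (rs.next()) {
                System.out.println("Connection OK: " + rs.getInt(1));
            }
        }
    }
}
```

Passing UID and PWD through a Properties object mirrors the recommendation to enter these parameters in the Database Connection wizard rather than leaving them in the URL.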