Creating a connection to an ADLS Databricks cluster - Cloud

Talend Cloud Data Management Platform Studio User Guide

Before you begin

  • You have selected the Profiling perspective of Talend Studio.
  • You have added the JDBC driver to the Studio.

Procedure

  1. In the DQ Repository tree view, expand Metadata and right-click DB Connections.
  2. Click Create DB connection.
    The Database Connection wizard is displayed.
  3. Enter a name and click Next. The other fields are optional.
  4. Select JDBC as the DB Type.
  5. In the JDBC URL field, enter the URL of your ADLS Databricks cluster. To get the URL:
    1. Go to Azure Databricks.
    2. In the clusters list, click the cluster you want to connect to.
    3. Expand the Advanced Options section and select the JDBC/ODBC tab.
    4. Copy the content of the JDBC URL field. The URL format is jdbc:spark://<server-hostname>:<port>/default;transportMode=http;ssl=1;httpPath=<http-path>;AuthMech=3.
      Note: Do not append the UID and PWD parameters to the URL. Enter them in the Database Connection wizard of Talend Studio instead, which is the recommended way to keep the token encrypted.
  6. Go back to the Database Connection wizard.
  7. Paste the JDBC URL.
  8. Add the JDBC driver to the Drivers list:
    1. Click the [+] button. A new line is added to the list.
    2. Click the […] button next to the new line. The Module dialog box is displayed.
    3. In the Platform list, select the JDBC driver and click OK. The Database Connection wizard is displayed again.
  9. Click Select class name next to the Driver Class field and select com.simba.spark.jdbc4.Driver.
  10. Enter the User Id and Password.
  11. In Mapping file, select Mapping Hive.
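    The wizard now shows JDBC as the DB Type, the pasted JDBC URL, com.simba.spark.jdbc4.Driver as the Driver Class, and Mapping Hive as the Mapping file.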
  12. Click Test Connection.
    • If the test is successful, click Finish to close the wizard.
    • If the test fails, verify the configuration.
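
If the test fails, it can help to check the same settings outside the wizard. The following is a minimal sketch, not part of Talend Studio, that assembles the URL in the format shown in step 5 and opens a plain JDBC connection with the driver class from step 9. The class name DatabricksJdbcSketch, its method names, the DATABRICKS_* environment variables, and the sample host and HTTP path are all hypothetical. The sketch also assumes that the driver accepts UID and PWD as connection properties, mirroring the URL parameter names, and that the driver JAR is on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Properties;

public class DatabricksJdbcSketch {

    // Assembles the URL in the format shown in step 5.
    // All three arguments are placeholders copied from the cluster's JDBC/ODBC tab.
    static String buildUrl(String serverHostname, int port, String httpPath) {
        return "jdbc:spark://" + serverHostname + ":" + port
                + "/default;transportMode=http;ssl=1;httpPath=" + httpPath
                + ";AuthMech=3";
    }

    // Keeps the credentials out of the URL, as the Note in step 5 recommends.
    // Assumption: the driver reads UID and PWD from the connection properties.
    static Properties credentials(String userId, String password) {
        Properties props = new Properties();
        props.setProperty("UID", userId);
        props.setProperty("PWD", password);
        return props;
    }

    // Rough equivalent of the Test Connection button: returns true only if
    // the driver class loads and a valid connection can be opened.
    static boolean testConnection(String url, Properties props) {
        try {
            Class.forName("com.simba.spark.jdbc4.Driver");
        } catch (ClassNotFoundException e) {
            System.err.println("Driver class not found on the classpath: " + e.getMessage());
            return false;
        }
        try (Connection conn = DriverManager.getConnection(url, props)) {
            return conn.isValid(10); // wait up to 10 seconds for a reply
        } catch (SQLException e) {
            System.err.println("Connection failed: " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample values; replace them with the ones from your cluster.
        String url = buildUrl("adb-123.azuredatabricks.net", 443, "sql/protocolv1/o/0/0000");
        System.out.println(url);

        // Hypothetical environment variables holding the User Id and Password.
        String user = System.getenv("DATABRICKS_USER");
        String token = System.getenv("DATABRICKS_TOKEN");
        if (user == null || token == null) {
            System.out.println("Credentials not set; skipping the connection test.");
            return;
        }
        System.out.println("Connection OK: " + testConnection(url, credentials(user, token)));
    }
}
```

Run without the environment variables, the program only prints the assembled URL, which you can compare with the value pasted into the JDBC URL field. With them set, a failure message from the driver usually narrows the problem down to the URL, the credentials, or a missing driver JAR.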