Configuring the connection to the Azure Data Lake Storage service to be used by Spark - Cloud - 8.0

Azure Data Lake Store

Version
Cloud
8.0
Language
English
Product
Talend Big Data
Talend Big Data Platform
Talend Data Fabric
Talend Data Integration
Talend Data Management Platform
Talend Data Services Platform
Talend ESB
Talend MDM Platform
Talend Open Studio for Big Data
Talend Open Studio for Data Integration
Talend Open Studio for ESB
Talend Real-Time Big Data Platform
Module
Talend Studio
Last publication date
2023-06-07

Procedure

  1. Double-click tAzureFSConfiguration to open its Component view.
    Spark uses this component to connect to the Azure Data Lake Storage system to which your Job writes the actual business data.
  2. From the Azure FileSystem drop-down list, select Azure Datalake Storage to use Data Lake Storage as the target system.
  3. In the Datalake storage account field, enter the name of the Data Lake Storage account you need to access.
    Ensure that the administrator of the system has granted your Azure account the appropriate access permissions to this Data Lake Storage account.
  4. In the Client ID and Client key fields, enter the authentication ID and the authentication key, respectively. Both are generated when you register the application that your Job uses to access Azure Data Lake Storage.

    Ensure that this application has the appropriate permissions to access Azure Data Lake Storage. You can check this on the Required permissions view of the application on Azure. For further information, see the Azure documentation topic Assign the Azure AD application to the Azure Data Lake Storage account file or folder.

    This application must be the one to which you assigned permissions to access your Azure Data Lake Storage in the previous step.

  5. In the Token endpoint field, paste the OAuth 2.0 token endpoint. You can copy it from the Endpoints list on the App registrations page of your Azure portal.
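
Talend Studio generates the connection code for you, so nothing beyond the steps above is required. To illustrate what the values from steps 3 to 5 represent, the following Python sketch maps them to the OAuth client-credential properties of the Hadoop adl:// connector, prefixed with spark.hadoop. so that Spark passes them on to Hadoop. It assumes that the adl:// connector is the one in use. The function name and the sample values are hypothetical, and the code that Talend Studio actually generates may differ.

```python
from urllib.parse import urlparse


def adls_oauth_properties(account_name, client_id, client_key, token_endpoint):
    """Return (root_uri, spark_properties) for an adl:// connection.

    Hypothetical helper for illustration only; it is not part of Talend Studio.
    """
    if not (account_name and client_id and client_key):
        raise ValueError("account name, client ID and client key are required")

    # The token endpoint copied from the Azure portal is an absolute https URL.
    parsed = urlparse(token_endpoint)
    if parsed.scheme != "https" or not parsed.netloc:
        raise ValueError("token endpoint must be an absolute https URL")

    # Step 3: the account name determines the file system root.
    root_uri = "adl://%s.azuredatalakestore.net/" % account_name

    # Steps 4 and 5: client ID, client key and token endpoint, expressed as
    # Hadoop adl:// OAuth settings (assumed mapping, see the note above).
    hadoop_properties = {
        "fs.adl.oauth2.access.token.provider.type": "ClientCredential",
        "fs.adl.oauth2.client.id": client_id,
        "fs.adl.oauth2.credential": client_key,
        "fs.adl.oauth2.refresh.url": token_endpoint,
    }

    # Spark forwards any property that starts with "spark.hadoop." to Hadoop.
    spark_properties = {
        "spark.hadoop." + key: value for key, value in hadoop_properties.items()
    }
    return root_uri, spark_properties


if __name__ == "__main__":
    # Placeholder values; replace them with your own account and application.
    root, props = adls_oauth_properties(
        "myaccount",
        "my-client-id",
        "my-client-key",
        "https://login.microsoftonline.com/my-tenant-id/oauth2/token",
    )
    print(root)
    for key in sorted(props):
        # Avoid printing the client key in clear text.
        shown = "***" if key.endswith("oauth2.credential") else props[key]
        print(key, "=", shown)
```

The check on the token endpoint only catches obvious paste errors, such as a missing https:// scheme. It does not verify that the endpoint belongs to your Azure tenant.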