Configuring the connection to the Azure Data Lake Storage service to be used by Spark

Procedure

  1. Double-click tAzureFSConfiguration to open its Component view.
    Spark uses this component to connect to the Azure Data Lake Storage system to which your Job writes the actual business data.
  2. From the Azure FileSystem drop-down list, select Azure Datalake Storage to use Data Lake Storage as the target system.
  3. In the Datalake storage account field, enter the name of the Data Lake Storage account you need to access.
    Ensure that the administrator of the system has granted your Azure account the appropriate access permissions to this Data Lake Storage account.
  4. In the Client ID and Client key fields, enter the authentication ID and the authentication key, respectively. Both are generated when you register the application that your Job uses to access Azure Data Lake Storage.

    Ensure that this application has the appropriate permissions to access Azure Data Lake Storage. You can check this in the Required permissions view of the application on Azure. For further information, see the Azure documentation topic Assign the Azure AD application to the Azure Data Lake Storage account file or folder.

    This application must be the one to which you assigned permissions to access your Azure Data Lake Storage in the previous step.

  5. In the Token endpoint field, paste the OAuth 2.0 token endpoint, which you can copy from the Endpoints list on the App registrations page of your Azure portal.
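
For orientation, the four values entered in the steps above (storage account name, client ID, client key, and token endpoint) correspond roughly to the OAuth 2.0 client-credential properties that the Hadoop connector for Azure Data Lake Storage Gen1 reads. The following Python sketch shows that mapping for a hand-configured Spark session. It assumes a Gen1 account and the hadoop-azure-datalake connector. The function names and placeholder values are hypothetical, and the sketch does not describe what tAzureFSConfiguration generates internally.

```python
# Minimal sketch, assuming ADLS Gen1 and the hadoop-azure-datalake connector.
# Function names and the placeholder values below are illustrative only.

def adls_gen1_spark_conf(client_id, client_key, token_endpoint):
    """Return Spark properties for OAuth 2.0 client-credential access to ADLS Gen1."""
    fields = {
        "client_id": client_id,
        "client_key": client_key,
        "token_endpoint": token_endpoint,
    }
    # Reject empty values early, since a blank field only fails later at connection time.
    for name, value in fields.items():
        if not value or not value.strip():
            raise ValueError(f"{name} must not be empty")
    return {
        # The "spark.hadoop." prefix passes each property to the Hadoop configuration.
        "spark.hadoop.fs.adl.oauth2.access.token.provider.type": "ClientCredential",
        "spark.hadoop.fs.adl.oauth2.client.id": client_id,          # Client ID field
        "spark.hadoop.fs.adl.oauth2.credential": client_key,        # Client key field
        "spark.hadoop.fs.adl.oauth2.refresh.url": token_endpoint,   # Token endpoint field
    }


def adls_gen1_uri(account_name, path="/"):
    """Build the adl:// URI for a path in the given Data Lake Storage Gen1 account."""
    return f"adl://{account_name}.azuredatalakestore.net/{path.lstrip('/')}"


if __name__ == "__main__":
    conf = adls_gen1_spark_conf(
        client_id="<application-id>",
        client_key="<application-key>",
        token_endpoint="https://login.microsoftonline.com/<tenant-id>/oauth2/token",
    )
    for key in sorted(conf):
        print(key, "=", conf[key])
    print(adls_gen1_uri("<account-name>", "/data/out"))
```

The resulting dictionary could be applied to a SparkSession builder one `config(key, value)` call at a time. In practice, the client key should come from a secret store rather than being written into source code.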
