Talend Data Preparation can connect to various databases and use them as a source to create new datasets.
In this example, you want to prepare customer data that is stored in Azure Data Lake Storage Gen2. You will enter your connection information directly in the Talend Data Preparation interface and create a new dataset from this data.
Before you begin
- In the Datasets view of the Talend Data Preparation homepage, click the white arrow next to the Add Dataset button.
Select Azure DLS Gen2.
The Add Azure DLS Gen2 dataset form opens.
- In the Dataset name field, enter the name you want to give your dataset.
- Enter the Account name of the account you want to access.
Select your Authentication type from the drop-down list.
- If you select Shared Key, enter your Account key.
- If you select Shared Access Signature, enter your Azure Shared Access Signature.
- If you select Azure Active Directory, enter your Tenant ID, Client ID, and Client Secret in the corresponding fields.
Click Test connection.
If the connection is successful, the second part of the form is displayed, where you can specify the location and format of your data. If not, an error message is displayed, detailing why the connection failed.
- Enter the Container and Blob path where the data is located.
- Select the format of the source data: CSV, Avro, JSON, or Parquet.
- Click the Add dataset button at the end of the form.
The data remains stored in ADLS Gen2; Talend Data Preparation only retrieves a sample on demand.
The dataset is added to the list in the Datasets view of the homepage.
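To check your values before filling in the form, the fields from the procedure above can be modeled as a small data structure. The following Python sketch is illustrative only: the class `AdlsGen2DatasetForm`, its field names, and its methods are hypothetical and are not part of any Talend API. The URL it builds assumes the standard public-cloud ADLS Gen2 endpoint (`<account>.dfs.core.windows.net`); how Talend Data Preparation resolves the location internally is not described here.

```python
from dataclasses import dataclass, field

# Credential values requested by the form for each authentication type.
# The snake_case keys are hypothetical names chosen for this sketch.
REQUIRED_CREDENTIALS = {
    "Shared Key": ("account_key",),
    "Shared Access Signature": ("sas_token",),
    "Azure Active Directory": ("tenant_id", "client_id", "client_secret"),
}

# Source formats offered by the form.
SUPPORTED_FORMATS = ("CSV", "Avro", "JSON", "Parquet")


@dataclass
class AdlsGen2DatasetForm:
    """Hypothetical stand-in for the Add Azure DLS Gen2 dataset form."""

    dataset_name: str
    account_name: str
    authentication_type: str
    container: str
    blob_path: str
    data_format: str
    credentials: dict = field(default_factory=dict)

    def missing_fields(self) -> list:
        """Return the names of required values that are empty or absent."""
        if self.authentication_type not in REQUIRED_CREDENTIALS:
            raise ValueError(f"Unknown authentication type: {self.authentication_type}")
        if self.data_format not in SUPPORTED_FORMATS:
            raise ValueError(f"Unsupported format: {self.data_format}")
        general = ("dataset_name", "account_name", "container", "blob_path")
        missing = [name for name in general if not getattr(self, name)]
        # Only the credentials for the selected authentication type are required.
        for name in REQUIRED_CREDENTIALS[self.authentication_type]:
            if not self.credentials.get(name):
                missing.append(name)
        return missing

    def endpoint_url(self) -> str:
        """Build the ADLS Gen2 URL that the account, container, and path point to.

        Assumes the public Azure cloud endpoint suffix.
        """
        path = self.blob_path.strip("/")
        return f"https://{self.account_name}.dfs.core.windows.net/{self.container}/{path}"
```

For example, with placeholder values, a form that uses Azure Active Directory but omits the client secret reports exactly that field as missing:

```python
form = AdlsGen2DatasetForm(
    dataset_name="customers",
    account_name="myaccount",            # placeholder account name
    authentication_type="Azure Active Directory",
    container="raw",
    blob_path="/crm/customers.csv",
    data_format="CSV",
    credentials={"tenant_id": "t", "client_id": "c"},
)
print(form.missing_fields())  # ['client_secret']
print(form.endpoint_url())    # https://myaccount.dfs.core.windows.net/raw/crm/customers.csv
```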