Applying a preparation on ADLS Gen2 Delta tables

This scenario retrieves data from an ADLS Gen2 file system, prepares the data, and then displays it.

For more technologies supported by Talend, see Talend components.

This scenario shows how to retrieve a Delta table from an ADLS Gen2 file system, apply a compatible preparation directly in the flow of the Job, and read the resulting data.

The tAzureAdlsGen2Input component gives you access to your Azure storage, and more specifically to your Delta tables. By placing the tDataprepRun component in the middle of your Job, you can reuse an existing preparation created in Talend Data Preparation to transform and clean the data before reading it or sending it to the destination of your choice.

The following scenario creates a simple Job that:

  • Retrieves customer data from a Databricks Delta table
  • Directly applies a preparation with a compatible schema
  • Reads the data in the output component

In this example, the Delta table contains basic customer information, including name, age, birthday, and phone number.

This scenario assumes that a preparation has been created beforehand, on a dataset with the same schema as your input data for the Job. In this case, the existing preparation is called preparation_adlsgen2.

Note: Having the same schema on both ends ensures a coherent result, but the Job will still run even if the schemas differ.

This simple preparation puts last names in upper case, and changes the date format.
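
To make the effect of such a preparation concrete, the following Python sketch mimics the two steps on a single record. It is only an illustration and not how Talend Data Preparation or tDataprepRun is implemented. The column names (last_name, birthday), the sample record, and both date formats are assumptions, because this scenario does not specify them.

```python
from datetime import datetime

# Assumed formats: the scenario does not state which date formats are involved.
SOURCE_DATE_FORMAT = "%d/%m/%Y"  # hypothetical input format, e.g. 24/03/1985
TARGET_DATE_FORMAT = "%Y-%m-%d"  # hypothetical output format, e.g. 1985-03-24


def prepare_row(row):
    """Return a copy of the row with the last name upper-cased
    and the birthday rewritten in the target date format."""
    prepared = dict(row)  # copy so the input record is left unchanged
    prepared["last_name"] = row["last_name"].upper()
    parsed = datetime.strptime(row["birthday"], SOURCE_DATE_FORMAT)
    prepared["birthday"] = parsed.strftime(TARGET_DATE_FORMAT)
    return prepared


# Hypothetical sample record, not taken from the actual Delta table.
sample = {"first_name": "Ada", "last_name": "Lovelace", "birthday": "24/03/1985"}
prepared_sample = prepare_row(sample)
print(prepared_sample)
```

In the Job itself, these steps are defined once in the preparation and applied by tDataprepRun to every row coming from the input component, so no such code has to be written by hand.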
