Transforming a list of files as data flow - Cloud - 8.0

Orchestration (Integration)

Version
Cloud
8.0
Language
English
Product
Talend Big Data
Talend Big Data Platform
Talend Data Fabric
Talend Data Integration
Talend Data Management Platform
Talend Data Services Platform
Talend ESB
Talend MDM Platform
Talend Real-Time Big Data Platform
Module
Talend Studio
Content
Data Governance > Third-party systems > Orchestration components (Integration)
Data Quality and Preparation > Third-party systems > Orchestration components (Integration)
Design and Development > Third-party systems > Orchestration components (Integration)
Last publication date
2024-02-29

The following scenario describes a Job that iterates on a list of files, picks up the filename and current date and transforms this into a flow, that gets displayed on the console.

For more technologies supported by Talend, see Talend components.

  • Drop the following components: tFileList, tIterateToFlow and tLogRow from the Palette to the design workspace.

  • Connect the tFileList to the tIterateToFlow using an iterate link and connect the Job to the tLogRow using a Row main connection.

  • In the tFileList Component view, set the directory where the list of files is stored.

  • In this example, the files are three simple .txt files held in one directory: Countries.

  • No need to care about the case, hence clear the Case sensitive check box.

  • Leave the Include Subdirectories check box unchecked.

  • Then select the tIterateToFlow component et click Edit Schema to set the new schema

  • Add two new columns: Filename of String type and Date of date type. Make sure you define the correct pattern in Java.

  • Click OK to validate.

  • Notice that the newly created schema shows on the Mapping table.

  • In each cell of the Value field, press Ctrl+Space bar to access the list of global and user-specific variables.

  • For the Filename column, use the global variable: tFileList_1CURRENT_FILEPATH. It retrieves the current filepath in order to catch the name of each file, the Job iterates on.

  • For the Date column, use the Talend routine: Talend Date.getCurrentDate() (in Java)

  • Then on the tLogRow component view, select the Print values in cells of a table check box.

  • Save your Job and press F6 to execute it.

The filepath displays on the Filename column and the current date displays on the Date column.