Scenario: Transforming a list of files as data flow - 6.1

Talend Components Reference Guide

The following scenario describes a Job that iterates on a list of files, picks up each filename and the current date, and transforms them into a flow that is displayed on the console.

  • Drop the following components: tFileList, tIterateToFlow and tLogRow from the Palette to the design workspace.

  • Connect the tFileList to the tIterateToFlow using an Iterate link, and connect the tIterateToFlow to the tLogRow using a Row > Main connection.

  • In the tFileList Component view, set the directory where the list of files is stored.

  • In this example, the files are three simple .txt files held in one directory: Countries.

  • Case does not matter in this example, so clear the Case sensitive check box.

  • Leave the Include Subdirectories check box unchecked.

  • Then select the tIterateToFlow component and click Edit Schema to set the new schema.

  • Add two new columns: Filename of String type and Date of Date type. Make sure you define the correct date pattern in Java.

  • Click OK to validate.

  • Notice that the newly created schema shows in the Mapping table.

  • In each cell of the Value field, press Ctrl+Space to access the list of global and user-specific variables.

  • For the Filename column, use the global variable tFileList_1_CURRENT_FILEPATH, which in Java reads ((String)globalMap.get("tFileList_1_CURRENT_FILEPATH")). It retrieves the path of the file the Job is currently iterating on.

  • For the Date column, use the Talend routine TalendDate.getCurrentDate() (in Java).

  • Then, in the tLogRow Component view, select the Print values in cells of a table check box.

  • Save your Job and press F6 to execute it.

The filepath is displayed in the Filename column and the current date is displayed in the Date column.
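
For readers who prefer to see the logic as code, the Job above can be approximated in plain Java. This is a minimal sketch and not the code that the Studio generates: the class IterateToFlowSketch, the nested Row type, the methods toFlow and format, and the dd-MM-yyyy pattern are all hypothetical names and values chosen for illustration. It lists the regular files of one directory without descending into subdirectories, emits one row per file with the file path and the current date, and prints each row to the console.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

// Hypothetical illustration only; not Talend-generated code.
class IterateToFlowSketch {

    /** One row of the output flow, mirroring the two schema columns. */
    static final class Row {
        final String filename; // Filename column (String)
        final Date date;       // Date column (Date)

        Row(String filename, Date date) {
            this.filename = filename;
            this.date = date;
        }
    }

    /**
     * Plays the role of tFileList + tIterateToFlow: one iteration per
     * regular file found directly in dir (subdirectories are skipped),
     * and one row emitted per iteration.
     */
    static List<Row> toFlow(Path dir) throws IOException {
        List<Row> rows = new ArrayList<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir)) {
            for (Path file : files) {
                if (Files.isRegularFile(file)) {
                    // The path stands in for the current file path variable,
                    // new Date() for TalendDate.getCurrentDate().
                    rows.add(new Row(file.toAbsolutePath().toString(), new Date()));
                }
            }
        }
        return rows;
    }

    /** Plays the role of tLogRow: renders one row using the given date pattern. */
    static String format(Row row, String datePattern) {
        return row.filename + "|" + new SimpleDateFormat(datePattern).format(row.date);
    }

    public static void main(String[] args) throws IOException {
        // Assumed default: the current directory, unless a path is passed in.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        for (Row row : toFlow(dir)) {
            System.out.println(format(row, "dd-MM-yyyy")); // assumed pattern
        }
    }
}
```

Running the sketch against a directory such as Countries with three .txt files would print three lines, one per file. The date pattern passed to format corresponds to the pattern set for the Date column in the schema editor.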