Scenario 1: Iterating on a file directory - 6.1

Talend Components Reference Guide

This scenario creates a three-component Job that lists the files in a defined directory, reads each file by iteration, selects the delimited data, and displays the output in the Run console.

Dropping and linking the components

  1. Drop the following components from the Palette to the design workspace: tFileList, tFileInputDelimited, and tLogRow.

  2. Right-click the tFileList component and pull an Iterate connection to the tFileInputDelimited component. Then pull a Main row from the tFileInputDelimited component to the tLogRow component.

Configuring the components

  1. Double-click tFileList to display its Basic settings view and define its properties.

  2. Browse to the Directory that holds the files you want to process. To display the path on the Job itself, use the label (__DIRECTORY__) that appears when you put the pointer anywhere in the Directory field. Type this label in the Label Format field, which you can find by clicking the View tab.

  3. In the Basic settings view, select the source type you want to process from the FileList Type list: Files in this example.

  4. In the Case sensitive list, select a case mode: Yes in this example, to create a case-sensitive filter on file names.

  5. Keep the Use Glob Expressions as Filemask check box selected if you want to use glob expressions to filter files, and define a file mask in the Filemask field.

  6. Double-click tFileInputDelimited to display its Basic settings view and set its properties.

  7. In the File Name field, enter a variable containing the path of the current file, as defined in the Basic settings of tFileList. Press Ctrl+Space to access the autocomplete list of variables, and select the global variable ((String)globalMap.get("tFileList_1_CURRENT_FILEPATH")). This way, all files in the input directory can be processed.

  8. Fill in all other fields as detailed in the tFileInputDelimited section. Related topic: tFileInputDelimited.

  9. Select the last component, tLogRow, to display its Basic settings view and enter the separator used to delimit the field content displayed on the console. Related topic: tLogRow.
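
The file-filtering and iteration settings above (steps 3 to 7) can be pictured with a plain Java sketch. This is not the code that the Studio generates. It is a minimal illustration, assuming a file mask that uses only the * and ? wildcards. The class name FileListSketch, its helper methods, and the "*.csv" mask are hypothetical names chosen for the example. Only the key tFileList_1_CURRENT_FILEPATH comes from the scenario.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Illustrative sketch only: NOT the code generated for tFileList.
class FileListSketch {

    // Translate a simple file mask ("*" and "?" wildcards only) into a regex.
    static Pattern globToPattern(String glob, boolean caseSensitive) {
        StringBuilder regex = new StringBuilder();
        for (char c : glob.toCharArray()) {
            if (c == '*') {
                regex.append(".*");          // any run of characters
            } else if (c == '?') {
                regex.append('.');           // exactly one character
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        int flags = caseSensitive ? 0 : Pattern.CASE_INSENSITIVE;
        return Pattern.compile(regex.toString(), flags);
    }

    // True when the whole file name matches the mask.
    static boolean matchesMask(String fileName, String glob, boolean caseSensitive) {
        return globToPattern(glob, caseSensitive).matcher(fileName).matches();
    }

    // Keep regular files only (the "Files" source type) whose names match the mask.
    static List<Path> listFiles(Path directory, String glob, boolean caseSensitive) {
        List<Path> result = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(directory)) {
            for (Path entry : stream) {
                if (Files.isRegularFile(entry)
                        && matchesMask(entry.getFileName().toString(), glob, caseSensitive)) {
                    result.add(entry);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        // Stand-in for the Job's globalMap; "*.csv" is a hypothetical mask.
        Map<String, Object> globalMap = new HashMap<>();
        Path directory = Paths.get(args.length > 0 ? args[0] : ".");
        for (Path file : listFiles(directory, "*.csv", true)) {
            // Each iteration publishes the current path under the scenario's key...
            globalMap.put("tFileList_1_CURRENT_FILEPATH", file.toString());
            // ...and the downstream step reads it back with the same expression.
            String current = (String) globalMap.get("tFileList_1_CURRENT_FILEPATH");
            System.out.println(current);
        }
    }
}
```

With the case-sensitive mode set to Yes, a mask such as *.csv would match data.csv but not DATA.CSV in this sketch. With the mode set to No, both would match.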

Executing the Job

Press Ctrl+S to save your Job, and press F6 to run it.

The Job iterates over the defined directory and reads every file it contains. The delimited data is then passed to the last component, which displays it on the console.
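
The read-and-display part of this flow can be approximated in plain Java as follows. This is a rough sketch rather than the Job's generated code. It assumes UTF-8 files, a semicolon as the input field separator, and a pipe as the console separator. None of these values is stated in the scenario. The class DelimitedToLogSketch and its methods are hypothetical names.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Illustrative sketch only: NOT the code generated for tFileInputDelimited or tLogRow.
class DelimitedToLogSketch {

    // Split one line on a literal field separator, keeping trailing empty fields.
    static List<String> parseLine(String line, String fieldSeparator) {
        return Arrays.asList(line.split(Pattern.quote(fieldSeparator), -1));
    }

    // Join the fields of one row with the separator chosen for console display.
    static String formatRow(List<String> fields, String logSeparator) {
        return String.join(logSeparator, fields);
    }

    // Read a file line by line and print each row to the console.
    static void logFile(Path file, String fieldSeparator, String logSeparator)
            throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println(formatRow(parseLine(line, fieldSeparator), logSeparator));
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Each argument stands in for one file path supplied by the iteration.
        // ";" and "|" are assumed separators, not values from the scenario.
        for (String arg : args) {
            logFile(Paths.get(arg), ";", "|");
        }
    }
}
```

Under these assumptions, an input line such as 1;Alice;Paris would be printed as 1|Alice|Paris.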