Creating a Job script to filter data records - Cloud - 8.0

Talend Job Script Reference Guide

Version: Cloud 8.0
Language: English
Product: Talend Big Data, Talend Big Data Platform, Talend Cloud, Talend Data Fabric, Talend Data Integration, Talend Data Management Platform, Talend Data Services Platform, Talend ESB, Talend MDM Platform, Talend Real-Time Big Data Platform
Module: Talend CommandLine, Talend Studio
Content: Design and Development > Designing Jobs
Last publication date: 2024-02-22

This example shows how to write a Job script that defines a Job which reads a CSV file, filters the data records based on given conditions, and then displays summary information: the total number of records read from the source file, the number of accepted records, and the number of rejected records.

The Job will contain the following components:

  • a tFileInputDelimited component, to read the source CSV file, which contains information about people. The source file contains five columns, as shown below:

    name;gender;age;city;marriageStatus
    Van Buren;M;73;Chicago;married
    Adams;M;40;Albany;single
    Jefferson;F;66;New York;married
    Adams;M;9;Albany;-
    Jefferson;M;30;Chicago;single
    Carter;F;26;Chicago;married
    Harrison;M;40;New York;married
    Roosevelt;F;15;Chicago;
    Monroe;M;8;Boston;-
    Arthur;M;20;Albany;married
    Pierce;M;18;New York;-
    Quincy;F;83;Albany;married
    McKinley;M;70;Boston;married
    Coolidge;M;4;Chicago;-
    Monroe;M;60;Chicago;single
    ----- end of file --------
  • a tReplicate component, to duplicate the input data into two output flows, one of which is displayed on the console as unprocessed data, and the other goes to a column filter for processing.

  • a tFilterColumns component, to remove an unwanted column, marriageStatus.

  • a tFilterRow component, to filter the data and output two tables:

    • one lists all male persons with a last name shorter than nine characters and aged between 10 and 80 years.

    • the other lists all rejected records, with an error message for each rejected record to explain why the record has been rejected.

  • three tLogRow components: the first one to display the unprocessed data, the second one to display the accepted records, and the third one to display the rejected records and the corresponding error messages.

  • a tJava component, to display the summary information.
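
As a rough illustration of what the filtering stage is meant to do, the following plain Java sketch applies the same conditions to the sample rows outside of Talend. It is not a Job script and not the code that Talend generates. The class name FilterSketch, its methods, and the wording of the rejection messages are hypothetical, and the sketch assumes that the age bounds are exclusive (greater than 10, lower than 80).

```java
import java.util.ArrayList;
import java.util.List;

public class FilterSketch {
    // Sample rows copied from the source CSV file (header line omitted).
    public static final String[] ROWS = {
        "Van Buren;M;73;Chicago;married", "Adams;M;40;Albany;single",
        "Jefferson;F;66;New York;married", "Adams;M;9;Albany;-",
        "Jefferson;M;30;Chicago;single", "Carter;F;26;Chicago;married",
        "Harrison;M;40;New York;married", "Roosevelt;F;15;Chicago;",
        "Monroe;M;8;Boston;-", "Arthur;M;20;Albany;married",
        "Pierce;M;18;New York;-", "Quincy;F;83;Albany;married",
        "McKinley;M;70;Boston;married", "Coolidge;M;4;Chicago;-",
        "Monroe;M;60;Chicago;single"
    };

    // Returns null for an accepted record, otherwise a message listing
    // every failed condition (message wording is hypothetical).
    public static String rejectionReason(String name, String gender, int age) {
        List<String> errors = new ArrayList<>();
        if (name.length() >= 9) errors.add("name is not shorter than 9 characters");
        if (!"M".equals(gender)) errors.add("gender is not M");
        if (age <= 10) errors.add("age is not greater than 10"); // assumed exclusive bound
        if (age >= 80) errors.add("age is not lower than 80");   // assumed exclusive bound
        return errors.isEmpty() ? null : String.join("; ", errors);
    }

    // Filters the rows and returns {total read, accepted, rejected}.
    public static int[] summarize(String[] rows) {
        int accepted = 0;
        int rejected = 0;
        for (String row : rows) {
            // Limit -1 keeps a trailing empty field such as in "Roosevelt;F;15;Chicago;".
            String[] f = row.split(";", -1);
            // Keep only the first four columns, dropping marriageStatus.
            String kept = String.join(";", f[0], f[1], f[2], f[3]);
            String reason = rejectionReason(f[0], f[1], Integer.parseInt(f[2]));
            if (reason == null) {
                accepted++;
                System.out.println("ACCEPTED " + kept);
            } else {
                rejected++;
                System.out.println("REJECTED " + kept + " (" + reason + ")");
            }
        }
        return new int[] {rows.length, accepted, rejected};
    }

    public static void main(String[] args) {
        int[] s = summarize(ROWS);
        System.out.println("Read: " + s[0] + ", accepted: " + s[1] + ", rejected: " + s[2]);
    }
}
```

Run against the 15 sample rows under these assumptions, the sketch reports 15 records read, 6 accepted, and 9 rejected.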

The procedures below demonstrate how to write this Job script in the Job script editor, starting with adding the required components. For information on creating an empty Job script, see How to create a Job script.