Scenario: Selecting a column of data from an input file and storing it in a local file - 6.1

Talend Components Reference Guide

This scenario describes a three-component Job that selects a column of data from an input file using the Pig expression defined in tPigCode and stores the result in a local file.

Setting up the Job

  1. Drop the following components from the Palette to the design workspace: tPigCode, tPigLoad, tPigStoreResult.

  2. Right-click tPigLoad to connect it to tPigCode using a Row > Pig Combine connection.

  3. Right-click tPigCode to connect it to tPigStoreResult using a Row > Pig Combine connection.

Loading the data

  1. Double-click tPigLoad to open its Basic settings view.

  2. Click the three-dot button next to Edit schema to add columns for tPigLoad.

  3. Click the plus button to add the columns Name, Country and Age, then click OK to save the settings.

  4. Select Local from the Mode area.

  5. Fill in the Input filename field with the full path to the input file.

    In this scenario, the input file is CustomerList, which contains rows of names, country names and ages.

  6. Select PigStorage from the Load function list.

  7. Leave the rest of the settings as they are.
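
For orientation, the settings above correspond roughly to a Pig Latin LOAD statement. This is a minimal sketch, not the code the Studio generates: the file path, the ';' field separator and the column types are assumptions to replace with your own values. The relation name is the one referenced later in the Script Code field.

```pig
-- Sketch only: the path, the ';' separator and the types are assumed.
-- PigStorage splits each input line into fields on the given separator.
tPigLoad_1_row1_RESULT = LOAD '/path/to/CustomerList'
    USING PigStorage(';')
    AS (Name:chararray, Country:chararray, Age:int);
```

The AS clause mirrors the three schema columns added in step 3, in the same order, so Name is the field at position $0.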

Configuring the tPigCode component

  1. Double-click the tPigCode component to open its Basic settings view.

  2. Click Sync columns to retrieve the schema structure from the preceding component.

  3. Fill in the Script Code field with the following expression:

    tPigCode_1_row2_RESULT = foreach tPigLoad_1_row1_RESULT generate $0 as name;

    This expression selects the first column ($0), Name, from CustomerList and outputs it under the alias name.
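
The expression above refers to the column by position. As a hedged sketch, the variants below show the same selection by field name, and a version that also filters rows before selecting the column. Both assume that the fields are available to the script under the schema names Name and Age, and that Age is numeric. The relation name adults and the threshold 18 are hypothetical. Use only one variant in the Script Code field.

```pig
-- Variant 1: select the column by name instead of by position
-- (assumes the first field is exposed as Name).
tPigCode_1_row2_RESULT = FOREACH tPigLoad_1_row1_RESULT GENERATE Name AS name;

-- Variant 2: keep only some rows, then select the column.
-- 'adults' and the threshold 18 are hypothetical; Age is assumed to be an int.
adults = FILTER tPigLoad_1_row1_RESULT BY Age >= 18;
tPigCode_1_row2_RESULT = FOREACH adults GENERATE Name AS name;
```

In both variants, the final relation keeps the name tPigCode_1_row2_RESULT, as in the scenario's expression, which is the output relation of the tPigCode component.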

Saving the result data to a local file

  1. Double-click tPigStoreResult to open its Basic settings view.

  2. Click Sync columns to retrieve the schema structure from the preceding component.

  3. Fill in the Result file field with the full path to the result file.

    In this scenario, the result is saved in the Result file.

  4. Select Remove result directory if exists.

  5. Select PigStorage from the Store function list.

  6. Leave the rest of the settings as they are.
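
In Pig Latin terms, these settings correspond roughly to a STORE statement. This is a minimal sketch in which the output path and the ';' separator are assumptions:

```pig
-- Sketch only: the output path and the ';' separator are assumed.
-- PigStorage writes one line per tuple, with fields joined by the separator.
STORE tPigCode_1_row2_RESULT INTO '/path/to/Result' USING PigStorage(';');
```

Pig writes the output as a directory of part files at the given location and refuses to run if that location already exists, which is why the Remove result directory if exists check box is selected in step 4.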

Executing the Job

Save your Job and press F6 to run it.

The Result file is generated and contains the selected column of data.