Defining the final output schema and the output file

Pig

author
Talend Documentation Team
EnrichVersion
6.5
EnrichProdName
Talend Real-Time Big Data Platform
Talend Open Studio for Big Data
Talend Big Data
Talend Data Fabric
Talend Big Data Platform
EnrichPlatform
Talend Studio

Procedure

  1. Double-click tPigFilterColumns to open its Basic settings view.
  2. Click the [...] button next to Edit schema to open the [Schema] dialog box.
  3. From the input schema, select the columns to include in the result file: hold down the Shift key and click each column in turn, then click the [->] button to copy them to the output schema. Click OK to validate the schema and close the dialog box.
    In this example, the result file should include all the columns except the group IDs.
  4. Double-click tPigStoreResult to open its Basic settings view.
  5. Click Sync columns to retrieve the schema structure from the preceding component.
  6. Fill in the Result file field with the full path to the result file, and select the Remove result file directory if exists check box.
  7. Select PigStorage from the Store function list, and leave the rest of the settings as they are.
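
The effect of the steps above can be sketched outside the Studio. The following plain Python analogue is not the code that the Job generates; it only illustrates the two operations being configured: keeping a subset of the input columns, and writing the rows as delimited text into a result directory that is removed first if it already exists. The column names, the sample rows, the tab delimiter, and the part-m-00000 file name are all assumptions made for illustration.

```python
import csv
import shutil
import tempfile
from pathlib import Path

# Hypothetical input schema and rows; the real ones come from the preceding component.
INPUT_COLUMNS = ["name", "country", "age", "group_id"]
ROWS = [
    ("Alice", "France", 34, 1),
    ("Bob", "Germany", 41, 2),
]


def filter_columns(rows, input_columns, kept_columns):
    """Keep only the selected columns, in the order given (steps 1 to 3)."""
    indexes = [input_columns.index(name) for name in kept_columns]
    return [tuple(row[i] for i in indexes) for row in rows]


def store_result(rows, result_dir, delimiter="\t", remove_existing=True):
    """Write the rows as delimited text into result_dir (steps 4 to 7)."""
    result_dir = Path(result_dir)
    if result_dir.exists():
        if not remove_existing:
            raise FileExistsError(f"{result_dir} already exists")
        # Mirrors the "Remove result file directory if exists" check box.
        shutil.rmtree(result_dir)
    result_dir.mkdir(parents=True)
    # Assumed file name, modelled on the part files that Hadoop jobs write.
    part_file = result_dir / "part-m-00000"
    with part_file.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle, delimiter=delimiter, lineterminator="\n")
        writer.writerows(rows)
    return part_file


if __name__ == "__main__":
    # Everything except the group IDs, as in the example above.
    kept = filter_columns(ROWS, INPUT_COLUMNS, ["name", "country", "age"])
    with tempfile.TemporaryDirectory() as tmp:
        part = store_result(kept, Path(tmp) / "result")
        print(part.read_text(encoding="utf-8"), end="")
```

In this sketch the output schema is simply the list of column names passed to filter_columns, which corresponds to what Sync columns retrieves for tPigStoreResult in step 5.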