Configuring tPigStoreResult

Pig

author
Talend Documentation Team
EnrichVersion
6.5
EnrichProdName
Talend Open Studio for Big Data
Talend Real-Time Big Data Platform
Talend Data Fabric
Talend Big Data
Talend Big Data Platform
EnrichPlatform
Talend Studio

Two tPigStoreResult components are used, one for each sorted dataset, to write the results to HDFS.

  1. Double-click the first tPigStoreResult component, which writes the data sorted by name, to open its Component view.
  2. In the Result file field, enter the directory to which the data will be written. This directory is created if it does not exist. In this scenario, it is /user/ychen/sort/tPigreplicate/byName.csv.
  3. Select the Remove result directory if exists check box.
  4. In the Store function list, select PigStorage.
  5. In the Field separator field, enter a semicolon (;).
  6. Configure the other tPigStoreResult component in the same way, but enter a different directory for the data sorted by state. In this scenario, it is /user/ychen/sort/tPigreplicate/byState.csv.
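
For readers who want to relate these settings to Pig itself, the following Python sketch prints Pig Latin statements that are roughly equivalent to the two configurations above. It is an illustration under assumptions, not the code that Talend Studio generates: the helper store_statements and the relation names sorted_by_name and sorted_by_state are hypothetical. The sketch also assumes that the Remove result directory if exists option corresponds to a Pig rmf command and that the Field separator value is passed to PigStorage.

```python
def store_statements(alias, result_file, separator=";", remove_existing=True):
    """Return Pig Latin lines that approximate one tPigStoreResult setup.

    alias           -- name of the Pig relation to store (hypothetical here)
    result_file     -- value of the Result file field
    separator       -- value of the Field separator field
    remove_existing -- state of "Remove result directory if exists"
    """
    lines = []
    if remove_existing:
        # Assumption: the option behaves like Pig's rmf command, which
        # deletes the path and does not fail when the path is missing.
        lines.append(f"rmf {result_file};")
    # PigStorage writes each field separated by the given delimiter.
    lines.append(
        f"STORE {alias} INTO '{result_file}' USING PigStorage('{separator}');"
    )
    return lines


# Relation names are placeholders; the paths come from the scenario.
by_name = store_statements(
    "sorted_by_name", "/user/ychen/sort/tPigreplicate/byName.csv"
)
by_state = store_statements(
    "sorted_by_state", "/user/ychen/sort/tPigreplicate/byState.csv"
)

print("\n".join(by_name + by_state))
```

Note that a Pig STORE statement treats its target as a directory, so byName.csv and byState.csv are HDFS directories that contain the output part files, which matches the wording of step 2.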