Linking the components - 7.2

Pig

Procedure

  1. In the Integration perspective of Talend Studio, create an empty Job from the Job Designs node in the Repository tree view and name it, for example, Replicate.
    For further information about how to create a Job, see the Talend Studio User Guide.
  2. Drop one tPigLoad, one tPigReplicate, two tPigSort, and two tPigStoreResult components onto the workspace.
    The tPigLoad component reads data from the specified HDFS system. The sample data used in this scenario is as follows, with one name and one state per row, separated by a semicolon:
    Andrew Kennedy;Mississippi
    Benjamin Carter;Louisiana
    Benjamin Monroe;West Virginia
    Bill Harrison;Tennessee
    Calvin Grant;Virginia
    Chester Harrison;Rhode Island
    Chester Hoover;Kansas
    Chester Kennedy;Maryland
    Chester Polk;Indiana
    Dwight Nixon;Nevada
    Dwight Roosevelt;Mississippi
    Franklin Grant;Nebraska
    The location of the data in this scenario is /user/ychen/raw/Name&State.csv.
  3. Connect the components using Row > Pig combine links.
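
The data flow that this Job design sets up can be sketched outside the Studio in a few lines of Python. This is only an illustration of the load, replicate, sort, and store pattern, not the code that Talend Studio generates. The function names (load, replicate, run) are hypothetical, and the sort keys (name for one branch, state for the other) are an assumption, since this procedure does not yet configure the tPigSort components.

```python
import csv
import io

# Sample rows from step 2 (name;state).
SAMPLE = """Andrew Kennedy;Mississippi
Benjamin Carter;Louisiana
Benjamin Monroe;West Virginia
Bill Harrison;Tennessee
Calvin Grant;Virginia
Chester Harrison;Rhode Island
Chester Hoover;Kansas
Chester Kennedy;Maryland
Chester Polk;Indiana
Dwight Nixon;Nevada
Dwight Roosevelt;Mississippi
Franklin Grant;Nebraska
"""


def load(text):
    """Stand-in for tPigLoad: parse semicolon-separated rows."""
    reader = csv.reader(io.StringIO(text), delimiter=";")
    return [(row[0], row[1]) for row in reader if row]


def replicate(rows, copies=2):
    """Stand-in for tPigReplicate: give each branch its own copy."""
    return [list(rows) for _ in range(copies)]


def run(text):
    """Load once, replicate, then sort each copy on a different key."""
    branch_a, branch_b = replicate(load(text))
    # Assumed sort keys: column 0 (name) and column 1 (state).
    by_name = sorted(branch_a, key=lambda r: r[0])
    by_state = sorted(branch_b, key=lambda r: r[1])
    return by_name, by_state


if __name__ == "__main__":
    by_name, by_state = run(SAMPLE)
    # Stand-in for the two tPigStoreResult components: print each result.
    for label, rows in (("by name", by_name), ("by state", by_state)):
        print(label)
        for name, state in rows:
            print(f"  {name};{state}")
```

Each branch receives its own copy of the loaded rows, which mirrors the reason for placing tPigReplicate between the single tPigLoad and the two tPigSort components.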