Loading the student-friend sample data

Pig

author: Talend Documentation Team
EnrichVersion: 6.5
EnrichProdName: Talend Real-Time Big Data Platform; Talend Open Studio for Big Data; Talend Big Data; Talend Data Fabric; Talend Big Data Platform
task: Data Quality and Preparation > Third-party systems > Processing components (Integration) > Pig components; Data Governance > Third-party systems > Processing components (Integration) > Pig components; Design and Development > Third-party systems > Processing components (Integration) > Pig components
EnrichPlatform: Talend Studio

Procedure

  1. Double-click the second tPigLoad component to open its Component view.
  2. Click the [...] button next to Edit schema to open the schema editor.
  3. Click the [+] button twice to add two rows, then, in the Column column, rename them student and friend, respectively.
  4. Click OK to validate these changes and accept the propagation prompted by the pop-up dialog box.
  5. In the Mode area, select Map/Reduce.
    This component reuses the Hadoop connection you configured in the main tPigLoad component, so the Distribution and Version fields are automatically filled with the values from that component.
  6. In the Load function field, select the PigStorage function to read the source data.
  7. In the Input file URI field, enter the directory where the source data is stored. As explained previously, this is the data of the second relation, which contains the student and friend sample data.
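
To make the effect of the steps above more concrete, the following Python sketch mimics, in a simplified way, what a delimited load into the two-column schema (student, friend) produces. It is not Talend or Pig code. The function name load_student_friend, the ";" delimiter, and the sample rows are all assumptions made for illustration; the real separator depends on your source file and component settings. Padding missing fields with an empty value and ignoring extra fields is likewise a simplifying assumption of this sketch.

```python
from typing import Iterable, Iterator, NamedTuple, Optional


class StudentFriend(NamedTuple):
    """One row of the two-column schema defined in the schema editor."""
    student: Optional[str]
    friend: Optional[str]


def load_student_friend(
    lines: Iterable[str], delimiter: str = ";"
) -> Iterator[StudentFriend]:
    """Split delimited text lines into (student, friend) rows.

    The default delimiter is an assumption; pass the separator that
    your source file actually uses.
    """
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skipping blank lines is a simplification of this sketch
        fields = line.split(delimiter)
        # Pad missing fields with None and ignore any extra fields.
        padded = (fields + [None, None])[:2]
        yield StudentFriend(*padded)


# Hypothetical sample rows, not taken from the original data set.
sample = ["Alice;Bob\n", "Carol;Dave\n", "Erin\n"]
rows = list(load_student_friend(sample))

for row in rows:
    print(row.student, row.friend)
```

Each resulting row exposes the same two column names that were entered in the schema editor, which is what later components in the Job rely on when they refer to student and friend.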