Linking the components

Pig

author
Talend Documentation Team
EnrichVersion
6.5
EnrichProdName
Talend Data Fabric
Talend Real-Time Big Data Platform
Talend Big Data Platform
Talend Open Studio for Big Data
Talend Big Data
EnrichPlatform
Talend Studio

Procedure

  1. In the Integration perspective of the Studio, create an empty Job from the Job Designs node in the Repository tree view.
    For further information about how to create a Job, see the Talend Studio User Guide.
  2. In the workspace, type the name of each component to be used and select it from the list that appears. This scenario uses two tPigLoad components, one tPigCoGroup component and one tPigStoreResult component. One of the two tPigLoad components serves as the main loading component and connects to the Hadoop cluster to be used.
  3. Connect the main tPigLoad component to tPigCoGroup using the Row > Main link.
  4. Connect the second tPigLoad component to tPigCoGroup in the same way. The Lookup label appears over this link.
  5. Repeat the operation to connect tPigCoGroup to tPigStoreResult.
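
To make the data flow of the linked Job easier to picture, the following is a minimal Python sketch of what a co-group of a main input and a lookup input produces. It is an illustration only, not the code that the Studio generates: the cogroup function, the sample rows and the id key are all hypothetical, and the sketch assumes the default outer behavior of Pig's COGROUP, where a key found in either input appears in the result with an empty bag for the input that lacks it.

```python
from collections import defaultdict


def cogroup(main_rows, lookup_rows, key):
    """Group two lists of rows by a shared key, in the style of Pig COGROUP.

    Returns one (key, main_bag, lookup_bag) tuple per distinct key found in
    either input. A bag is empty when that input has no row for the key.
    """
    groups = defaultdict(lambda: ([], []))
    for row in main_rows:        # rows arriving over the Main link
        groups[row[key]][0].append(row)
    for row in lookup_rows:      # rows arriving over the Lookup link
        groups[row[key]][1].append(row)
    return [(k, bags[0], bags[1]) for k, bags in sorted(groups.items())]


# Hypothetical sample data standing in for the two tPigLoad inputs.
main_rows = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
lookup_rows = [{"id": 1, "city": "x"}, {"id": 3, "city": "y"}]

# The list below stands in for what tPigStoreResult would receive.
result = cogroup(main_rows, lookup_rows, "id")
for key, main_bag, lookup_bag in result:
    print(key, main_bag, lookup_bag)
```

In this sketch, key 1 is present in both inputs, key 2 only in the main input and key 3 only in the lookup input, so the result holds three groups, two of which carry one empty bag.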