Linking the components

Pig

author: Talend Documentation Team
EnrichVersion: 6.5
EnrichProdName: Talend Real-Time Big Data Platform; Talend Open Studio for Big Data; Talend Big Data; Talend Data Fabric; Talend Big Data Platform
task:
  Data Quality and Preparation > Third-party systems > Processing components (Integration) > Pig components
  Data Governance > Third-party systems > Processing components (Integration) > Pig components
  Design and Development > Third-party systems > Processing components (Integration) > Pig components
EnrichPlatform: Talend Studio

Procedure

  1. In the Integration perspective of Talend Studio, create an empty Job, named pigweather for example, from the Job Designs node in the Repository tree view.
    For further information about how to create a Job, see the Talend Studio User Guide.
  2. Drop two tPigLoad components, one tPigMap component, and two tPigStoreResult components onto the workspace.
    You can label the components if need be. In this scenario, the two tPigLoad components are labeled traffic and event: the first loads the traffic data and the second loads the related event data. The two tPigStoreResult components are labeled normal and jam; each writes its share of the results to the Hadoop distribution to be used. For further information about how to label a component, see the Talend Studio User Guide.
  3. Right-click the tPigLoad component labeled traffic to connect it to tPigMap using the Row > Pig combine link from the contextual menu.
  4. Repeat this operation to link the tPigLoad component labeled event to tPigMap. Because this is the second link created, it automatically becomes the lookup link.
  5. Use the Row > Pig combine link again to connect tPigMap to each of the two tPigStoreResult components.
    You need to name each of these links in the dialog box that opens once you select the link type from the contextual menu. In this scenario, the link to the tPigStoreResult component labeled normal is named out, and the link to the tPigStoreResult component labeled jam is named reject.
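
The topology that results from the steps above can be summarized in a small sketch. This is plain Python used purely as an illustration; it is not a Talend API, and the Job class, its methods, and the label map given to the tPigMap component are hypothetical names introduced here. It assumes only what the procedure states: the first link into tPigMap is the main link, the second becomes the lookup link, and the two output links are named out and reject.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    """Illustrative model of a Job layout (hypothetical, not a Talend API)."""
    name: str
    components: dict = field(default_factory=dict)  # label -> component type
    links: list = field(default_factory=list)       # (source, target, link name)

    def add(self, label, kind):
        self.components[label] = kind

    def connect(self, source, target, name=None):
        # Unnamed input links follow the rule from the procedure: the first
        # link into a component is the main link, any later one is a lookup.
        if name is None:
            incoming = [link for link in self.links if link[1] == target]
            name = "main" if not incoming else "lookup"
        self.links.append((source, target, name))


job = Job("pigweather")
job.add("traffic", "tPigLoad")
job.add("event", "tPigLoad")
job.add("map", "tPigMap")  # "map" is a hypothetical label for tPigMap
job.add("normal", "tPigStoreResult")
job.add("jam", "tPigStoreResult")

# Steps 3 and 4: input links, created in this order.
job.connect("traffic", "map")
job.connect("event", "map")

# Step 5: named output links.
job.connect("map", "normal", name="out")
job.connect("map", "jam", name="reject")

for source, target, name in job.links:
    print(f"{source} -> {target} [{name}]")
```

Swapping the order of the two input connections in this sketch would make event the main link and traffic the lookup, which is why the order in steps 3 and 4 matters.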