Configuring tPigLoad - 7.2


Talend Big Data
Talend Big Data Platform
Talend Data Fabric
Talend Open Studio for Big Data
Talend Real-Time Big Data Platform
Talend Studio


  1. Double-click tPigLoad to open its Component view.
  2. Click the [...] button next to Edit schema to open the schema editor.
  3. Click the [+] button four times to add four rows and rename them rowkey, id, name and age. The rowkey column is placed at the top of the schema to store the HBase row key. If you do not need to load the row key, you can create only the other three columns in your schema.
  4. Click OK to validate these changes and accept the propagation prompted by the pop-up dialog box.
  5. In the Mode area, select Map/Reduce, as we are using a remote Hadoop distribution.
  6. In the Distribution and the Version fields, select the Hadoop distribution you are using. In this example, we are using HortonWorks Data Platform V1.
  7. In the Load function field, select HBaseStorage. The parameters specific to this function are then displayed.
  8. In the NameNode URI and the Resource Manager fields, enter the locations of those services, respectively. If you are using WebHDFS, the location should be webhdfs://masternode:portnumber; WebHDFS with SSL is not supported yet.
  9. In the Zookeeper quorum and the Zookeeper client port fields, enter the location information of the Zookeeper service to be used.
  10. If the Zookeeper znode parent location has been defined in the Hadoop cluster you are connecting to, you need to select the Set zookeeper znode parent check box and enter the value of this property in the field that is displayed.
  11. In the Table name field, enter the name of the table from which tPigLoad reads the data.
  12. Select the Load key check box if you need to load the HBase row key column. In this example, we select it.
  13. In the Mapping table, four rows have been added automatically, one per schema column. In the Column family:qualifier column, enter the HBase column to map to each schema column you defined. In this scenario, enter family1:id for the id column, family2:name for the name column and family1:age for the age column.
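
To make the relationship between these settings more concrete, the following Python sketch assembles the kind of Pig Latin LOAD statement that the Table name, Load key and Mapping settings roughly correspond to. It is an illustration only: the code that tPigLoad actually generates is not shown in this procedure and may differ. The table name my_hbase_table, the alias data and the helper build_load_statement are hypothetical. HBaseStorage and its -loadKey option come from Apache Pig.

```python
# Settings taken from the steps above. The table name is a hypothetical
# placeholder, since the procedure does not name the table.
TABLE_NAME = "my_hbase_table"
SCHEMA = ["rowkey", "id", "name", "age"]
MAPPING = {"id": "family1:id", "name": "family2:name", "age": "family1:age"}


def build_load_statement(table, schema, mapping, load_key):
    """Return a Pig Latin LOAD statement that reads an HBase table.

    When load_key is True, the first schema column receives the HBase row
    key and therefore needs no family:qualifier entry in the mapping.
    When load_key is False, the schema must not contain a row key column.
    """
    data_columns = schema[1:] if load_key else schema
    columns = " ".join(mapping[name] for name in data_columns)
    options = "-loadKey true" if load_key else "-loadKey false"
    fields = ", ".join(schema)
    return (
        f"data = LOAD 'hbase://{table}' "
        "USING org.apache.pig.backend.hadoop.hbase.HBaseStorage("
        f"'{columns}', '{options}') AS ({fields});"
    )


if __name__ == "__main__":
    print(build_load_statement(TABLE_NAME, SCHEMA, MAPPING, load_key=True))
```

The schema fields are left untyped here because the procedure does not state the column types. In Pig, untyped fields default to bytearray.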