Configuring tPigLoad - 6.4
- Talend Documentation Team
- Talend Big Data
- Talend Big Data Platform
- Talend Data Fabric
- Talend Open Studio for Big Data
- Talend Real-Time Big Data Platform
- Talend Studio
Double-click tPigLoad to open its Component view.
Click the button next to Edit schema to open the schema editor.
Click the [+] button twice to add two rows and name them Name and State, respectively.
Click OK to validate these changes and
accept the propagation prompted by the pop-up dialog box.
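As a rough illustration of what this two-column schema means for the data, the sketch below maps one delimited line of input onto the Name and State columns. It is not Talend or Pig code. The function name parse_line, the string column types, and the ";" separator used in the usage line are assumptions made for the example only.

```python
from typing import NamedTuple


class Record(NamedTuple):
    """Mirrors the two schema columns defined in the schema editor."""
    Name: str
    State: str


def parse_line(line: str, separator: str) -> Record:
    """Split one input line on the separator and map it onto the schema.

    The separator is passed in by the caller; this sketch does not assume
    which character the Job actually uses.
    """
    fields = line.rstrip("\n").split(separator)
    if len(fields) != len(Record._fields):
        raise ValueError(
            f"expected {len(Record._fields)} fields, got {len(fields)}"
        )
    return Record(*fields)


# Hypothetical sample line, using ";" purely as an example separator.
sample = parse_line("Alice;Ohio\n", ";")
```

A line with more or fewer fields than the schema defines raises an error here, which is one way to surface a mismatch between the schema and the field separator early.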
In the Mode area, select Map/Reduce because the Hadoop cluster used in this
scenario is installed on a remote machine. Once you select it, the parameters
to be set appear.
In the Distribution and the Version lists, select the Hadoop distribution and version to be used.
In the Load function list, select
In the NameNode URI field and the Resource Manager field, enter the
locations of the NameNode and the ResourceManager to be used for Map/Reduce,
respectively. If you are using WebHDFS, the NameNode location should be
webhdfs://masternode:portnumber; if WebHDFS is secured with SSL, the scheme
should be swebhdfs and you need to use a tLibraryLoad component in the Job to
load the library required by the secured WebHDFS.
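The WebHDFS URI convention described above can be sketched in a few lines of Python. This is only an illustration of the scheme://host:port pattern, not part of the Talend Job. The helper names namenode_uri and check_webhdfs_uri are hypothetical, and the port numbers in the usage lines are example values; use the ones configured on your own cluster.

```python
from urllib.parse import urlparse


def namenode_uri(host: str, port: int, secured: bool = False) -> str:
    """Build a WebHDFS location; use the swebhdfs scheme when SSL is enabled."""
    scheme = "swebhdfs" if secured else "webhdfs"
    return f"{scheme}://{host}:{port}"


def check_webhdfs_uri(uri: str) -> bool:
    """Return True if the URI has a WebHDFS scheme, a host, and a numeric port."""
    parsed = urlparse(uri)
    if parsed.scheme not in ("webhdfs", "swebhdfs"):
        return False
    try:
        port = parsed.port  # raises ValueError when the port is not numeric
    except ValueError:
        return False
    return bool(parsed.hostname) and port is not None


# Example values only: "masternode" and both ports are placeholders.
plain = namenode_uri("masternode", 50070)
secure = namenode_uri("masternode", 50470, secured=True)
```

A check like this catches a placeholder left in the field, such as the literal text portnumber, before the Job is run.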
In the Input file URI field, enter the
location of the data to be read from HDFS. In this example, the location is
In the Field separator field, enter the