Before you begin
Ensure that the client machine on which the Talend Jobs are executed can recognize the host names of the nodes of the Hadoop cluster to be used. For this purpose, add the IP address/hostname mapping entries for the services of that Hadoop cluster in the hosts file of the client machine.
For example, if the host name of the Hadoop Namenode server is talend-cdh550.weave.local, and its IP address is 192.168.x.x, the mapping entry reads 192.168.x.x talend-cdh550.weave.local.
Ensure that the Hadoop cluster to be used has been properly configured and is running.
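The host name prerequisite above can be checked before running the Job. The following is a minimal sketch, not part of the Studio: it parses the text of a hosts file and reports which required host names have no mapping entry. The function names `parse_hosts` and `missing_hosts` are hypothetical, and the sample content reuses the example entry from this section with its IP address left elided.

```python
def parse_hosts(text):
    """Return a dict mapping each host name to the address on its line."""
    mapping = {}
    for line in text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            # Keep the first mapping found for a given name.
            mapping.setdefault(name.lower(), address)
    return mapping


def missing_hosts(text, required):
    """Return the required host names that have no entry in the hosts text."""
    mapping = parse_hosts(text)
    return [name for name in required if name.lower() not in mapping]


# Sample hosts content; the address is elided exactly as in the example above.
sample = """
127.0.0.1 localhost
192.168.x.x talend-cdh550.weave.local  # Hadoop Namenode
"""

# An empty list means every required host name has a mapping entry.
print(missing_hosts(sample, ["talend-cdh550.weave.local"]))
```

To check a real client machine, pass the content of its hosts file instead of `sample`, for example `/etc/hosts` on Linux or `C:\Windows\System32\drivers\etc\hosts` on Windows.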
- Double-click the tPigStoreResult that receives the out1 link.
Its Basic settings view is opened in the lower part of the Studio.
- In the Result file field, enter the directory you need to write the result in. In this scenario, it is /user/ychen/output_data/out, which receives the records that contain the names of the movie directors.
- Select the Remove result directory if exists check box.
- In the Store function list, select PigStorage to write the records in human-readable UTF-8 format.
- In the Field separator field, enter ; within double quotation marks.
- Repeat the same operations to configure the tPigStoreResult that receives the reject link, but in the Result file field, set the directory to /user/ychen/output_data/reject.
- Press F6 to run the Job.
The Run view is automatically opened in the lower part of the Studio and shows the execution progress of this Job.
Once the execution is complete, you can check, for example in the web console of your HDFS system, that the output has been written to HDFS.
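Because the results are stored with PigStorage and a ; field separator, each record is one line of UTF-8 text with its fields joined by semicolons. As a minimal sketch of how such output could be inspected outside the Studio, the following code splits the content of a result file into records. It assumes that the file has already been copied from the result directory to the local machine and that no field value contains the separator. The function names and the sample records are hypothetical and are not taken from this scenario's data.

```python
def parse_pigstorage_line(line, separator=";"):
    """Split one line written by PigStorage into its field values."""
    return line.rstrip("\r\n").split(separator)


def read_records(raw_bytes, separator=";"):
    """Decode the UTF-8 bytes of a result file and return a list of records."""
    text = raw_bytes.decode("utf-8")
    return [
        parse_pigstorage_line(line, separator)
        for line in text.splitlines()
        if line  # skip empty lines
    ]


# Hypothetical sample content standing in for a downloaded result file.
sample_bytes = "1;Jean Renoir\n2;François Truffaut\n".encode("utf-8")

for record in read_records(sample_bytes):
    print(record)
```

In practice, the bytes would come from reading a downloaded file in binary mode, for example `open(path, "rb").read()`, where `path` points to a file from /user/ychen/output_data/out or /user/ychen/output_data/reject.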