
Configuring tHiveLoad

Procedure

  1. Double-click tHiveLoad to open its Component view.
  2. Select the Use an existing connection check box and, from the Component List, select the connection configured in the tHiveConnection component you are using for this Job.
  3. From the Load action list, select LOAD to load data from the file holding the sample data presented at the beginning of this scenario.
  4. In the File path field, enter the directory where the sample data is stored. In this example, the data is stored in the HDFS system to be used. In real-world practice, you could use tHDFSOutput to write the data into the HDFS system; in that case, ensure that the Hive application has the appropriate rights and permissions to read, or even move, the data.

    For further information about the related rights and permissions, see the documentation or contact the administrator of the Hadoop cluster to be used.

    Note: if you need to read data from a local file system rather than from HDFS, make sure the data to be read is stored on the local file system of the machine where the Job is run, and then select the Local check box in this Basic settings view. For example, when the connection mode to Hive is Standalone, the Job runs on the machine where the Hive application is installed, so the data must be stored on that machine.

  5. In the Table name field, enter the name of the target table into which you need to load the data. In this scenario, it is employees.
  6. From the Action on file list, select APPEND.
  7. Select the Set partitions check box and in the field that appears, enter the partition you need to add data to. In this scenario, this partition is country='US'.
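Under the hood, a configuration like the one above corresponds to a HiveQL LOAD DATA statement. The sketch below is illustrative only: the HDFS path is a hypothetical placeholder (the scenario does not specify one), while the table name and partition come from the steps above. The APPEND file action maps to INTO TABLE (as opposed to OVERWRITE), and selecting the Local check box would add the LOCAL keyword.

```sql
-- Hypothetical path; substitute the HDFS directory holding your sample data.
LOAD DATA INPATH '/user/hive/sample_data'
INTO TABLE employees
PARTITION (country='US');

-- With the Local check box selected, the statement would instead read
-- from the local file system of the machine running the Job:
-- LOAD DATA LOCAL INPATH '/tmp/sample_data' INTO TABLE employees
-- PARTITION (country='US');
```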
