Configuring the connection to Hive - Cloud - 8.0

Hive

Version
Cloud
8.0
Language
English
Product
Talend Big Data
Talend Big Data Platform
Talend Data Fabric
Talend Data Integration
Talend Data Management Platform
Talend Data Services Platform
Talend ESB
Talend MDM Platform
Talend Real-Time Big Data Platform
Module
Talend Studio
Last publication date
2024-04-09

About this task

Configuring tHiveConnection

Procedure

  1. Double-click tHiveConnection to open its Component view.
  2. From the Property type list, select Built-in. If you have already created the connection in Repository, select Repository instead, click the button to open the Repository content dialog box, and select that connection. Talend Studio then reuses that connection information for this Job.
    For further information about how to create a Hadoop connection in Repository, see Centralizing Hadoop connections.
  3. In the Version area, select the Hadoop distribution to be used and its version. If your distribution is not in the list, select Custom to connect to a Hadoop distribution that is not officially supported in Talend Studio.
    For a step-by-step example of how to use this Custom option, see Connecting to a custom Hadoop distribution.
  4. In the Connection area, enter the connection parameters for the Hive database to be used.
  5. In the Name node field, enter the location of the master node, the NameNode, of the distribution to be used. For example, hdfs://talend-hdp-all:8020. If you are using WebHDFS, the location should be webhdfs://masternode:portnumber; WebHDFS with SSL is not supported yet.
  6. In the Job tracker field, enter the location of the JobTracker of your distribution. For example, talend-hdp-all:50300.
    Note that the word Job in the term JobTracker refers to the MR (MapReduce) jobs described in Apache's documentation at http://hadoop.apache.org/.
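
The Name node and Job tracker fields take differently shaped values, and they are easy to mix up. The following is a minimal sketch of a check you could run on the values before entering them. It assumes that the Name node location is a URI with an hdfs or webhdfs scheme, a host, and a port, and that the JobTracker location is a plain host:port pair. The function names check_name_node and check_job_tracker are hypothetical helpers and are not part of Talend Studio.

```python
from urllib.parse import urlsplit

# Schemes assumed acceptable for the Name node field (assumption for this sketch).
NAMENODE_SCHEMES = {"hdfs", "webhdfs"}


def check_name_node(value: str) -> bool:
    """Return True if value looks like scheme://host:port with an accepted scheme."""
    parts = urlsplit(value)
    try:
        port = parts.port  # raises ValueError if the port is not a valid number
    except ValueError:
        return False
    return (
        parts.scheme in NAMENODE_SCHEMES
        and bool(parts.hostname)
        and port is not None
    )


def check_job_tracker(value: str) -> bool:
    """Return True if value looks like host:port with no scheme or path."""
    host, sep, port = value.rpartition(":")
    return (
        sep == ":"
        and bool(host)
        and "/" not in value
        and port.isdecimal()
        and 0 < int(port) < 65536
    )


if __name__ == "__main__":
    # Hypothetical hosts and ports, used only to show the two value shapes.
    print(check_name_node("hdfs://talend-hdp-all:8020"))
    print(check_job_tracker("talend-hdp-all:50300"))
```

A check of this kind only confirms the shape of each value. It does not confirm that the host is reachable or that the port matches your cluster configuration.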