tHiveWarehouseConfiguration properties for Apache Spark Batch - Cloud - 8.0


These properties are used to configure tHiveWarehouseConfiguration running in the Spark Batch Job framework.

The Spark Batch tHiveWarehouseConfiguration component belongs to the Storage family.

The component in this framework is available in all subscription-based Talend products with Big Data and Talend Data Fabric.

Basic settings

Property Type

Select the way the connection details will be set.

  • Built-In: The connection details will be set locally for this component. You need to specify the values for all related connection properties manually.

  • Repository: The connection details stored centrally in Repository > Metadata will be reused by this component.

    Click the [...] button next to it and, in the Repository Content dialog box that opens, select the connection details to be reused. All related connection properties are then filled in automatically.

Distribution and Version

Select the Hadoop distribution you are using for Hive.

Select the version of the Hadoop distribution you are using.

Hive Server

Select the Hive server through which you want the Job using this component to execute queries on Hive.

Host

Enter the database server IP address.

Port

Enter the listening port number of the database server.

Database

Enter the name of the database.

Username and Password

Enter the user authentication data of the database.

To enter the password, click the [...] button next to the password field, and then in the pop-up dialog box enter the password between double quotes and click OK to save the settings.
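As an illustration of how the Host, Port, and Database values relate to one another, the following Java sketch assembles them into a HiveServer2-style JDBC URL of the form jdbc:hive2://host:port/database;key=value. The class name, method name, sample address, and the ssl=true setting are hypothetical, and the source does not state that the component builds its connection string in exactly this way.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class HiveJdbcUrlSketch {

    // Builds jdbc:hive2://host:port/database and appends each extra
    // setting as ;key=value, in insertion order.
    static String buildUrl(String host, int port, String database,
                           Map<String, String> extraSettings) {
        StringBuilder url = new StringBuilder("jdbc:hive2://")
                .append(host).append(':').append(port)
                .append('/').append(database);
        for (Map.Entry<String, String> setting : extraSettings.entrySet()) {
            url.append(';').append(setting.getKey())
               .append('=').append(setting.getValue());
        }
        return url.toString();
    }

    public static void main(String[] args) {
        // Hypothetical values, for illustration only.
        Map<String, String> extra = new LinkedHashMap<>();
        extra.put("ssl", "true");
        // prints jdbc:hive2://10.0.0.12:10000/default;ssl=true
        System.out.println(buildUrl("10.0.0.12", 10000, "default", extra));
    }
}
```

A LinkedHashMap is used so that the extra settings appear in the URL in the order in which they were added.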

Additional JDBC Settings

Specify additional connection properties for the database connection you are creating.

Use Kerberos authentication

If you are accessing a Hive Metastore running with Kerberos security, select this check box.

Then enter the Hive principal, which must already be defined in the hive-site.xml file of the cluster to be used.

Hive principal uses the value of hive.metastore.kerberos.principal. This is the service principal of the Hive Metastore.
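If you have a copy of the cluster's hive-site.xml file, the principal can be looked up programmatically. The following Java sketch parses Hadoop-style configuration XML with the standard JDK XML parser and returns the value of a named property. The class name, the method name, and the sample principal hive/_HOST@EXAMPLE.COM are hypothetical; use the value from your own cluster.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class HivePrincipalLookup {

    // Returns the <value> of the <property> whose <name> matches, or null
    // if no such property exists. Assumes every <property> element has
    // both a <name> and a <value> child.
    static String findProperty(String xml, String name) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(
                            xml.getBytes(StandardCharsets.UTF_8)));
            NodeList properties = doc.getElementsByTagName("property");
            for (int i = 0; i < properties.getLength(); i++) {
                Element property = (Element) properties.item(i);
                String key = property.getElementsByTagName("name")
                        .item(0).getTextContent().trim();
                if (key.equals(name)) {
                    return property.getElementsByTagName("value")
                            .item(0).getTextContent().trim();
                }
            }
            return null;
        } catch (Exception e) {
            throw new IllegalArgumentException("Cannot parse configuration XML", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical hive-site.xml content, for illustration only.
        String sample = "<configuration><property>"
                + "<name>hive.metastore.kerberos.principal</name>"
                + "<value>hive/_HOST@EXAMPLE.COM</value>"
                + "</property></configuration>";
        System.out.println(
                findProperty(sample, "hive.metastore.kerberos.principal"));
    }
}
```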

Use SSL encryption

Select this check box to enable the SSL or TLS encrypted connection.

Then in the fields that are displayed, provide the authentication information:

  • In the Trust store path field, enter the path to the TrustStore file to be used, or browse to it. By default, the supported TrustStore types are JKS and PKCS12.
  • To enter the password, click the [...] button next to the Trust store password field, and then in the pop-up dialog box enter the password between double quotes and click OK to save the settings.
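Before entering the TrustStore path and password, it can be useful to confirm that the file opens with the password you intend to use. The following Java sketch does this with the standard java.security.KeyStore API. The class and method names are hypothetical and are not part of Talend Studio.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.KeyStore;

class TrustStoreCheck {

    // Opens the TrustStore and returns the number of entries it holds.
    // The type argument is "JKS" or "PKCS12". A wrong password, wrong
    // type, or unreadable file results in an IllegalStateException.
    static int countEntries(Path path, char[] password, String type) {
        try (InputStream in = Files.newInputStream(path)) {
            KeyStore store = KeyStore.getInstance(type);
            store.load(in, password);
            return store.size();
        } catch (Exception e) {
            throw new IllegalStateException("Cannot open TrustStore " + path, e);
        }
    }

    public static void main(String[] args) {
        if (args.length < 3) {
            System.out.println("Usage: TrustStoreCheck <path> <password> <JKS|PKCS12>");
            return;
        }
        int entries = countEntries(Paths.get(args[0]), args[1].toCharArray(), args[2]);
        System.out.println("TrustStore opened, entries: " + entries);
    }
}
```

If the check throws an exception, the same path, password, or type is unlikely to work in the component settings either.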

Global Variables

ERROR_MESSAGE: the error message generated by the component when an error occurs. This is an After variable and it returns a string. This variable functions only if the component has a Die on error check box and that check box is cleared.

A Flow variable functions during the execution of a component while an After variable functions after the execution of the component.

To fill in a field or expression with a variable, press Ctrl + Space to access the variable list and choose the variable to use from it.

For further information about variables, see the Talend Studio User Guide.
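As a sketch of how an After variable such as ERROR_MESSAGE might be read in Java code within a Job, the following example looks the value up in a map by key. It assumes that the variable is stored under the key formed from the component label followed by _ERROR_MESSAGE, which is the form the Ctrl + Space variable list typically inserts. The component label tHiveWarehouseConfiguration_1, the class name, and the stand-in map are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

class ErrorMessageLookup {

    // Returns the error message recorded for the component, or null if
    // none was recorded. The key format is an assumption (see above).
    static String errorMessage(Map<String, Object> globalMap, String componentLabel) {
        return (String) globalMap.get(componentLabel + "_ERROR_MESSAGE");
    }

    public static void main(String[] args) {
        // Stand-in for the globalMap available to a running Job.
        Map<String, Object> globalMap = new HashMap<>();
        globalMap.put("tHiveWarehouseConfiguration_1_ERROR_MESSAGE",
                "Connection refused");
        // prints Connection refused
        System.out.println(errorMessage(globalMap, "tHiveWarehouseConfiguration_1"));
    }
}
```

Because ERROR_MESSAGE is an After variable, a lookup of this kind is meaningful only in a component that runs after tHiveWarehouseConfiguration has finished.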

Usage

Usage rule

This component is generally used with other Hive components, particularly tHiveWarehouseInput and tHiveWarehouseOutput.