tHiveWarehouseConfiguration properties for Apache Spark Batch

These properties are used to configure tHiveWarehouseConfiguration running in the Spark Batch Job framework.

The Spark Batch tHiveWarehouseConfiguration component belongs to the Storage family.

The component in this framework is available in all subscription-based Talend products with Big Data and Talend Data Fabric.

Basic settings

Property Type

Select the way the connection details will be set.

  • Built-In: The connection details will be set locally for this component. You need to specify the values for all related connection properties manually.

  • Repository: The connection details stored centrally in Repository > Metadata will be reused by this component.

    Click the [...] button next to it and, in the Repository Content dialog box that opens, select the connection details to be reused. All related connection properties are then filled in automatically.

Distribution and Version

Select the Hadoop distribution you are using for Hive.

Select the version of the Hadoop distribution you are using. Hive Warehouse components are only supported with CDP distributions.

Hive Server

Select the Hive server through which you want the Job using this component to execute queries on Hive.

Host

Enter the database server IP address.

Port

Enter the listening port number of the database server.

Database

Enter the name of the database.

Username and Password

Enter the user authentication data of the database.

To enter the password, click the [...] button next to the password field, enter the password in double quotes in the pop-up dialog box, and click OK to save the settings.

Additional JDBC Settings

Specify additional connection properties for the database connection you are creating.

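As an illustration of how the Host, Port, Database, and Additional JDBC Settings values relate to one another, the following minimal sketch assembles them into a HiveServer2-style JDBC URL. It assumes the additional settings are appended as semicolon-separated key=value pairs. The class name HiveJdbcUrlSketch, the method buildUrl, and the sample host and setting are hypothetical; this is not the code the component generates.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: HiveJdbcUrlSketch and buildUrl are hypothetical names,
// not part of Talend or of the Hive JDBC driver.
class HiveJdbcUrlSketch {

    // Joins the connection fields into a URL of the form
    // jdbc:hive2://<host>:<port>/<database>;key=value;...
    static String buildUrl(String host, int port, String database,
                           Map<String, String> additionalSettings) {
        StringBuilder url = new StringBuilder("jdbc:hive2://")
                .append(host).append(':').append(port)
                .append('/').append(database);
        // Assumption: each additional setting is appended as ";key=value".
        for (Map.Entry<String, String> setting : additionalSettings.entrySet()) {
            url.append(';').append(setting.getKey())
               .append('=').append(setting.getValue());
        }
        return url.toString();
    }

    public static void main(String[] args) {
        Map<String, String> extra = new LinkedHashMap<>();
        extra.put("ssl", "true"); // placeholder setting
        System.out.println(buildUrl("hive.example.com", 10000, "default", extra));
    }
}
```

With the placeholder values above, the sketch prints jdbc:hive2://hive.example.com:10000/default;ssl=true.
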
Use Kerberos authentication

If you are accessing a Hive metastore running with Kerberos security, select this check box.

Then enter the Hive principal, which should already be defined in the hive-site.xml file of the cluster to be used.

Hive principal uses the value of hive.metastore.kerberos.principal. This is the service principal of the Hive metastore.
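To find the value to enter, you can look up hive.metastore.kerberos.principal in the hive-site.xml file of the cluster. The following minimal sketch reads that property from a Hadoop-style configuration document using only the JDK XML parser. The class name HivePrincipalLookupSketch, the method findProperty, and the sample principal hive/_HOST@EXAMPLE.COM are hypothetical placeholders; use the value from your own cluster.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Illustrative only: a hypothetical helper, not part of Talend or Hive.
class HivePrincipalLookupSketch {

    // Returns the <value> of the <property> whose <name> matches, or null.
    // Assumes every <property> element has one <name> and one <value> child.
    static String findProperty(String hiveSiteXml, String propertyName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(
                            hiveSiteXml.getBytes(StandardCharsets.UTF_8)));
            NodeList properties = doc.getElementsByTagName("property");
            for (int i = 0; i < properties.getLength(); i++) {
                Element property = (Element) properties.item(i);
                String name = property.getElementsByTagName("name")
                        .item(0).getTextContent().trim();
                if (name.equals(propertyName)) {
                    return property.getElementsByTagName("value")
                            .item(0).getTextContent().trim();
                }
            }
            return null;
        } catch (Exception e) {
            throw new IllegalArgumentException("Cannot parse configuration XML", e);
        }
    }

    public static void main(String[] args) {
        // Placeholder content standing in for a real hive-site.xml file.
        String sample = "<configuration>"
                + "<property>"
                + "<name>hive.metastore.kerberos.principal</name>"
                + "<value>hive/_HOST@EXAMPLE.COM</value>"
                + "</property>"
                + "</configuration>";
        System.out.println(findProperty(sample, "hive.metastore.kerberos.principal"));
    }
}
```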

Use SSL encryption

Select this check box to enable the SSL or TLS encrypted connection.

Then in the fields that are displayed, provide the authentication information:
  • In the Trust store path field, enter the path, or browse to the TrustStore file to be used. By default, the supported TrustStore types are JKS and PKCS 12.
  • To enter the password, click the [...] button next to the Trust store password field, and then in the pop-up dialog box enter the password between double quotes and click OK to save the settings.
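Before entering the Trust store path and password, it can help to confirm that the file opens with that password. The following minimal sketch loads a TrustStore with the standard java.security.KeyStore API, using the two types named above. The class name TrustStoreSketch, the helpers guessType and load, the extension-based type guess, and the password changeit are assumptions made for illustration; they do not describe how the component itself reads the file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.security.KeyStore;
import java.util.Locale;

// Illustrative only: hypothetical helpers, not part of Talend.
class TrustStoreSketch {

    // Assumption: .p12 and .pfx files are PKCS12; anything else is treated as JKS.
    static String guessType(String path) {
        String lower = path.toLowerCase(Locale.ROOT);
        return (lower.endsWith(".p12") || lower.endsWith(".pfx")) ? "PKCS12" : "JKS";
    }

    // Loads a store of the given type; fails if the data or password is wrong.
    static KeyStore load(InputStream in, String type, char[] password) {
        try {
            KeyStore store = KeyStore.getInstance(type);
            store.load(in, password);
            return store;
        } catch (Exception e) {
            throw new IllegalStateException("Cannot load " + type + " trust store", e);
        }
    }

    public static void main(String[] args) throws Exception {
        char[] password = "changeit".toCharArray(); // placeholder password

        // Build an empty in-memory PKCS12 store so the sketch needs no file on disk.
        KeyStore empty = KeyStore.getInstance("PKCS12");
        empty.load(null, password);
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        empty.store(bytes, password);

        KeyStore reloaded = load(new ByteArrayInputStream(bytes.toByteArray()),
                guessType("truststore.p12"), password);
        System.out.println(reloaded.getType() + " entries: " + reloaded.size());
    }
}
```

To check a real file, pass a java.io.FileInputStream opened on the TrustStore path instead of the in-memory stream.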

Global Variables

ERROR_MESSAGE: the error message generated by the component when an error occurs. This is an After variable and it returns a string. This variable functions only if the component has a Die on error check box and that check box is cleared.

A Flow variable functions during the execution of a component while an After variable functions after the execution of the component.

To fill in a field or expression with a variable, press Ctrl+Space to access the variable list and choose the variable to use from it.
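The entry chosen from the variable list is typically a lookup in the Job's globalMap, keyed by the component name followed by the variable name. The following minimal sketch imitates that lookup with a plain HashMap. The component name tHiveWarehouseConfiguration_1, the key format, and the sample message are assumptions; check the expression that the variable list actually inserts in your Job.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: GlobalVariableSketch is a hypothetical stand-in
// for the globalMap available in generated Job code.
class GlobalVariableSketch {

    // Assumption: the key is "<component name>_ERROR_MESSAGE".
    static String errorMessageOf(Map<String, Object> globalMap, String componentName) {
        return (String) globalMap.get(componentName + "_ERROR_MESSAGE");
    }

    public static void main(String[] args) {
        Map<String, Object> globalMap = new HashMap<>();
        // Simulate what would be stored after the component fails.
        globalMap.put("tHiveWarehouseConfiguration_1_ERROR_MESSAGE", "Connection refused");

        String message = errorMessageOf(globalMap, "tHiveWarehouseConfiguration_1");
        System.out.println(message != null ? message : "No error recorded");
    }
}
```

Because ERROR_MESSAGE is an After variable, a lookup of this kind only returns a value once the component has finished executing.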

For more information about variables, see Using contexts and variables.

Usage

Usage rule

This component is generally used with other Hive components, particularly tHiveWarehouseInput and tHiveWarehouseOutput.
