Hortonworks ships a Spark-specific hive-site.xml file to
resolve this Hive-on-Tez issue. You can use this file to define the connection to your
Hortonworks cluster in Talend Studio.
This file is stored in the Spark configuration folder of your Hortonworks cluster: /etc/spark/conf.
Procedure
-
Obtain this Spark-specific Hive configuration file from the administrator of
your cluster.
-
Download the regular Hive configuration files from your cluster, for example,
using Ambari.
-
Among the downloaded files, replace the hive-site.xml file that comes from
/etc/hive/conf with the Spark-specific hive-site.xml file that comes from
/etc/spark/conf.
-
Define the Hadoop connection to your Hortonworks cluster in the Repository if you have not done so.
-
Right-click this connection and from the contextual menu, select Edit Hadoop cluster to open the Hadoop cluster connection wizard.
-
Click Next to open the second step of this wizard and select the Use custom Hadoop configurations check box.
-
Click the [...] button next to Use custom
Hadoop configurations to open the Hadoop configuration
import wizard.
-
Select the Hortonworks version you are using and then select the
Import configuration from local files radio button.
-
Click Next and click Browse... to
locate the Hive configuration files in which you replaced
hive-site.xml with the Spark-specific version in an earlier step.
-
Click Finish to complete the import and return to the
Hadoop cluster connection wizard.
-
Click Finish to validate the changes and, in the pop-up
dialog box, click Yes to accept the propagation. The
wizard closes and the Spark-specific Hive configuration file is now
used with this Hadoop connection.
This new configuration is effective only for the Jobs that use this connection.
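The file replacement described early in the procedure can also be scripted once both sets of files are on your local machine. The following is a minimal sketch, not part of Talend Studio or Hortonworks tooling: the function name replace_hive_site and the .orig backup suffix are assumptions made for illustration, and the script assumes you have already downloaded the regular Hive configuration files and obtained the Spark-specific hive-site.xml file.

```python
import shutil
from pathlib import Path


def replace_hive_site(hive_conf_dir: Path, spark_hive_site: Path) -> Path:
    """Overwrite hive-site.xml in a local folder with the Spark-specific copy.

    hive_conf_dir   -- local folder holding the downloaded Hive configuration files
    spark_hive_site -- local copy of the Spark-specific hive-site.xml file

    Hypothetical helper: the existing hive-site.xml is first saved as
    hive-site.xml.orig so that the swap can be undone.
    """
    if not spark_hive_site.is_file():
        raise FileNotFoundError(f"Spark-specific file not found: {spark_hive_site}")

    target = hive_conf_dir / "hive-site.xml"
    if target.exists():
        # Keep the regular Hive version next to the replaced file.
        shutil.copy2(target, target.with_name(target.name + ".orig"))

    # Put the Spark-specific file in place under the expected name.
    shutil.copy2(spark_hive_site, target)
    return target
```

Call the function with the two local paths, for example replace_hive_site(Path("hive_conf"), Path("spark_conf/hive-site.xml")), where both paths are placeholders for wherever you saved the files. The resulting folder is the one to browse to in the Hadoop configuration import wizard.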