Resolve the hdp.version variable issue for Spark Jobs - 6.4

The HDP version variable issue in MapReduce Jobs and Spark Jobs

author
Talend Documentation Team
EnrichVersion
6.4
EnrichProdName
Talend Big Data
Talend Big Data Platform
Talend Data Fabric
Talend Real-Time Big Data Platform
task
Design and Development > Designing Jobs > Hadoop distributions > Hortonworks
Design and Development > Designing Jobs > Job Frameworks > MapReduce
Design and Development > Designing Jobs > Job Frameworks > Spark Batch
Design and Development > Designing Jobs > Job Frameworks > Spark Streaming
EnrichPlatform
Talend Studio

Procedure

  1. Define the hdp.version parameter in your cluster.

    The easiest way to do this is to add the parameter to the yarn-site.xml configuration file.

    1. In Ambari, click the YARN service in the service list on the left, click Configs to open the configuration page, then click the Advanced tab.
    2. Scroll down to the Custom yarn-site list at the end of the page and click Custom yarn-site to expand this list.
    3. Click Add property to open the [Add property] dialog box.
    4. Enter hdp.version=2.6.0.3-8, where 2.6.0.3-8 is the version number you found by following Find the hdp.version value to be used, and click Add to validate the change. The hdp.version parameter appears in the Custom yarn-site parameter list.
    5. Click Save to validate the new configuration, then restart the services so that the hdp.version parameter takes effect in the yarn-site.xml file.
  2. In the Studio, open the Spark Job to be used and click the Run tab to open its view.
  3. Click Spark configuration, then in the view, select the Set hdp.version check box and enter, within double quotation marks, the same version number you entered in the cluster. In this example, it is "2.6.0.3-8".

    This procedure explains only the actions required to solve the HDP version issue for a Spark Job. You need to properly configure the other parts of your Job before you can run it successfully.
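
After the services restart, it can be useful to confirm that the hdp.version parameter actually reached yarn-site.xml before running the Job. The following is a minimal sketch, not part of the Talend or Ambari tooling: it assumes the standard Hadoop configuration layout (property elements with name and value children inside a configuration root), and the find_property helper and the sample content are hypothetical names introduced only for illustration.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def find_property(xml_text: str, name: str) -> Optional[str]:
    """Return the value of a Hadoop-style property, or None if it is absent.

    Hypothetical helper: it expects the usual
    <configuration><property><name/><value/></property></configuration> layout.
    """
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None


# Sample content standing in for the yarn-site.xml file written by Ambari.
# The version number is the one used in this example procedure.
SAMPLE_YARN_SITE = """<configuration>
  <property>
    <name>hdp.version</name>
    <value>2.6.0.3-8</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    version = find_property(SAMPLE_YARN_SITE, "hdp.version")
    if version is None:
        print("hdp.version is not defined in yarn-site.xml")
    else:
        print("hdp.version =", version)
```

To check a real cluster, read the contents of the yarn-site.xml file on a cluster node and pass them to find_property in place of the sample string. The location of that file depends on your installation. The value returned should match the one entered in the Set hdp.version field of the Spark configuration view.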