Set up the hdp.version parameter to resolve the Hortonworks version issue
The Studio retrieves the Hortonworks configuration files, along with the hdp.version variable they reference, from a Hortonworks cluster. When you define the Hadoop connection in the Studio, the Studio generates a hadoop-conf-xxx.jar file from these configuration files and adds this JAR file, and with it the hdp.version variable, to the classpath of your Job. If the variable is left without a value, this may lead to the following known issues:
[ERROR]: org.apache.spark.SparkContext - Error initializing SparkContext.
org.apache.spark.SparkException: hdp.version is not found, Please set HDP_VERSION=xxx in spark-env.sh, or set -Dhdp.version=xxx in spark.{driver|yarn.am}.extraJavaOptions or set SPARK_JAVA_OPTS="-Dhdp.verion=xxx" in spark-env.sh If you're running Spark under HDP.
    at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:999)
    at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:171)
    at org.apache.spark.scheduler.cluster.YarnClientSchedulerBackend.start(YarnClientSchedulerBackend.scala:56)
    at org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:156)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:509)
    at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:58)
    at dev_v6_001.test_hdp_1057_0_1.test_hdp_1057.runJobInTOS(test_hdp_1057.java:1454)
    at dev_v6_001.test_hdp_1057_0_1.test_hdp_1057.main(test_hdp_1057.java:1341)
org.apache.spark.SparkException: Yarn application has already ended! It might have been killed or unable to launch application master.
Diagnostics: Exception from container-launch.
Container id: container_1496650120478_0011_02_000001
Exit code: 1
Exception message: /hadoop/yarn/local/usercache/abbass/appcache/application_1496650120478_0011/container_1496650120478_0011_02_000001/launch_container.sh: line 21: $PWD:$PWD/__spark_conf__:$PWD/__spark_libs__/*:$HADOOP_CONF_DIR:/usr/hdp/current/hadoop-client/*:/usr/hdp/current/hadoop-client/lib/*:/usr/hdp/current/hadoop-hdfs-client/*:/usr/hdp/current/hadoop-hdfs-client/lib/*:/usr/hdp/current/hadoop-mapreduce-client/*:/usr/hdp/current/hadoop-mapreduce-client/lib/*:/usr/hdp/current/hadoop-yarn-client/*:/usr/hdp/current/hadoop-yarn-client/lib/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:$PWD/mr-framework/hadoop/share/hadoop/tools/lib/*:/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure: bad substitution
java.lang.IllegalArgumentException: Unable to parse '/hdp/apps/${hdp.version}/mapreduce/mapreduce.tar.gz#mr-framework' as a URI, check the setting for mapreduce.application.framework.path
    at org.apache.hadoop.mapreduce.JobSubmitter.addMRFrameworkToDistributedCache(JobSubmitter.java:443)
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:142)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1290)
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1287)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1287)
    at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:575)
    at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:570)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:570)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:561)
    at org.talend.hadoop.mapred.lib.MRJobClient.runJob(MRJobClient.java:46)
    at dev_v6_001.test_mr_hdp26_0_1.test_mr_hdp26.runMRJob(test_mr_hdp26.java:1556)
    at dev_v6_001.test_mr_hdp26_0_1.test_mr_hdp26.access$2(test_mr_hdp26.java:1546)
    at dev_v6_001.test_mr_hdp26_0_1.test_mr_hdp26$1.run(test_mr_hdp26.java:1194)
    at dev_v6_001.test_mr_hdp26_0_1.test_mr_hdp26$1.run(test_mr_hdp26.java:1)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
    at dev_v6_001.test_mr_hdp26_0_1.test_mr_hdp26.tRowGenerator_1Process(test_mr_hdp26.java:1044)
    at dev_v6_001.test_mr_hdp26_0_1.test_mr_hdp26.run(test_mr_hdp26.java:1524)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
    at dev_v6_001.test_mr_hdp26_0_1.test_mr_hdp26.runJobInTOS(test_mr_hdp26.java:1483)
    at dev_v6_001.test_mr_hdp26_0_1.test_mr_hdp26.main(test_mr_hdp26.java:1431)
Caused by: java.net.URISyntaxException: Illegal character in path at index 11: /hdp/apps/${hdp.version}/mapreduce/mapreduce.tar.gz#mr-framework
    at java.net.URI$Parser.fail(URI.java:2848)
    at java.net.URI$Parser.checkChars(URI.java:3021)
    at java.net.URI$Parser.parseHierarchical(URI.java:3105)
    at java.net.URI$Parser.parse(URI.java:3063)
    at java.net.URI.<init>(URI.java:588)
    at org.apache.hadoop.mapreduce.JobSubmitter.addMRFrameworkToDistributedCache(JobSubmitter.java:441)
    ... 27 more
These error messages can appear together or separately.
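What these errors have in common is a literal ${hdp.version} placeholder that reaches the cluster without being replaced by a value. The following Python sketch is only an illustration of that mechanism, not the Studio's or Hadoop's own substitution logic: the helper names unresolved_placeholders and resolve are hypothetical, and the version value is the one used in the example later in this article.

```python
import re

# Matches ${name} placeholders such as ${hdp.version}.
PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def unresolved_placeholders(value):
    """Return the names of the ${...} placeholders still present in value."""
    return PLACEHOLDER.findall(value)


def resolve(value, properties):
    """Replace each ${name} with properties[name]; unknown names are left as-is."""
    return PLACEHOLDER.sub(
        lambda m: properties.get(m.group(1), m.group(0)), value
    )


# Path taken from the MapReduce error message above.
path = "/hdp/apps/${hdp.version}/mapreduce/mapreduce.tar.gz#mr-framework"

print(unresolved_placeholders(path))  # ['hdp.version']
print(path.index("{"))                # 11, the index reported by URISyntaxException
print(resolve(path, {"hdp.version": "2.6.0.3-8"}))
```

Once the placeholder is given a value, the path no longer contains the brace that the URI parser rejects at index 11.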
Environment:
Subscription-based Talend Studio solution with Big Data
Spark or MapReduce Jobs
Find the hdp.version value to be used
Procedure
- Ask the administrator of your cluster which version to use.
- Alternatively, check the /usr/hdp/ directory on each machine of your cluster. This folder usually contains several versions, and a symbolic link named current points to the version you should use. For example:
  [root@sandbox /]# ls -lth /usr/hdp/
  total 16K
  drwxr-xr-x 50 root root 4.0K Jun 5 07:59 2.6.0.3-8
  drwxr-xr-x 32 root root 4.0K May 5 13:19 2.5.0.0-1245
  drwxr-xr-x 3 root root 4.0K May 5 13:18 share
  drwxr-xr-x 2 root root 4.0K May 5 12:48 current -> 2.6.0.3-8
In this example, the version to be used is hdp.version=2.6.0.3-8.
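If you need to repeat this check on several machines, it can be scripted. The following Python sketch is a minimal illustration under stated assumptions: the function names are hypothetical, the regular expression assumes version directories named like 2.6.0.3-8 (four dotted numbers, a dash, a build number), and current is assumed to be a symbolic link as in the listing above. If your cluster lays out /usr/hdp/ differently, confirm the value with your administrator.

```python
import os
import re

# Assumption: version directories look like 2.6.0.3-8 or 2.5.0.0-1245.
VERSION_DIR = re.compile(r"^\d+\.\d+\.\d+\.\d+-\d+$")


def installed_versions(hdp_root="/usr/hdp"):
    """List the version-like directories under hdp_root, oldest first."""
    names = [
        name for name in os.listdir(hdp_root)
        if VERSION_DIR.match(name)
        and os.path.isdir(os.path.join(hdp_root, name))
    ]
    # Sort numerically so that, for example, 2.10.x would sort after 2.6.x.
    return sorted(names, key=lambda n: tuple(int(p) for p in re.split(r"[.-]", n)))


def current_version(hdp_root="/usr/hdp"):
    """Return the version the 'current' symlink points to, or None if it is not a symlink."""
    link = os.path.join(hdp_root, "current")
    if os.path.islink(link):
        return os.path.basename(os.path.realpath(link))
    return None


if __name__ == "__main__":
    root = "/usr/hdp"
    if os.path.isdir(root):
        print("installed:", installed_versions(root))
        print("hdp.version candidate:", current_version(root))
    else:
        print(root, "not found; run this on a cluster node")
```

The script only reports a candidate value; it does not change any configuration.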
Results
For Spark Jobs, see Resolve the hdp.version variable issue for Spark Jobs.
For MapReduce Jobs, see Resolve the hdp.version variable issue for MapReduce Jobs.