It is recommended to enable Spark logging and checkpointing in the Spark configuration tab of the Run view of your Spark Job, to help you debug and resume the Job when issues arise.
If you need the Job to be resilient to failure, select the Activate checkpointing check box to enable the
Spark checkpointing operation. In the field that is displayed, enter the
directory in which Spark stores, in the file system of the cluster, the context
data of the computations, such as the metadata and the generated RDDs of this
Job.
For further information about the Spark checkpointing operation, see http://spark.apache.org/docs/latest/streaming-programming-guide.html#checkpointing.
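As an illustration only, outside the Studio the same checkpointing operation amounts to pointing a Spark Streaming context at a durable directory. This is a minimal sketch in Scala; the application name, batch interval, and HDFS path are placeholders, not values generated by the Job:

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Placeholder configuration: adapt the application name and the
// batch interval to your own Job.
val conf = new SparkConf().setAppName("CheckpointedJob")
val ssc = new StreamingContext(conf, Seconds(10))

// Store the context data of the computations (metadata and generated
// RDDs) in a reliable file system so the Job can be resumed after a
// failure. The path below is a placeholder.
ssc.checkpoint("hdfs:///user/example/checkpoints")
```

The checkpoint directory must reside on a fault-tolerant file system such as HDFS, which is why the check box asks for a directory in the file system of the cluster rather than a local path.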
In the Yarn client mode, you can
enable the Spark application logs of this Job to be persisted in the file
system. To do this, select the Enable Spark event
logging check box.
The parameters relevant to Spark logs are displayed:
Spark event logs directory: enter the directory in which Spark events are logged. This is actually the spark.eventLog.dir property.
Spark history server address: enter the location of the history server. This is actually the spark.yarn.historyServer.address property.
Compress Spark event logs: if need be, select this check box to compress the logs. This is actually the spark.eventLog.compress property.
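For reference, these are the same properties you would set in a Spark configuration file such as spark-defaults.conf; the directory and history server address below are placeholders to be replaced with the values used on your cluster:

```
spark.eventLog.enabled            true
spark.eventLog.dir                hdfs:///user/spark/applicationHistory
spark.yarn.historyServer.address  historyserver.example.com:18080
spark.eventLog.compress           true
```
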
Since the administrator of your cluster may have already defined these properties in the cluster configuration files, it is recommended to contact the administrator for the exact values to use.