Configuring Big Data run profiles - Cloud

Talend Cloud Management Console for Pipelines User Guide

author
Talend Documentation Team

Procedure

  1. In the Basic configuration section, enter the micro-batch interval, in milliseconds.
  2. Define the streaming timeout in milliseconds.
  3. Optional: Define the Yarn queue.
  4. Optional: Enter the number of driver cores to use for the driver process.
  5. Optional: Define the amount of memory to use for the driver process (where SparkContext is initialized), in megabytes.
  6. Define the path to a temporary storage directory on the local system in which to store temporary files, such as the JAR files to be transferred.
  7. Select the Yarn strategy from the drop-down list.
    • Dynamic: Dynamic resource allocation scales the number of registered executors up and down, based on the workload.
    • Fixed: You have a static number of executors, regardless of the workload.
  8. Optional: If you chose the dynamic mode, configure the dynamic allocation parameters.
    1. Define the initial number of executors.
    2. Define the upper bound for the number of executors.
    3. Define the lower bound for the number of executors.
  9. Optional: If you chose the fixed mode, configure the number of executors.
  10. Enter the number of cores to be used by each executor.
  11. Enter the memory size to be used by each Spark executor, in megabytes.
  12. Enter the amount of off-heap memory to be allocated per executor, in megabytes.
    This memory accounts for things like VM overheads, interned strings, and other native overheads. It tends to grow with the executor size (typically 6-10% of the executor memory).
  13. Optional: Enable Checkpointing to help Spark Streaming checkpoint enough information to a fault-tolerant storage system so that it can recover from failures.
  14. Optional: Enter the path to the checkpoint file.
  15. Optional: In the Advanced configuration section, click ADD PARAMETER to create a parameter.
  16. Optional: Enter the parameter key and value for each new parameter.
    This step is mandatory if you have enabled checkpointing.

    Example

    To set the amount of memory to use per executor process, enter spark.executor.memory in the parameter key field and 16g in the value field.
  17. Click SAVE.
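
Most of the settings in this procedure have close counterparts among the standard Apache Spark configuration properties. The Python sketch below is illustrative only: it shows how the values of a run profile could be expressed as Spark property key/value pairs, in the same key/value form used in the Advanced configuration section. The RunProfile class, its field names, and the two helper functions are hypothetical and are not part of Talend Management Console. The mapping to property names is an assumption about how such values are commonly passed to Spark on YARN, not a description of what Talend does internally. The micro-batch interval, streaming timeout, and checkpoint path are left out because they are not plain Spark properties.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class RunProfile:
    """Hypothetical container for the run profile fields (sizes in megabytes)."""
    executor_cores: int
    executor_memory_mb: int
    executor_overhead_mb: int
    yarn_strategy: str = "fixed"              # "fixed" or "dynamic"
    num_executors: Optional[int] = None       # used in fixed mode
    initial_executors: Optional[int] = None   # used in dynamic mode
    min_executors: Optional[int] = None       # lower bound, dynamic mode
    max_executors: Optional[int] = None       # upper bound, dynamic mode
    yarn_queue: Optional[str] = None
    driver_cores: Optional[int] = None
    driver_memory_mb: Optional[int] = None


def suggest_overhead_mb(executor_memory_mb: int, percent: int = 10) -> int:
    """Estimate the off-heap overhead as a percentage of the executor memory."""
    return executor_memory_mb * percent // 100


def to_spark_properties(profile: RunProfile) -> Dict[str, str]:
    """Translate a profile into Spark property key/value pairs (assumed mapping)."""
    props = {
        "spark.executor.cores": str(profile.executor_cores),
        "spark.executor.memory": f"{profile.executor_memory_mb}m",
        "spark.executor.memoryOverhead": f"{profile.executor_overhead_mb}m",
    }
    # Optional settings are only emitted when a value was provided.
    if profile.yarn_queue is not None:
        props["spark.yarn.queue"] = profile.yarn_queue
    if profile.driver_cores is not None:
        props["spark.driver.cores"] = str(profile.driver_cores)
    if profile.driver_memory_mb is not None:
        props["spark.driver.memory"] = f"{profile.driver_memory_mb}m"

    if profile.yarn_strategy == "dynamic":
        props["spark.dynamicAllocation.enabled"] = "true"
        bounds = (
            ("initialExecutors", profile.initial_executors),
            ("minExecutors", profile.min_executors),
            ("maxExecutors", profile.max_executors),
        )
        for name, value in bounds:
            if value is not None:
                props[f"spark.dynamicAllocation.{name}"] = str(value)
    elif profile.yarn_strategy == "fixed":
        if profile.num_executors is not None:
            props["spark.executor.instances"] = str(profile.num_executors)
    else:
        raise ValueError(f"Unknown Yarn strategy: {profile.yarn_strategy!r}")
    return props


if __name__ == "__main__":
    example = RunProfile(
        executor_cores=2,
        executor_memory_mb=4096,
        executor_overhead_mb=suggest_overhead_mb(4096),
        yarn_strategy="dynamic",
        initial_executors=2,
        min_executors=1,
        max_executors=8,
    )
    for key, value in sorted(to_spark_properties(example).items()):
        print(f"{key}={value}")
```

Keeping the dynamic and fixed branches mutually exclusive mirrors the choice made in the Yarn strategy drop-down list: a profile carries either the executor bounds or a single executor count, never both.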