The Remote Engine Gen2 is configured to run with 8GB of memory allocated to it. This limits the number of pipeline and preparation executions that can run concurrently on the engine.
If the engine receives more execution requests than it can handle, it accepts as many pipeline executions as its memory allows and rejects the remaining ones. In that case, you get the following error:
Cannot submit pipeline <PIPELINE_NAME>, too many Livy sessions are used.
where <PIPELINE_NAME> is the name you gave to your pipeline.
For safety reasons, the number of concurrent pipeline executions is limited, but you can configure this limit for the Remote Engine Gen2.
Configure the number of allowed concurrent executions
To do so, edit the following file:
- <engine_directory>/default/.env if you are using the engine in the AWS USA, AWS Europe, AWS Asia-Pacific, or Azure regions.
- <engine_directory>/eap/.env if you are using the engine as part of the Early Adopter Program.
Then check the line that sets the LIVY_SERVER_SESSION_MAX_CREATION parameter.
Depending on the resources available on the machine running your engine, you might want to change this value. The value follows the formula below, which ensures that only part of the memory is dedicated to running pipelines, the rest remaining available for the other services of the engine:
LIVY_SERVER_SESSION_MAX_CREATION=(memory - 4)/spark.driver.memory
where memory corresponds to the memory allocated to the engine, 4 corresponds to the 4GB of memory required by the other services of the engine, and spark.driver.memory corresponds to the memory allocated to each pipeline execution (1GB by default).
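The formula above can be sketched as a small calculation. This is only an illustration of the arithmetic described in this section; the function name and parameters are not part of the engine and are used here for clarity only:

```python
def max_livy_sessions(engine_memory_gb: int, spark_driver_memory_gb: int = 1) -> int:
    """Illustrative calculation of LIVY_SERVER_SESSION_MAX_CREATION.

    4GB is reserved for the other services of the engine; the remaining
    memory is divided by the memory allocated to each pipeline execution
    (spark.driver.memory, 1GB by default).
    """
    reserved_gb = 4  # memory required by the other services of the engine
    return (engine_memory_gb - reserved_gb) // spark_driver_memory_gb

# Default configuration: 8GB engine, 1GB per pipeline execution
print(max_livy_sessions(8))      # -> 4 concurrent pipeline executions

# 8GB engine with 4GB allocated to the Spark driver
print(max_livy_sessions(8, 4))   # -> 1 concurrent pipeline execution
```

With the default values, an 8GB engine can therefore run up to 4 pipelines concurrently; raising spark.driver.memory reduces that number.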
The spark.driver.memory default value can be changed by adding the parameter and its value in the Advanced configuration section of the Add run profile form in Talend Cloud Management Console.
For example, if you have installed the engine in a Docker environment that has 8GB of memory and you allocate 4GB to the Spark driver memory, the formula gives (8-4)/4=1, meaning only one pipeline can run at a time:
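In that case, the line in the .env file would look like the following sketch (any other settings in the file stay unchanged):

```
# <engine_directory>/default/.env
LIVY_SERVER_SESSION_MAX_CREATION=1
```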