You can choose to set Google Cloud Dataflow as the Big Data export runtime for your preparations.
To configure this new runtime instead of the default one, you must perform some Streams Runner and Spark Job Server configuration.
Before you begin
- You have a Google Cloud enterprise account and have created a Google Cloud project.
- You have installed Talend Data Preparation.
- You have installed Streams Runner and Spark Job Server on Linux machines.
- You have created a service account on Google Cloud and downloaded the .json file containing the credentials for this service account. This file must be stored on the same machine where the Spark Job Server was installed. The service account must have the right to run Jobs on Google Cloud Dataflow and access buckets involved in your Jobs in Google Cloud Storage, such as your input and output buckets, as well as the bucket set for
- Open the <Streams_Runner_installation_path>/conf/application.conf file.
To set Google Dataflow as runner type, you can either:
- Set DataflowRunner as value for the
- Use the $(?RUNNER_TYPE) environment variable by executing the following command:
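As a minimal sketch of the environment variable option, assuming a Bash-like shell on the machine that hosts Streams Runner and that the variable is read when the service starts, the command could look like this:

```shell
# Assumption: Bash-like shell; the value is the DataflowRunner runner type named above.
export RUNNER_TYPE=DataflowRunner
```

An exported variable is only visible to the current shell and the processes it starts, so it needs to be set in the environment from which Streams Runner is launched.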
Configure the runner properties by adding the two mandatory parameters and their values to the configuration file, namely
In addition to these two parameters, you can complete the runner configuration with other parameters of your choice. For a complete list of the available execution parameters, see the Google documentation.
To configure the Spark Job Server, add the GOOGLE_APPLICATION_CREDENTIALS environment variable by executing the following command:
The variable must point to the .json file that contains the credentials for your Google Cloud service account. This .json file must be located on the machine where the Spark Job Server is installed.
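The following sketch shows one way to set this variable, assuming a Bash-like shell on the Spark Job Server machine. The file path is hypothetical and must be replaced with the actual location of your service account .json file.

```shell
# Hypothetical path: replace with the real location of your service account .json file.
export GOOGLE_APPLICATION_CREDENTIALS="/opt/talend/credentials/service-account.json"

# Optional sanity check: warn (without failing) if the file is not readable by the current user.
if [ ! -r "$GOOGLE_APPLICATION_CREDENTIALS" ]; then
  echo "Warning: credentials file not readable: $GOOGLE_APPLICATION_CREDENTIALS" >&2
fi
```

The variable must be present in the environment of the process that starts the Spark Job Server. Exporting it in an interactive shell only affects that shell and the processes launched from it.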
- Restart the services.
When exporting a preparation, the Google Cloud Dataflow runtime will be used instead of the default Big Data runtime, depending on the data input and output. For more information on which runtime will be used according to your input and output, see Export options and runtimes matrix.