Before you begin
- Make sure you are using a recent version of docker-compose to avoid issues with volumes not being mounted correctly.
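For example, you can check the installed version from a terminal; the version '3.6' compose file format used later in this procedure needs a docker-compose release that supports it (1.20.0 or later):

  docker-compose version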
- Contact your system administrator to get the complete set of Hadoop configuration files (core-site.xml, hdfs-site.xml, etc.).
- Put these Hadoop configuration files in a folder on your local machine and copy its path.
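For example, a minimal sketch (the folder name and file list are illustrative; use the files your administrator provides):

  mkdir -p ~/my-hadoop-cluster-config
  cp core-site.xml hdfs-site.xml ~/my-hadoop-cluster-config/
  echo ~/my-hadoop-cluster-config    # the path to copy for later steps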
- Go to the following folder in the Remote Engine Gen2 installation:
  - default if you are using the engine in the AWS USA, AWS Europe, AWS Asia-Pacific, or Azure regions.
  - eap if you are using the engine as part of the Early Adopter Program.
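For example, assuming the engine was unpacked under /opt/talend-remote-engine-gen2 (an illustrative path; adjust it to your installation):

  cd /opt/talend-remote-engine-gen2/default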
- Create a new file and name it docker-compose.override.yml, the override file name that docker-compose picks up automatically alongside docker-compose.yml.
- Edit this file to add the following:
version: '3.6'
services:
  livy:
    environment:
      HADOOP_CONF_DIR: file:/opt/my-hadoop-cluster-config
    volumes:
      - YOUR_LOCAL_HADOOP_CONFIGURATION_FOLDER:/opt/my-hadoop-cluster-config
  component-server:
    environment:
      HADOOP_CONF_DIR: file:/opt/my-hadoop-cluster-config
    volumes:
      - YOUR_LOCAL_HADOOP_CONFIGURATION_FOLDER:/opt/my-hadoop-cluster-config
where YOUR_LOCAL_HADOOP_CONFIGURATION_FOLDER corresponds to the path to the local folder where your Hadoop configuration files are stored.
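To check that the override is taken into account, you can render the merged configuration from the folder that contains the compose files; the HADOOP_CONF_DIR variable and the volume mapping should appear under both services:

  docker-compose config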
- Save the file to apply your changes.
- Restart your Remote Engine Gen2.
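If your engine is managed directly with docker-compose, one common way to restart it is to recreate the containers from the installation folder (your installation may also provide its own start and stop scripts):

  docker-compose down
  docker-compose up -d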
- Connect to Talend Cloud Pipeline Designer.
- Go to the Connections page and add a new HDFS connection using your engine and your local user.
- Add a new HDFS dataset using the new connection and make sure you use the path to your files.