Accessing files on a Hadoop cluster from your engine - Cloud

Talend Remote Engine Gen2 Quick Start Guide

Before you begin

  • Make sure you use a recent version of docker-compose, because older versions may not mount volumes correctly.
  • Ask your system administrator for the complete set of Hadoop configuration files (core-site.xml, hdfs-site.xml, etc.).
  • Put these Hadoop configuration files in a folder on your local machine and copy the path to that folder.
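
The prerequisites above can be checked from a terminal before you start. The following is a minimal sketch, not part of the Talend tooling: the function name check_hadoop_conf is hypothetical, and it only looks for the two files named above (core-site.xml and hdfs-site.xml), so your cluster may require more.

```shell
#!/bin/sh
# Hypothetical helper: list the expected Hadoop configuration files
# that are missing from the folder passed as the first argument.
check_hadoop_conf() {
  conf_dir="$1"
  missing=0
  for f in core-site.xml hdfs-site.xml; do
    if [ ! -f "$conf_dir/$f" ]; then
      echo "missing: $f"
      missing=1
    fi
  done
  # Return 0 when both files are present, 1 otherwise.
  return "$missing"
}

# Example usage (the path is a placeholder for your own folder):
#   check_hadoop_conf /path/to/hadoop-conf && echo "folder looks complete"
# To see which docker-compose version is installed:
#   docker-compose version
```

If the function prints nothing and returns successfully, the folder contains both files and its path can be used in the procedure below.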

Procedure

  1. In the Remote Engine Gen2 installation directory, go to the folder that matches your setup:
    default, if you are using the engine in the AWS USA, AWS Europe, AWS Asia-Pacific or Azure regions.

    eap, if you are using the engine as part of the Early Adopter Program.

  2. Create a new file and name it:
    docker-compose.override.yml
  3. Edit this file to add the following:
    version: '3.6'
    
    services: 
    
      livy: 
        environment: 
          HADOOP_CONF_DIR: file:/opt/my-hadoop-cluster-config
        volumes: 
          - YOUR_LOCAL_HADOOP_CONFIGURATION_FOLDER:/opt/my-hadoop-cluster-config
       
      component-server: 
        environment: 
          HADOOP_CONF_DIR: file:/opt/my-hadoop-cluster-config
        volumes: 
          - YOUR_LOCAL_HADOOP_CONFIGURATION_FOLDER:/opt/my-hadoop-cluster-config

    where YOUR_LOCAL_HADOOP_CONFIGURATION_FOLDER is the path to the local folder that contains your Hadoop configuration files.

  4. Save the file.
  5. Restart your Remote Engine Gen2.
  6. Connect to Talend Cloud Pipeline Designer.
  7. Go to the Connections page and add a new HDFS connection using your engine and your local user name.
  8. Add a new HDFS dataset using the new connection and make sure you use the path to your files (for example hdfs://namenode:8020/user/talend/files).
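
Steps 2 to 4 of the procedure can also be scripted. The following is a minimal sketch, assuming a POSIX shell and that you run it from the default or eap folder. The function name write_override is hypothetical. It writes the same content as shown in step 3, replacing YOUR_LOCAL_HADOOP_CONFIGURATION_FOLDER with the absolute path of the folder you pass in.

```shell
#!/bin/sh
# Hypothetical helper: write docker-compose.override.yml into the current
# directory, mounting the given local Hadoop configuration folder.
write_override() {
  # Resolve to an absolute path so that the volume mapping does not
  # depend on the directory the engine is started from.
  conf_dir=$(cd "$1" && pwd) || return 1
  cat > docker-compose.override.yml <<EOF
version: '3.6'

services:

  livy:
    environment:
      HADOOP_CONF_DIR: file:/opt/my-hadoop-cluster-config
    volumes:
      - ${conf_dir}:/opt/my-hadoop-cluster-config

  component-server:
    environment:
      HADOOP_CONF_DIR: file:/opt/my-hadoop-cluster-config
    volumes:
      - ${conf_dir}:/opt/my-hadoop-cluster-config
EOF
}

# Example usage (the path is a placeholder for your own folder):
#   write_override /path/to/hadoop-conf
```

After the file is written, running docker-compose config in the same folder prints the merged Compose configuration, which is one way to confirm that the override file is valid YAML and that the volume paths are the ones you expect. This assumes the folder also contains the engine's main Compose file.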