Centralizing a Hadoop connection

About this task

Setting up a connection to a given Hadoop distribution in the Repository saves you from configuring that connection each time you use the same Hadoop distribution.

You must define a Hadoop connection before you can create, from the Hadoop cluster node, the connections to individual Hadoop elements such as HDFS or Hive.

Prerequisites:
  • Ensure that the client machine on which Talend Studio is installed can resolve the host names of the nodes of the Hadoop cluster to be used. To do this, add the IP address/host name mapping entries for the services of that Hadoop cluster to the hosts file of the client machine.

    For example, if the host name of the Hadoop Namenode server is talend-cdh550.weave.local and its IP address is 192.168.x.x, the mapping entry reads 192.168.x.x talend-cdh550.weave.local.

  • The Hadoop cluster to be used has been properly configured and is running.

  • The Integration perspective is active.

  • If you need to connect to MapR from Talend Studio, ensure that you have installed the MapR client on the machine where Talend Studio is installed and have added the MapR client library to the PATH variable of that machine. According to the MapR documentation, the library or libraries of a MapR client for each OS version can be found under MAPR_INSTALL/hadoop/hadoop-VERSION/lib/native. For example, the library for Windows is \lib\native\MapRClient.dll in the MapR client JAR file.
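
The host name prerequisite above can be checked before you open the wizard. The following is a minimal sketch in Python and is not part of Talend Studio. The helper names parse_hosts and missing_hosts are hypothetical, and 192.168.0.10 is a placeholder for the real Namenode address, which is not given here. The sketch parses hosts-file text and reports which required host names have no mapping entry.

```python
def parse_hosts(text):
    """Return a dict mapping each host name in hosts-file text to its IP address."""
    mapping = {}
    for line in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        ip, names = fields[0], fields[1:]
        for name in names:
            # Host names are case-insensitive; keep the first entry seen.
            mapping.setdefault(name.lower(), ip)
    return mapping


def missing_hosts(text, required):
    """Return the required host names that have no entry in the hosts-file text."""
    known = parse_hosts(text)
    return [name for name in required if name.lower() not in known]


# Placeholder content: 192.168.0.10 stands in for the real Namenode address.
SAMPLE = """
127.0.0.1      localhost
192.168.0.10   talend-cdh550.weave.local   # Hadoop Namenode
"""

if __name__ == "__main__":
    # An empty list means every required host name has a mapping entry.
    print(missing_hosts(SAMPLE, ["talend-cdh550.weave.local"]))
```

To check a real machine, read the content of its hosts file (typically /etc/hosts on Linux and macOS, or C:\Windows\System32\drivers\etc\hosts on Windows) and pass it to missing_hosts together with the host names of the cluster services.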

To create a Hadoop connection in the Repository, do the following:

Procedure

  1. In the Repository tree view of Talend Studio, expand Metadata and then right-click Hadoop cluster.
  2. Select Create Hadoop cluster from the contextual menu to open the Hadoop cluster connection wizard.
  3. Fill in the generic information about this connection, such as Name and Description, and click Next. The Hadoop Configuration Import Wizard window opens, in which you select the manual or the automatic mode to configure the connection.
