Big Data Platform
Cloud Big Data
Cloud Big Data Platform
Cloud Data Fabric
Real-Time Big Data Platform
About this task
Setting up a connection to a given Hadoop distribution in the Repository saves you from configuring that connection each time you need to use the same Hadoop distribution.
You need to define a Hadoop connection first. Only then can you create, from the Hadoop cluster node, the connections to individual Hadoop elements such as HDFS or Hive.
Ensure that the client machine on which the Talend Studio is installed can recognize the host names of the nodes of the Hadoop cluster to be used. For this purpose, add the IP address/hostname mapping entries for the services of that Hadoop cluster in the hosts file of the client machine.
For example, if the host name of the Hadoop Namenode server is talend-cdh550.weave.local and its IP address is 192.168.x.x, the mapping entry reads 192.168.x.x talend-cdh550.weave.local.
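The host name check described above can be sketched in Python. This is a minimal illustration and not part of Talend Studio: the helper names `hosts_entry` and `unresolved_hosts` are hypothetical, and the sketch only reports which host names the client machine cannot resolve so that you know which mapping entries are still missing.

```python
import socket


def hosts_entry(ip_address, hostname):
    """Format one hosts-file line: IP address, a space, then the host name."""
    return f"{ip_address} {hostname}"


def unresolved_hosts(hostnames, resolver=socket.gethostbyname):
    """Return the host names that the given resolver cannot resolve.

    The resolver parameter is an assumption made for this sketch so that
    the lookup can be replaced in tests; by default the system resolver
    (which consults the hosts file) is used.
    """
    missing = []
    for name in hostnames:
        try:
            resolver(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            missing.append(name)
    return missing
```

Calling `unresolved_hosts(["talend-cdh550.weave.local"])` on the client machine returns an empty list once the mapping entry is in place. For any name it does return, `hosts_entry` produces the line to add to the hosts file.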
The Hadoop cluster to be used has been properly configured and is running.
The Integration perspective is active.
If you need to connect to MapR from the Studio, ensure that you have installed the MapR client on the machine where the Studio is installed, and that you have added the MapR client library to the PATH variable of that machine. According to MapR's documentation, the library or libraries of a MapR client corresponding to each OS version can be found under MAPR_INSTALL\hadoop\hadoop-VERSION\lib\native. For example, the library for Windows is \lib\native\MapRClient.dll in the MapR client jar file. For further information, see the following link from MapR: http://www.mapr.com/blog/basic-notes-on-configuring-eclipse-as-a-hadoop-development-environment-for-mapr.
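To confirm that the MapR client library is reachable through the PATH variable, a check along the following lines may help. It is a sketch under assumptions: the helper `find_on_path` is hypothetical, and the file name `MapRClient.dll` is the Windows example mentioned above, so substitute the library name for your OS.

```python
import os


def find_on_path(filename, path_value=None):
    """Return the directories listed in PATH that contain the given file.

    path_value defaults to the PATH environment variable of the current
    process; passing it explicitly is only a convenience for testing.
    """
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    hits = []
    for directory in path_value.split(os.pathsep):
        # Skip empty entries produced by leading or doubled separators.
        if directory and os.path.isfile(os.path.join(directory, filename)):
            hits.append(directory)
    return hits
```

An empty result from `find_on_path("MapRClient.dll")` suggests that the directory holding the library has not yet been added to PATH.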
To create a Hadoop connection in the Repository, do the following:
- In the Repository tree view of your Studio, expand Metadata and then right-click Hadoop cluster.
- Select Create Hadoop cluster from the contextual menu to open the Hadoop cluster connection wizard.
- Fill in generic information about this connection, such as Name and Description, and click Next to open the Hadoop Configuration Import Wizard window, which allows you to select the manual or the automatic mode to configure the connection.