Managing Hadoop metadata - Cloud - 8.0

Talend Studio User Guide

Available in: Big Data, Big Data Platform, Cloud Big Data, Cloud Big Data Platform, Cloud Data Fabric, Data Fabric, Real-Time Big Data Platform

In the Repository tree view, the Hadoop cluster node in the Metadata folder contains the metadata of the connections to the Hadoop elements such as HDFS, Hive or HBase. It allows you to centralize the connection properties for a given Hadoop distribution and then to reuse those properties to create separate connections to each Hadoop element.
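The idea of centralizing cluster properties once and reusing them for each element can be pictured with a minimal Python sketch. This is only a conceptual illustration, not Talend's API or its metadata storage format: the class names, field names, host names, and port values below are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class HadoopCluster:
    """Hypothetical stand-in for the shared, centralized cluster properties."""
    distribution: str
    version: str
    namenode_uri: str
    user: str


@dataclass
class ServiceConnection:
    """Hypothetical per-element connection (HDFS, Hive, HBase, ...)."""
    cluster: HadoopCluster
    service: str
    settings: dict = field(default_factory=dict)

    def properties(self) -> dict:
        # Start from the shared cluster properties, then layer the
        # element-specific settings on top of them.
        merged = {
            "distribution": self.cluster.distribution,
            "version": self.cluster.version,
            "namenode_uri": self.cluster.namenode_uri,
            "user": self.cluster.user,
            "service": self.service,
        }
        merged.update(self.settings)
        return merged


# All values below are made-up examples.
cluster = HadoopCluster(
    distribution="example-distribution",
    version="example-version",
    namenode_uri="hdfs://namenode.example.com:8020",
    user="etl_user",
)
hdfs = ServiceConnection(cluster, "HDFS")
hive = ServiceConnection(cluster, "Hive", {"host": "hive.example.com", "port": 10000})
```

In this sketch, both connections report the same distribution and NameNode URI because they read them from the one shared cluster object, which mirrors how connections created under the Hadoop cluster node reuse its properties.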

Click Metadata in the Repository tree view to expand the relevant folder. Each connection node contains the connections and schemas you have set up. Among these connection nodes is the Hadoop cluster node.

Hadoop Properties dialog box.

The following sections explain in detail how to use the Hadoop cluster node to set up:

  • an HBase connection,

  • an HCatalog connection,

  • an HDFS file schema,

  • a Hive connection.

If you need to create a connection to Impala, Cloudera's analytic database, use the DB connection node under the Metadata node of the Repository. Its configuration is similar to that of a Hive connection, but simpler.

For further information about this DB connection node, see Centralizing database metadata.