Creating a Hadoop cluster metadata definition - 8.0

First steps using Big Data in Talend Studio

Version
8.0
Language
English
Product
Talend Big Data
Talend Big Data Platform
Talend Data Fabric
Talend Open Studio for Big Data
Talend Real-Time Big Data Platform
Module
Talend Studio
Last publication date
2024-02-06

Create a Hadoop cluster metadata definition so that you can quickly configure components with your Hadoop cluster information. Talend Studio also lets you import a cluster metadata definition.

Before you begin

  • This tutorial requires access to a Hadoop cluster.
  • Select the Integration perspective (Window > Perspective > Integration).

Procedure

  1. In the Repository, expand Metadata, right-click Hadoop Cluster and click Create Hadoop Cluster.
  2. In the Name field, enter a name.

    Example

    MyHadoopCluster
  3. Optional: In the Purpose field, enter a purpose.

    Example

    Cluster connection metadata
  4. Optional: In the Description field, enter a description.

    Example

    Metadata to connect to an Amazon EMR cluster
    Tip: Enter a Purpose and Description to stay organized.
  5. Click Next.
  6. Select a Distribution.

    Example

    Select Amazon EMR.
  7. Select a Version.

    Example

    Select EMR 5.15.0 (Hadoop 2.8.3).
  8. Select Enter manually Hadoop services.
  9. Click Finish.
    The Hadoop Cluster Connection window opens.
  10. Enter your Connection details.

    Example

    • Namenode URI: hdfs://hadoopcluster:8020
    • Resource Manager: hadoopcluster:8032
    • Resource Manager Scheduler: hadoopcluster:8030
    • Job History: hadoopcluster:10020
    • Staging directory: /user
  11. Enter your Authentication details.

    Example

    • User name: student
  12. Optional: Click Check Services to verify that Talend Studio can connect to the cluster.
  13. Click Finish.
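
The connection details entered in step 10 correspond to settings that a Hadoop client normally reads from its configuration files. The following Python sketch is illustrative only and is not part of Talend Studio: it checks that the example values have the expected shape and maps them to standard Hadoop property names. The mapping between the wizard's field labels and the property names is an assumption made for this example, and the names CONNECTION, FIELD_TO_PROPERTY, validate, and to_hadoop_properties are hypothetical.

```python
from urllib.parse import urlparse

# Example values from step 10; replace them with your own cluster's details.
CONNECTION = {
    "Namenode URI": "hdfs://hadoopcluster:8020",
    "Resource Manager": "hadoopcluster:8032",
    "Resource Manager Scheduler": "hadoopcluster:8030",
    "Job History": "hadoopcluster:10020",
    "Staging directory": "/user",
}

# Assumed mapping from wizard field labels to standard Hadoop property names.
FIELD_TO_PROPERTY = {
    "Namenode URI": "fs.defaultFS",
    "Resource Manager": "yarn.resourcemanager.address",
    "Resource Manager Scheduler": "yarn.resourcemanager.scheduler.address",
    "Job History": "mapreduce.jobhistory.address",
    "Staging directory": "yarn.app.mapreduce.am.staging-dir",
}


def to_hadoop_properties(connection):
    """Return the connection details keyed by Hadoop property name."""
    return {FIELD_TO_PROPERTY[field]: value for field, value in connection.items()}


def validate(connection):
    """Return a list of problems found in the connection details (empty if none)."""
    problems = []

    # The NameNode address is a full URI: hdfs://host:port
    uri = urlparse(connection["Namenode URI"])
    try:
        uri_ok = uri.scheme == "hdfs" and bool(uri.hostname) and uri.port is not None
    except ValueError:  # raised by .port when the port is not a valid number
        uri_ok = False
    if not uri_ok:
        problems.append("Namenode URI must look like hdfs://host:port")

    # The YARN and Job History addresses are plain host:port pairs.
    for field in ("Resource Manager", "Resource Manager Scheduler", "Job History"):
        host, sep, port = connection[field].rpartition(":")
        if not (host and sep and port.isdigit()):
            problems.append(f"{field} must look like host:port")

    if not connection["Staging directory"].startswith("/"):
        problems.append("Staging directory must be an absolute path")

    return problems
```

With the example values, validate(CONNECTION) returns an empty list, and to_hadoop_properties(CONNECTION) returns the same values keyed by property name. Checking the shape of each value before typing it into the wizard can catch simple mistakes such as a missing port or a missing hdfs:// prefix.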

Results

The Hadoop cluster metadata definition appears in the Repository.
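
If the Check Services step reports a failure, it can help to confirm that the service ports are reachable from your machine at all. The following Python sketch is one way to do that under stated assumptions: it only tests whether each host and port accepts a TCP connection, which is not necessarily what Talend Studio's Check Services does internally. The function names parse_endpoint, is_reachable, and check_all are hypothetical, and the endpoints are the example values from the procedure.

```python
import socket

# Example endpoints from the procedure; replace them with your own cluster's details.
ENDPOINTS = {
    "Namenode URI": "hdfs://hadoopcluster:8020",
    "Resource Manager": "hadoopcluster:8032",
    "Resource Manager Scheduler": "hadoopcluster:8030",
    "Job History": "hadoopcluster:10020",
}


def parse_endpoint(value):
    """Split 'host:port' or 'scheme://host:port' into a (host, port) tuple."""
    host_port = value.split("://", 1)[-1]      # drop an optional scheme prefix
    host, _, port = host_port.rpartition(":")  # split on the last colon
    return host, int(port)


def is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused connections, timeouts, and DNS failures
        return False


def check_all(endpoints):
    """Print one line per service stating whether its port accepts connections."""
    for name, value in endpoints.items():
        host, port = parse_endpoint(value)
        status = "reachable" if is_reachable(host, port) else "NOT reachable"
        print(f"{name}: {host}:{port} is {status}")
```

Calling check_all(ENDPOINTS) prints one status line per service. A port that is not reachable usually points to a network, firewall, or host name issue rather than a problem with the metadata definition itself.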