Creating a Hadoop cluster metadata definition

You can create a Hadoop cluster metadata definition so that you can quickly configure components with your Hadoop cluster information. Talend Studio also allows you to import a cluster metadata definition.

Before you begin

  • This tutorial requires access to a Hadoop cluster.
  • Select the Integration perspective (Window > Perspective > Integration).

Procedure

  1. In the Repository, expand Metadata, right-click Hadoop Cluster and click Create Hadoop Cluster.
  2. In the Name field, enter a name.

    Example

    MyHadoopCluster
  3. Optional: In the Purpose field, enter a purpose.

    Example

    Cluster connection metadata
  4. Optional: In the Description field, enter a description.

    Example

    Metadata to connect to an Amazon EMR cluster
    Tip: Enter a Purpose and Description to stay organized.
  5. Click Next.
  6. Select a Distribution.

    Example

    Select Amazon EMR.
  7. Select a Version.

    Example

    Select EMR 5.15.0 (Hadoop 2.8.3).
  8. Select Enter manually Hadoop services.
  9. Click Finish.
    The Hadoop Cluster Connection window opens.
  10. Enter your Connection details.

    Example

    • Namenode URI: hdfs://hadoopcluster:8020
    • Resource Manager: hadoopcluster:8032
    • Resource Manager Scheduler: hadoopcluster:8030
    • Job History: hadoopcluster:10020
    • Staging directory: /user
  11. Enter your Authentication details.

    Example

    • User name: student
  12. Optional: Click Check Services.
  13. Click Finish.
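
Before typing the connection details from step 10 into the wizard, it can help to sanity-check them outside Talend Studio. The following Python sketch is not part of Talend Studio and does not reproduce what Check Services does. It parses the example endpoints, maps each wizard field to the standard Hadoop property it most likely corresponds to (this mapping is an assumption based on the Hadoop default ports and property names, not something this page states), and offers a plain TCP reachability probe. The names CONNECTION, HADOOP_PROPERTIES, parse_endpoint, to_hadoop_properties, and is_reachable are hypothetical helpers introduced for illustration.

```python
import socket
from urllib.parse import urlsplit

# Example values copied from step 10; replace them with your cluster's endpoints.
CONNECTION = {
    "Namenode URI": "hdfs://hadoopcluster:8020",
    "Resource Manager": "hadoopcluster:8032",
    "Resource Manager Scheduler": "hadoopcluster:8030",
    "Job History": "hadoopcluster:10020",
    "Staging directory": "/user",
}

# Assumption: the standard Hadoop property each wizard field usually maps to.
HADOOP_PROPERTIES = {
    "Namenode URI": "fs.defaultFS",
    "Resource Manager": "yarn.resourcemanager.address",
    "Resource Manager Scheduler": "yarn.resourcemanager.scheduler.address",
    "Job History": "mapreduce.jobhistory.address",
    "Staging directory": "yarn.app.mapreduce.am.staging-dir",
}


def parse_endpoint(value):
    """Return (host, port) for 'host:port' or 'scheme://host:port', else None."""
    try:
        # Prefix bare host:port values with '//' so urlsplit sees a network location.
        parts = urlsplit(value if "://" in value else "//" + value)
        port = parts.port  # raises ValueError if the port is not a valid number
    except ValueError:
        return None
    if not parts.hostname or port is None:
        return None
    return parts.hostname, port


def to_hadoop_properties(connection):
    """Translate wizard field names into the assumed Hadoop property names."""
    return {HADOOP_PROPERTIES[field]: value for field, value in connection.items()}


def is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

To use the sketch, call parse_endpoint on each value in CONNECTION and pass the resulting host and port to is_reachable. Values such as the staging directory are paths rather than endpoints, so parse_endpoint returns None for them and they can be skipped. A successful TCP connection only shows that the port is open; it does not confirm that the expected Hadoop service is answering.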

Results

The Hadoop cluster metadata definition appears in the Repository.
