Defining HDFS connection details in Oozie scheduler view (Deprecated) - Cloud - 7.3

Talend Studio User Guide

Last publication date: 2024-02-13
Available in: Big Data, Big Data Platform, Cloud Big Data, Cloud Big Data Platform, Cloud Data Fabric, Data Fabric, Real-Time Big Data Platform

The support for Oozie in the Studio is deprecated from Talend 7.2 onwards.

Use Talend Administration Center to start, monitor and schedule the executions of your Big Data Jobs.

Procedure

  1. Click the Oozie scheduler view beneath the design workspace.

  2. Click Setting to open the connection setup dialog box.

  3. Set up the Oozie connection.
    • If you have already set up the Oozie connection in the Repository as explained in Centralizing an Oozie connection (Deprecated), you can reuse it: select Repository from the Property type drop-down list, then click the [...] button to open the Repository Content dialog box and select the Oozie connection to use.

    • If you have not set up the Oozie connection, fill in the connection information in the corresponding fields as explained in the table below:

      Field/Option Description

      Hadoop distribution

      Hadoop distribution to connect to. This distribution hosts the HDFS file system to be used. If you select Custom to connect to a custom Hadoop distribution, click the [...] button to open the Import custom definition dialog box, then import the JAR files required by that distribution.

      For further information, see Connecting to custom Hadoop distribution.

      Hadoop version

      Version of the Hadoop distribution to connect to. This list is not displayed if you select Custom from the Hadoop distribution list.

      Enable kerberos security

      If you are accessing a Hadoop cluster running with Kerberos security, select this check box, then enter the Kerberos principal name for the NameNode in the field displayed. This enables you to use your user name to authenticate against the credentials stored in Kerberos.

      Whether this check box is available depends on the Hadoop distribution you are connecting to.

      User Name

      Login user name.

      Name node end point

      URI of the name node, the centerpiece of the HDFS file system.

      Job tracker end point

      URI of the Job Tracker node, which farms out MapReduce tasks to specific nodes in the cluster.

      Oozie end point

      URI of the Oozie web console, for Job execution monitoring.

      Hadoop Properties

      If you need to use a custom configuration for your Hadoop cluster, complete this table with the property or properties to be customized. At runtime, these values override the corresponding default properties used by the Studio for its Hadoop engine.

      For further information about the properties required by Hadoop, see Apache Hadoop documentation.

      Note:

      Settings defined in this table are effective on a per-Job basis.
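
As an illustration of the kind of values these fields hold, the following Python sketch checks the shape of the three end points and shows how entries in the Hadoop Properties table take precedence over defaults. It is not part of the Studio: the host names, the user name, the helper functions and the sample defaults are all hypothetical, the ports are only commonly used defaults, and the host:port form assumed for the Job tracker end point may differ on your cluster.

```python
from urllib.parse import urlparse

# Hypothetical values mirroring the fields of the connection dialog box.
connection = {
    "user_name": "hdfs_user",
    "name_node_end_point": "hdfs://namenode.example.com:8020",
    "job_tracker_end_point": "jobtracker.example.com:8021",
    "oozie_end_point": "http://oozie.example.com:11000/oozie",
}


def check_uri(value, allowed_schemes):
    """Return (host, port) for a scheme://host:port URI with an allowed scheme."""
    parsed = urlparse(value)
    if parsed.scheme not in allowed_schemes:
        raise ValueError(f"unexpected scheme in {value!r}")
    if not parsed.hostname or parsed.port is None:
        raise ValueError(f"host and port are both required in {value!r}")
    return parsed.hostname, parsed.port


def check_host_port(value):
    """Return (host, port) for a plain host:port string (assumed form)."""
    host, sep, port = value.rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError(f"expected host:port, got {value!r}")
    return host, int(port)


def effective_properties(defaults, overrides):
    """Merge two property maps; entries in overrides replace the defaults."""
    merged = dict(defaults)
    merged.update(overrides)
    return merged


if __name__ == "__main__":
    print(check_uri(connection["name_node_end_point"], {"hdfs"}))
    print(check_host_port(connection["job_tracker_end_point"]))
    print(check_uri(connection["oozie_end_point"], {"http", "https"}))

    # Illustrative defaults only; the values the Studio actually uses may differ.
    defaults = {"dfs.replication": "3", "io.file.buffer.size": "4096"}
    # What a user might enter in the Hadoop Properties table.
    overrides = {"dfs.replication": "1"}
    print(effective_properties(defaults, overrides))
```

The merge copies the defaults before applying the overrides, so only the listed properties change and every other default is left as it was, which matches the override behavior described for the Hadoop Properties table.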

Results

Once the connection details and the deployment path are defined in the Oozie scheduler view, you are ready to schedule executions of your Job, or run it immediately, on the HDFS server.