Big Data
Big Data Platform
Cloud Big Data
Cloud Big Data Platform
Cloud Data Fabric
Data Fabric
Real-Time Big Data Platform
The support for Oozie in the Studio is deprecated from Talend 7.2 onwards.
Use Talend Administration Center to start, monitor and schedule the executions of your Big Data Jobs.
Procedure
-
Click the Oozie scheduler view beneath the design workspace.
-
Click Setting to open the connection setup dialog box.
-
Set up the Oozie connection.
-
If you have set up the Oozie connection in the Repository as explained in Centralizing an Oozie connection (Deprecated), you can easily reuse it. To do this, select Repository from the Property type drop-down list, then click the [...] button to open the Repository Content dialog box and select the Oozie connection to be used.
-
If you have not set up the Oozie connection, fill in the connection information in the corresponding fields as explained in the table below:
Field/Option Description
Hadoop distribution
Hadoop distribution to be connected to. This distribution hosts the HDFS file system to be used. If you select Custom to connect to a custom Hadoop distribution, click the [...] button to open the Import custom definition dialog box, then import the jar files required by that custom distribution.
For further information, see Connecting to custom Hadoop distribution.
Hadoop version
Version of the Hadoop distribution to be connected to. This list disappears if you select Custom from the Hadoop distribution list.
Enable kerberos security
If you are accessing a Hadoop cluster running with Kerberos security, select this check box, then enter the Kerberos principal name for the NameNode in the field displayed. This enables you to use your user name to authenticate against the credentials stored in Kerberos.
This check box is available depending on the Hadoop distribution you are connecting to.
User Name
Login user name.
Name node end point
URI of the name node, the centerpiece of the HDFS file system.
Job tracker end point
URI of the Job Tracker node, which farms out MapReduce tasks to specific nodes in the cluster.
Oozie end point
URI of the Oozie web console, for Job execution monitoring.
Hadoop Properties
If you need a custom configuration for the Hadoop distribution in use, complete this table with the property or properties to be customized. At runtime, these values override the corresponding default properties used by the Studio for its Hadoop engine.
For further information about the properties required by Hadoop, see Apache's Hadoop documentation on http://hadoop.apache.org, or the documentation of the Hadoop distribution you need to use.
Note: Settings defined in this table are effective on a per-Job basis.
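The fields in the table above can be sanity-checked before you type them into the dialog box. The following Python sketch is not part of Talend: the class name, the host names, the port numbers, and the expected end point shapes (hdfs://host:port, host:port, http://host:port/oozie) are assumptions chosen for illustration, so adjust them to your cluster. It also shows how values entered in Hadoop Properties would take precedence over defaults.

```python
from dataclasses import dataclass, field
from urllib.parse import urlparse


@dataclass
class OozieConnectionSettings:
    """Hypothetical container mirroring the fields of the connection table."""
    user_name: str
    name_node: str    # assumed shape: hdfs://namenode.example.com:8020
    job_tracker: str  # assumed shape: jobtracker.example.com:8021
    oozie: str        # assumed shape: http://oozie.example.com:11000/oozie
    hadoop_properties: dict = field(default_factory=dict)

    def problems(self):
        """Return a list of readable problems; the list is empty if none are found."""
        found = []
        if not self.user_name:
            found.append("User Name is empty")
        nn = urlparse(self.name_node)
        if nn.scheme != "hdfs" or not nn.hostname:
            found.append("Name node end point should look like hdfs://host:port")
        host, _, port = self.job_tracker.rpartition(":")
        if not host or not port.isdigit():
            found.append("Job tracker end point should look like host:port")
        oz = urlparse(self.oozie)
        if oz.scheme not in ("http", "https") or not oz.hostname:
            found.append("Oozie end point should look like http://host:port/oozie")
        return found

    def effective_properties(self, defaults):
        """Merge per-Job Hadoop Properties over the given defaults."""
        merged = dict(defaults)
        merged.update(self.hadoop_properties)  # per-Job values win
        return merged


settings = OozieConnectionSettings(
    user_name="hdfs_user",
    name_node="hdfs://namenode.example.com:8020",
    job_tracker="jobtracker.example.com:8021",
    oozie="http://oozie.example.com:11000/oozie",
    hadoop_properties={"dfs.replication": "1"},
)
print(settings.problems())
print(settings.effective_properties({"dfs.replication": "3"}))
```

The check is purely syntactic; it does not contact the cluster and does not validate Kerberos settings.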
-
Results
Upon defining the deployment path in the Oozie scheduler view, you are ready to schedule executions of your Job, or run it immediately, on the HDFS server.
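Before scheduling a Job, it can help to confirm that the Oozie end point is reachable. The sketch below is independent of the Studio and relies on the admin status resource of the Oozie Web Services API (GET .../v1/admin/status, which reports a systemMode value). The function names and the example host are hypothetical, and the code assumes an unsecured end point; a Kerberos-protected cluster would need SPNEGO authentication, which is not handled here.

```python
import json
from urllib.request import urlopen


def admin_status_url(oozie_end_point):
    """Build the admin status URL from the Oozie end point (trailing slash tolerated)."""
    return oozie_end_point.rstrip("/") + "/v1/admin/status"


def parse_system_mode(body):
    """Extract the systemMode value from the JSON body returned by Oozie."""
    return json.loads(body).get("systemMode")


def oozie_is_normal(oozie_end_point, timeout=5):
    """Return True if the Oozie server reports NORMAL system mode.

    Raises an exception from urllib if the server cannot be reached.
    """
    with urlopen(admin_status_url(oozie_end_point), timeout=timeout) as resp:
        return parse_system_mode(resp.read().decode("utf-8")) == "NORMAL"


# Hypothetical end point; replace with the value entered in the Oozie end point field.
print(admin_status_url("http://oozie.example.com:11000/oozie"))
```

Calling oozie_is_normal with your own end point performs the actual network request; the two helper functions can be used on their own without a running server.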