Connecting to a security-enabled MapR - 6.3

Talend Open Studio for Big Data Components Reference Guide

MapR supports the following two methods of authenticating a user and generating a MapR security ticket for this user:

  • a username/password pair

  • Kerberos.

When designing a Job, set up the authentication configuration in the component you are using depending on how your MapR cluster is secured.

For further information about the MapR security mechanism, see MapR security architecture.

For a scenario about how to secure a MapR cluster, see Getting started with MapR security.

Prerequisites:

  • The MapR distribution you are using is version 4.0.1 or later and you have selected it as the cluster to connect to in the component to be configured.

  • The MapR cluster has been properly installed and is running.

  • The MapR client is installed on the machine where the Studio runs, and the MapR client library has been added to the PATH variable of that machine. According to MapR's documentation, the library or libraries of a MapR client corresponding to each OS version can be found under MAPR_INSTALL\hadoop\hadoop-VERSION\lib\native. For example, the library for Windows is \lib\native\MapRClient.dll in the MapR client jar file. For further information, see the following link from MapR: http://www.mapr.com/blog/basic-notes-on-configuring-eclipse-as-a-hadoop-development-environment-for-mapr.

    If you do not add the specified library or libraries, you may encounter the following error: no MapRClient in java.library.path.

  • This section explains only the authentication parameters to be used to connect to MapR. You still need to define the other parameters required by your Job.

    For further information, see the documentation about each component you are using.
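The client library prerequisite above can be checked before running a Job. The following Python sketch looks for a MapR client library file in each directory listed in the PATH variable. The file name MapRClient.dll comes from the text above; the other two file names follow the usual native library naming convention and are assumptions, as is the helper name find_mapr_client_library.

```python
import os
from pathlib import Path

# MapRClient.dll is the Windows name given in this section. The Linux and
# macOS names below are assumptions based on common native library naming.
LIBRARY_NAMES = ("MapRClient.dll", "libMapRClient.so", "libMapRClient.dylib")


def find_mapr_client_library(path_value=None):
    """Return the first MapR client library found in the PATH entries, or None."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    for entry in path_value.split(os.pathsep):
        if not entry:
            continue  # skip empty entries such as a trailing separator
        for name in LIBRARY_NAMES:
            candidate = Path(entry) / name
            if candidate.is_file():
                return candidate
    return None


if __name__ == "__main__":
    found = find_mapr_client_library()
    if found:
        print(f"MapR client library found: {found}")
    else:
        print("No MapR client library found in PATH; "
              "the Job may fail with 'no MapRClient in java.library.path'.")
```

The function accepts an explicit path string so that the same check can be run against a different list of directories than the current PATH.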

You may face the following security scenarios with your MapR cluster:

  • When your MapR cluster is secured with Kerberos only, you only need to set up the typical Hadoop Kerberos configuration for your Job in the Studio.

    For an example of how to configure Kerberos authentication for a Talend Job, see How to use Kerberos in Talend Studio with Big Data v6.x on Talend Help Center (https://help.talend.com). Although this article uses Cloudera as an example for demonstration, the operations it describes are generic and thus applicable to MapR as well.

  • When your MapR cluster is secured with both the Kerberos mechanism and the MapR ticket security mechanism, you need to set up the configuration for both of them in your Job in the Studio.

    Regarding the Kerberos configuration on the Studio side, see the same article How to use Kerberos in Talend Studio with Big Data v6.x mentioned previously.

    For details about how to configure the MapR ticket security mechanism in the Studio, see Setting up the MapR ticket authentication.

  • When your MapR cluster is secured with the MapR ticket security mechanism only, proceed as explained in Setting up the MapR ticket authentication to set up the MapR authentication configuration for your Job in the Studio.
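The three scenarios above reduce to a simple rule: each security mechanism enabled on the cluster requires its own configuration in the Job. The following Python sketch only illustrates that decision; the function name and the returned labels are hypothetical and do not correspond to any Studio API.

```python
def required_configurations(kerberos, mapr_ticket):
    """Map how the cluster is secured to the configurations the Job needs."""
    steps = []
    if kerberos:
        # Covered by the article on using Kerberos in Talend Studio.
        steps.append("Hadoop Kerberos configuration")
    if mapr_ticket:
        # Covered by "Setting up the MapR ticket authentication".
        steps.append("MapR ticket authentication")
    return steps


if __name__ == "__main__":
    for kerberos, ticket in [(True, False), (True, True), (False, True)]:
        print(kerberos, ticket, required_configurations(kerberos, ticket))
```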

Setting up the MapR ticket authentication

You need to set up this configuration in the Basic settings tab of a Hadoop-related component to be used by your Job.

In this tab, proceed as follows:

  1. Select the Force MapR ticket authentication check box to display the related parameters to be defined.

  2. In the Username field, enter the username to be authenticated and in the Password field, specify the password used by this user.

    To enter the password, click the [...] button next to the password field, and then in the pop-up dialog box enter the password between double quotes and click OK to save the settings.

    A MapR security ticket is generated for this user by MapR and stored on the machine where the Job you are configuring is executed.

  3. If the Group field is available in this tab, you need to enter the name of the group to which the user to be authenticated belongs.

  4. In the Cluster name field, enter the name of the MapR cluster to which you want to connect with this username.

    This cluster name can be found in the mapr-clusters.conf file located in /opt/mapr/conf of the cluster.

  5. In the Ticket duration field, enter the length of time (in seconds) during which the ticket is valid.
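Two of the values in the steps above can be prepared outside the Studio: the cluster name and the ticket duration in seconds. The following Python sketch assumes that each non-empty line of mapr-clusters.conf begins with a cluster name followed by options and node addresses; the sample content, the host names, and both function names are illustrative assumptions, not taken from an actual cluster.

```python
def cluster_names(conf_text):
    """Return the cluster names declared in the content of mapr-clusters.conf.

    Assumption: the cluster name is the first whitespace-separated token of
    each non-empty line. Lines starting with '#' are skipped defensively.
    """
    names = []
    for line in conf_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        names.append(line.split()[0])
    return names


def ticket_duration_seconds(days=0, hours=0, minutes=0):
    """Convert a duration to the number of seconds expected by the Ticket duration field."""
    return days * 86400 + hours * 3600 + minutes * 60


if __name__ == "__main__":
    # Illustrative content only; read the real file from /opt/mapr/conf.
    sample = "my.cluster.com secure=true node1:7222 node2:7222\n"
    print(cluster_names(sample))
    print(ticket_duration_seconds(days=1))
```

On a real installation, the text passed to cluster_names would be read from the mapr-clusters.conf file in /opt/mapr/conf of the cluster.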

The following image shows an example of these MapR ticket authentication parameters from the tHDFSConnection component.

Using a custom MapR security configuration (optional)

If the default security configuration of your MapR cluster has been changed, you need to configure the Job to be executed to take this custom security configuration into account.

MapR specifies its security configuration in the mapr.login.conf file located in /opt/mapr/conf of the cluster. For further information about this configuration file and the Java service behind it, see mapr.login.conf and JAAS.

To configure your Job, you need to define the related parameters in the Basic settings tab and the Advanced settings tab of the Component view of the component you want your Job to use to connect to MapR, for example, a tHDFSConnection component or a tPigLoad component.

To set up this configuration, proceed as follows:

  1. Verify what has been changed in the mapr.login.conf file.

    You should be able to obtain the related information from the administrator or the developer of your MapR cluster.

  2. If the location of the MapR configuration files has been changed to somewhere else in the cluster, that is to say, the MapR Home directory has been changed, select the Set the MapR Home directory check box and enter the new Home directory. Otherwise, leave this check box clear and the default Home directory is used.

  3. If the login module to be used in the mapr.login.conf file has been changed, select the Specify the Hadoop login configuration check box and enter the module to be called from the mapr.login.conf file. Otherwise, leave this check box clear and the default login module is used.

    For example, enter kerberos to call the hadoop_kerberos module or hybrid to call the hadoop_hybrid module.
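To see which login modules a custom mapr.login.conf declares, you can list its entry names. The following Python sketch assumes the file uses the JAAS login configuration layout, in which each entry is a name followed by a block in braces. The sample content and module class names are invented for illustration, and the rule of dropping the hadoop_ prefix is inferred only from the two examples above (kerberos for hadoop_kerberos, hybrid for hadoop_hybrid).

```python
import re

# An entry name at the start of a line, followed by an opening brace.
ENTRY_PATTERN = re.compile(r"^[ \t]*([A-Za-z_][\w.-]*)[ \t]*\{", re.MULTILINE)


def login_entries(conf_text):
    """Return the entry names declared in JAAS-style configuration text."""
    # Remove /* ... */ and // comments before searching for entry names.
    without_blocks = re.sub(r"/\*.*?\*/", "", conf_text, flags=re.DOTALL)
    without_comments = re.sub(r"//[^\n]*", "", without_blocks)
    return ENTRY_PATTERN.findall(without_comments)


def field_value(entry_name, prefix="hadoop_"):
    """Derive the value to enter in the Studio from an entry name.

    Assumption: the Studio expects the entry name without the 'hadoop_' prefix.
    """
    if entry_name.startswith(prefix):
        return entry_name[len(prefix):]
    return entry_name


# Illustrative content only; the module class names are hypothetical.
SAMPLE = """\
/* sample login configuration */
hadoop_kerberos {
  example.KerberosLoginModule required;
};

hadoop_hybrid {
  example.HybridLoginModule required;
};
"""

if __name__ == "__main__":
    for entry in login_entries(SAMPLE):
        print(entry, "->", field_value(entry))
```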

The following image shows an example of these advanced authentication parameters from the tHDFSConnection component.