MapR Tips for starting with a MapR 5.0.0 sandbox

author
Frédérique Martin Sainte-Agathe
EnrichVersion
6.4
6.3
6.2
6.1
EnrichProdName
Talend Open Studio for Big Data
Talend Big Data Platform
Talend Data Fabric
Talend Big Data
Talend Real-Time Big Data Platform
task
Design and Development > Designing Jobs > Hadoop distributions > MapR
EnrichPlatform
Talend Studio

This article provides step-by-step instructions for setting up a MapR 5.0.0 sandbox as a demo cluster that you can connect to Talend Studio and use to run your first Big Data Jobs.

Environment

This article was written and tested against a MapR 5.0.0 sandbox.

Installing and configuring the MapR sandbox

Download the sandbox and follow the instructions on the MapR Sandbox page.

If you don't want to use the default MapR user provided in the sandbox, create a new user in your VM.

Connect to the VM as the root user and run the following commands, replacing username with the name of the new user:

# Create the user (user ID 1000) and its home and test directories on the cluster
useradd --uid 1000 username
mkdir /mapr/demo.mapr.com/user/username
chown username:username /mapr/demo.mapr.com/user/username
mkdir /mapr/demo.mapr.com/user/distro_test
chown username:username /mapr/demo.mapr.com/user/distro_test

# Create the YARN staging and Job history directories, owned by the mapr user
STAGING=/tmp/hadoop-yarn/staging
HISTORY=/var/mapr/cluster/yarn/rm/staging/history
hadoop fs -mkdir -p "$STAGING"
hadoop fs -chown -R mapr:mapr "$STAGING"
hadoop fs -chmod -R 777 "$STAGING"
hadoop fs -mkdir -p "$HISTORY/done_intermediate"
hadoop fs -mkdir -p "$HISTORY/done"
hadoop fs -chown -R mapr:mapr "$HISTORY/"
# 1777 sets the sticky bit on the intermediate history directory
hadoop fs -chmod -R 1777 "$HISTORY/done_intermediate"
hadoop fs -chmod -R 750 "$HISTORY/done"
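
The user-specific commands above can be reviewed before they are run as root. The following is a minimal sketch, assuming a POSIX shell on the sandbox VM. The function name print_user_setup is hypothetical and is not part of MapR or Talend. It only prints the commands for a given user name and user ID; it does not execute anything.

```shell
#!/bin/sh
# Hypothetical helper (not a MapR or Talend tool): prints the per-user
# setup commands from this article so they can be reviewed first.
print_user_setup() {
    user="$1"                        # login name to create
    uid="$2"                         # numeric user ID, for example 1000
    base="/mapr/demo.mapr.com/user"  # cluster path used in this article

    echo "useradd --uid $uid $user"
    # One home directory for the user plus the distro_test directory
    for dir in "$user" distro_test; do
        echo "mkdir $base/$dir"
        echo "chown $user:$user $base/$dir"
    done
}

# Dry run with the placeholder values used in this article
print_user_setup "${1:-username}" "${2:-1000}"
```

Once the printed commands look right, they could be applied by piping the output to sh as root.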

Installing the MapR client for Windows

You also need to install the MapR client, as detailed on the Setting Up the Client page.

After installation, update the core-site.xml file, located in C:\opt\mapr\hadoop\hadoop-2.7.0\etc\hadoop, by adding the following properties inside the <configuration> element:

<property>
  <name>hadoop.spoofed.user.uid</name>
  <value>2000</value>
</property>

<property>
  <name>hadoop.spoofed.user.gid</name>
  <value>2000</value>
</property>

<property>
  <name>hadoop.spoofed.user.username</name>
  <value>mapr</value>
</property>

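This configuration is valid for the default mapr user provided in the sandbox. For a different user, replace the 2000 values with that user's user ID and group ID (for example, 1000 for a user created with --uid 1000) and set hadoop.spoofed.user.username to that user's name.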
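
If you are unsure which IDs to use, they can be read from the sandbox itself. The following is a minimal sketch, assuming it is run in a POSIX shell on the sandbox VM where the account exists. The function name print_spoof_properties is hypothetical. It looks up the user ID and primary group ID with the standard id command and prints the three properties in the form shown above, ready to be pasted into core-site.xml on the Windows client.

```shell
#!/bin/sh
# Hypothetical helper (not a MapR or Talend tool): prints the three
# spoofing properties for an existing account on this machine.
print_spoof_properties() {
    user="$1"
    uid="$(id -u "$user")" || return 1   # numeric user ID
    gid="$(id -g "$user")" || return 1   # numeric primary group ID

    # Each pair is "<property suffix>:<value>"
    for pair in "uid:$uid" "gid:$gid" "username:$user"; do
        printf '<property>\n'
        printf '  <name>hadoop.spoofed.user.%s</name>\n' "${pair%%:*}"
        printf '  <value>%s</value>\n' "${pair#*:}"
        printf '</property>\n'
    done
}

# Example call on the sandbox: print_spoof_properties mapr
```

The function returns a non-zero status without printing any property if the account does not exist.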

Save the file and reboot the sandbox.

Verifying the MapR client installation

As part of the MapR client setup, you should have created an environment variable named MAPR_HOME.

In a Windows terminal, execute the following commands to verify that HDFS works:

> cd %MAPR_HOME%
> cd hadoop\hadoop-2.7.0\bin
> hadoop fs -ls /user/mapr

The command should list the contents of the /user/mapr directory without errors.

Still in your Windows terminal, execute the following commands to verify that YARN works:

> yarn jar C:\opt\mapr\hadoop\hadoop-2.7.0\share\hadoop\mapreduce\hadoop-mapreduce-examples-2.7.0-mapr-1501.jar pi 16 1000

The Job should run to completion and print an estimated value of Pi.
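
The console output of the pi example is long, so it can help to check only for its final result line. The following is a minimal sketch, assuming the output has been saved to a file and that a POSIX shell is available (for example, on the sandbox VM). The function name pi_job_succeeded and the file name pi.log are hypothetical. The check relies on the "Estimated value of Pi is" line that the Hadoop pi example prints when it finishes.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the text on standard input contains
# the result line printed by the Hadoop pi example on completion.
pi_job_succeeded() {
    grep -q 'Estimated value of Pi is'
}

# Example, assuming the yarn output was saved to pi.log:
#   pi_job_succeeded < pi.log && echo "YARN check passed"
```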

You can also connect to Hue to verify that your Job is running.

You can now create metadata in the Talend Studio to connect to the MapR sandbox.

Related Articles

MapR: Connecting a MapR distribution to the Talend Studio using cluster metadata