How to view the data stored in Hadoop from the Studio - 6.2

Talend Data Fabric Studio User Guide

Warning

The information in this section is for subscription-based Talend Studio users only and is not applicable to Talend Open Studio for Big Data users.

When designing or executing a Job, you often need to view the source data to be processed or the output data produced by the execution. The Data viewer lets you access this data directly from the Studio, without any additional tools.

In most cases, the only prerequisite is a properly configured connection from the Job to the Hadoop distribution where the data of interest is stored.

  • To view the source data, right-click the input component you are using in the workspace, for example tHDFSInput, and select Data viewer from the contextual menu.

  • To view the output data after an execution, right-click the output component you are using in the workspace, for example tHDFSOutput, and select Data viewer from the contextual menu.

The following image shows an example of the Data viewer displaying data read from a remote Hadoop server. The schema of the data is defined in the component from which the Data viewer is called.

Note that if you are using a MapR Hadoop distribution, the MapR client must be installed on the machine where the Studio runs, and you need to set the -Djava.library.path argument so that the Studio can access the native library of that MapR client. To set this argument, proceed as follows:

  1. From the menu bar of the Studio, click Window > Preferences to open the [Preferences] dialog box.

  2. Expand Talend and select Run/Debug.

  3. In the Job Run VM arguments area, click New to display the [Set the VM Argument] dialog box.

  4. Set the -Djava.library.path argument. For example, enter -Djava.library.path=C:\opt\mapr\lib\native\Windows_7 if you are using Windows 7, or -Djava.library.path=/opt/mapr/lib if you are using Linux.

  5. Click OK to close this dialog box. The argument is added to the Argument table.

  6. Click Apply to validate the changes, then click OK to close the [Preferences] dialog box.
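If the Data viewer still cannot load the MapR native library, it can help to check what a JVM actually receives in java.library.path. The following is a minimal sketch and not part of the Studio: the class name LibraryPathCheck and the helper splitEntries are hypothetical, and the sketch assumes you run it yourself with the same -Djava.library.path value that you entered in the [Set the VM Argument] dialog box. It only reports whether each directory in the path exists; it does not try to load any MapR library.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper class, not shipped with the Studio or with the MapR client.
final class LibraryPathCheck {

    // Splits a java.library.path value into its non-empty, trimmed entries.
    static List<String> splitEntries(String libraryPath, String separator) {
        List<String> entries = new ArrayList<>();
        if (libraryPath == null) {
            return entries;
        }
        for (String entry : libraryPath.split(Pattern.quote(separator))) {
            String trimmed = entry.trim();
            if (!trimmed.isEmpty()) {
                entries.add(trimmed);
            }
        }
        return entries;
    }

    public static void main(String[] args) {
        // The value set through -Djava.library.path, or the JVM default if the argument is absent.
        String libraryPath = System.getProperty("java.library.path");

        // File.pathSeparator is ";" on Windows and ":" on Linux.
        for (String entry : splitEntries(libraryPath, File.pathSeparator)) {
            boolean exists = new File(entry).isDirectory();
            System.out.println(entry + " -> " + (exists ? "directory found" : "directory missing"));
        }
    }
}
```

As a usage example, compile the class with javac LibraryPathCheck.java and then run java -Djava.library.path=/opt/mapr/lib LibraryPathCheck on Linux. A "directory missing" line for the MapR path suggests that the argument points to a location where the client is not installed.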