Writing data to HDFS using metadata - 8.0

First steps using Big Data in Talend Studio

Version: 8.0
Language: English
Product: Talend Big Data, Talend Big Data Platform, Talend Data Fabric, Talend Open Studio for Big Data, Talend Real-Time Big Data Platform
Module: Talend Studio
Content: Design and Development > Designing Jobs > Hadoop distributions
Last publication date: 2024-02-06

Using the tHDFSOutput component, you can write data to HDFS.

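Before you begin

The Repository contains Hadoop cluster metadata with an HDFS connection defined under Metadata > Hadoop Cluster.
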
Procedure

  1. In the Repository, expand Metadata > Hadoop Cluster, then expand the Hadoop cluster metadata of your choice.
    1. Drag and drop the HDFS metadata onto the Designer.
      The Components window opens.
    2. Select a tHDFSOutput component.
  2. Add an input component.

    Example

    Add a tRowGenerator component to generate fictional data for testing purposes (see Generating random data).
  3. Right-click the input component.
    1. Select Row > Main.
    2. Click the tHDFSOutput component to link the two components.
  4. Double-click the tHDFSOutput component.

    The component is already configured with the predefined HDFS metadata connection information.

  5. In the File Name field, enter the file path and name of your choice.
  6. Optional: In the Action list, select Overwrite to replace the file if it already exists.

Results

Your input component (such as the tRowGenerator component) supplies the data, and the tHDFSOutput component writes it to HDFS through the connection defined in the metadata.
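
To make the data flow of this Job easier to picture, here is a minimal Python sketch. It is not the code that Talend Studio generates, and it writes to a local path rather than to HDFS. The function names generate_rows and write_rows, the two-column row layout, and the semicolon separator are hypothetical choices made for illustration. The sketch also assumes that a "create" action fails when the target file already exists, whereas "overwrite" replaces it.

```python
import os
import random
import string
import tempfile


def generate_rows(count, seed=0):
    """Stand-in for the input component: yield (id, name) rows of random data."""
    rng = random.Random(seed)
    for row_id in range(1, count + 1):
        name = "".join(rng.choice(string.ascii_lowercase) for _ in range(6))
        yield (row_id, name)


def write_rows(rows, file_name, action="create", separator=";"):
    """Stand-in for the output component, using a local file instead of HDFS.

    Assumption: action="create" refuses to touch an existing file,
    while action="overwrite" replaces its content.
    Returns the number of rows written.
    """
    if action not in ("create", "overwrite"):
        raise ValueError("action must be 'create' or 'overwrite'")
    # Mode "x" raises FileExistsError if the file exists; "w" truncates it.
    mode = "x" if action == "create" else "w"
    written = 0
    with open(file_name, mode, encoding="utf-8", newline="") as out:
        for row in rows:
            out.write(separator.join(str(field) for field in row) + "\n")
            written += 1
    return written


if __name__ == "__main__":
    # Hypothetical target path in a temporary directory.
    target = os.path.join(tempfile.mkdtemp(), "out.txt")
    count = write_rows(generate_rows(5), target, action="overwrite")
    print(f"wrote {count} rows to {target}")
```

Running the sketch twice against the same path with action="create" raises FileExistsError on the second run, which illustrates why selecting Overwrite is useful when you execute a Job repeatedly during testing.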