Writing data to HDFS using metadata

Using the tHDFSOutput component, you can write data to HDFS.

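Before you begin

The Repository must already contain Hadoop cluster metadata with an HDFS connection defined under it.
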
Procedure

  1. In the Repository, expand Metadata > Hadoop Cluster, then expand the Hadoop cluster metadata of your choice.
    1. Drag and drop the HDFS metadata onto the Designer.
      The Components window opens.
    2. Select a tHDFSOutput component.
  2. Add an input component.

    Example

    Add a tRowGenerator component to generate fictional data for testing purposes (see Generating random data).
  3. Right-click the input component.
    1. Select Row > Main.
    2. Click the tHDFSOutput component to link the two.
  4. Double-click the tHDFSOutput component.

    The component is already configured with the predefined HDFS metadata connection information.

  5. In the File Name field, enter the file path and name of your choice.
  6. Optional: In the Action list, select Overwrite to replace the file if it already exists.

Results

Your input component (such as the tRowGenerator component) generates or reads the data, and the tHDFSOutput component writes it to HDFS through the connection defined in the metadata.
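
The data flow of this Job can be sketched outside Talend Studio as follows. This is a minimal illustration, not the code that the Studio generates: a local file stands in for the HDFS target, and the names generate_rows and write_rows, the semicolon delimiter, and the "create" mode (which is assumed here to refuse an existing file) are hypothetical choices made for the example. Only the Overwrite action comes from the procedure above.

```python
import csv
import os
import random
import tempfile


def generate_rows(count, seed=0):
    """Stand-in for an input component such as tRowGenerator: yields fictional rows."""
    rng = random.Random(seed)
    for row_id in range(1, count + 1):
        yield (row_id, f"user_{rng.randint(1000, 9999)}")


def write_rows(path, rows, action="create"):
    """Stand-in for the output step: write delimited rows to `path`.

    action="create" refuses to replace an existing file (assumption);
    action="overwrite" replaces it, like the Overwrite option in step 6.
    Returns the number of rows written.
    """
    if action not in ("create", "overwrite"):
        raise ValueError(f"unsupported action: {action}")
    # Mode "x" raises FileExistsError if the file exists; "w" truncates it.
    mode = "x" if action == "create" else "w"
    written = 0
    with open(path, mode, newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle, delimiter=";")
        for row in rows:
            writer.writerow(row)
            written += 1
    return written


if __name__ == "__main__":
    # A local temporary file plays the role of the path entered in File Name.
    target = os.path.join(tempfile.mkdtemp(), "out.csv")
    print(write_rows(target, generate_rows(5)))
    print(write_rows(target, generate_rows(3), action="overwrite"))
```

Running the Job a second time against the same file name is the case where the Overwrite action matters: in this sketch, the second call succeeds only because action="overwrite" is passed.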
