Reading a file in Talend Studio - 8.0

Talend Studio provides a wide array of components for accessing your data. In this tutorial, you learn how to read data from a standard comma-separated values (.csv) file.

Creating a Talend Studio project

Creating a project is the first step to using Talend Studio. Projects allow you to better organize your work.

Procedure

  1. Select Create a new project.
  2. Enter a name for your project.

    Example

    TalendDemo
  3. Click Create.
  4. Click Finish.

Results

Your project opens. You are ready to work in Talend Studio.

Creating a Job to read a delimited file

Talend Studio projects contain Jobs. In Jobs, you can build workflows through components, which allow you to complete specific actions.

Before you begin

Select the Integration perspective (Window > Perspective > Integration).

Procedure

  1. In Repository, right-click Job Designs.
    1. Click Create Standard Job.
  2. In the Name field, enter a name.

    Example

    readCSV
  3. Optional: In the Purpose field, enter a purpose.

    Example

    Read a .csv file
  4. Optional: In the Description field, enter a description.

    Example

    This tutorial uses a component to read a .csv file
    Tip: Enter a Purpose and Description to stay organized.
  5. Click Finish.

Results

The Designer opens an empty Job.

Configuring a component to read a delimited file

Components are the building blocks that you add to a Job, and each one performs a specific action. For example, the tFileInputDelimited component reads a delimited file.

Before you begin

This tutorial makes use of a .csv file. If you do not have a .csv file, click the Downloads tab and save customers_unordered.csv.

Procedure

  1. Click inside the Designer.
  2. Enter tFileInputDelimited and select the component of the same name.
  3. In the Designer, double-click the tFileInputDelimited component.
    1. Click the […] button next to the File Name/Stream field.
    2. Select the file of your choice in the File Explorer.
    3. Optional: Check your file's Field Separator and change it, if needed.
      Note: The default Field Separator is ";". If your file is comma-separated, change it to ",".

Results

You have added a tFileInputDelimited component and selected a file to be read.
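
If you are unsure which Field Separator your file uses, you can check it outside Talend Studio before configuring the component. The following is a minimal Python sketch, not Talend code. The function name guess_separator and the sample text are hypothetical, and csv.Sniffer from the Python standard library is a heuristic that may guess wrong on unusual files.

```python
import csv

def guess_separator(sample: str, candidates: str = ",;\t|") -> str:
    """Return the most likely field separator for a chunk of delimited text."""
    # csv.Sniffer inspects the sample and raises csv.Error if it cannot decide.
    dialect = csv.Sniffer().sniff(sample, delimiters=candidates)
    return dialect.delimiter

# Hypothetical sample; in practice, pass the first few lines of your own file.
sample = "First;Last;City\nAnna;Smith;Boston\nLuis;Gomez;Austin\n"
print(guess_separator(sample))
```

Whatever character the check reports is the value to enter in the Field Separator field.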

Defining a component schema to read a delimited file

The component schema describes the columns of your delimited file: their names and their data types. Defining it allows the component to parse each row into typed fields.

Before you begin

You must have added and configured a tFileInputDelimited component (see Configuring a component to read a delimited file).

Procedure

  1. In the Designer, double-click the tFileInputDelimited component.
  2. Click the […] button next to Edit schema.
    The Schema wizard opens.
  3. Click the plus button to add a Column.
    1. Add as many columns as there are headers in your .csv file.
      Note: Headers are the column names listed in the first row of a .csv file.
    2. Enter the name of each Column.
      Column names must be identical to header names.

      Example

      • First
      • Last
      • Number
      • Street
      • City
      • State
    3. Select the Type of each Column.
      Tip: Select the String Type for a postcode. A postcode is not used in arithmetic, and a numeric type would drop any leading zeros.
  4. Click OK.

Results

You have defined the schema of your file.
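
To illustrate what a schema does, the sketch below pairs each column name with a type and converts one row of raw text accordingly. It is plain Python, not the code Talend Studio generates. The name apply_schema, the sample row, and the choice of an integer type for the Number column are assumptions made for illustration.

```python
# Hypothetical schema: (column name, converter) pairs in file order.
SCHEMA = [
    ("First", str),
    ("Last", str),
    ("Number", int),   # assumed to be numeric for this example
    ("Street", str),
    ("City", str),
    ("State", str),
]

def apply_schema(fields, schema=SCHEMA):
    """Convert one row of raw text fields into a dict of typed values."""
    if len(fields) != len(schema):
        raise ValueError(f"expected {len(schema)} fields, got {len(fields)}")
    # Fields are matched to schema columns in order.
    return {name: convert(value) for (name, convert), value in zip(schema, fields)}

row = apply_schema(["Anna", "Smith", "12", "Main St", "Boston", "MA"])
print(row["Number"] + 1)  # Number is an int, so arithmetic works

# Why a postcode is better kept as a string: a numeric type loses the leading zero.
print(int("02134"), str("02134"))
```

The last line shows the reason behind the postcode tip: converting "02134" to a number yields 2134, while the string keeps all five characters.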

Reading a delimited file and displaying its content in the console

To display the output of a component, link it to a tLogRow component. The tLogRow component prints the data it receives in the Run console.

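Before you begin

You must have defined the schema of the tFileInputDelimited component (see Defining a component schema to read a delimited file).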

Procedure

  1. In the Designer, add a tLogRow component.
  2. Right-click the tFileInputDelimited component.
    1. Select Row > Main.
    2. Click on the tLogRow component to link the two.
  3. In the Run view, click Run.

Results

The tFileInputDelimited component reads your delimited file and the tLogRow component displays its content in the console.
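
For readers who want to see the same flow in code, the sketch below mimics the two-component Job in plain Python: one function reads rows from delimited text and another prints each row. It is an analogy under stated assumptions, not what Talend Studio generates. The names read_delimited and log_row, the header_rows parameter, the "|" output separator, and the in-memory sample data are all hypothetical.

```python
import csv
import io

def read_delimited(stream, separator=";", header_rows=1):
    """Yield each data row of a delimited text stream as a list of strings."""
    reader = csv.reader(stream, delimiter=separator)
    for index, fields in enumerate(reader):
        if index < header_rows:
            continue  # skip the header line(s)
        yield fields

def log_row(fields, separator="|"):
    """Print one row to the console, with values joined by a separator."""
    line = separator.join(fields)
    print(line)
    return line

# Hypothetical in-memory data standing in for a .csv file on disk.
data = io.StringIO("First;Last;City\nAnna;Smith;Boston\nLuis;Gomez;Austin\n")
for fields in read_delimited(data):
    log_row(fields)
```

Passing rows from the reader to the printer one at a time plays the role of the Row > Main link between the two components.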