Joining two data sources with the tMap component in Talend Studio - 8.0

In this tutorial, discover how to join two data sources with the tMap component in Talend Studio.

This tutorial makes use of a .csv file. If you do not have a .csv file, click the Downloads tab and save movies.csv.

This tutorial also makes use of another delimited file. If you do not have another delimited file, click the Downloads tab and save directors.txt.

Creating a Talend Studio project

Creating a project is the first step to using Talend Studio. Projects allow you to better organize your work.

Procedure

  1. Select Create a new project.
  2. Enter a name for your project.

    Example

    TalendDemo
  3. Click Create.
  4. Click Finish.

Results

Your project opens. You are ready to work in Talend Studio.

Creating a Job to join data sources

Talend Studio projects contain Jobs. In Jobs, you can build workflows through components, which allow you to complete specific actions.

Before you begin

Select the Integration perspective (Window > Perspective > Integration).

Procedure

  1. In Repository, right-click Job Designs.
    1. Click Create Standard Job.
  2. In the Name field, enter a name.

    Example

    tMapJoin
  3. Optional: In the Purpose field, enter a purpose.

    Example

    Joining two different data sources in Talend Studio
  4. Optional: In the Description field, enter a description.

    Example

    Using the tMap component to turn two different data sources into one
    Tip: Enter a Purpose and Description to stay organized.
  5. Click Finish.

Results

The Designer opens an empty Job.

Data joining using the tMap component

The tMap component allows you to transform and route data from single or multiple sources to single or multiple destinations.

Creating a metadata definition for the tMap component

Creating a metadata definition allows you to set up reusable information across all of your components.

Before you begin

This tutorial makes use of a delimited file. If you do not have a delimited file, click the Downloads tab and save directors.txt.

Procedure

  1. In the Repository, expand Metadata, right-click File delimited, then click Create file delimited.
  2. In the Name field, enter a name.

    Example

    directors
  3. Optional: In the Purpose field, enter a purpose.

    Example

    Joining the directors data to the movies database
  4. Optional: In the Description field, enter a description.

    Example

    Reusable shareable directors metadata
    Tip: Enter a Purpose and Description to stay organized.
  5. Click Next.
  6. Click Browse, then select the file of your choice in the File Explorer.
  7. Optional: Define the parse settings.

    Example

    • Under File Settings, select your Field Separator and change it, if needed.
      Note: The most common Field Separator is the semicolon (;).
    Tip: Under Preview, click Refresh Preview to check the parsing results.
  8. Click Next.
  9. Optional: In the Name field, enter a name.

    Example

    directorsSchema
  10. Update the Schema so it is identical to the structure of the sample file.

    Example

    • Change the name of Column0 to directorID and the name of Column1 to directorName.
    • Change the Length of directorID to 4 and the Length of directorName to 40.
  11. Click Finish.

Results

In the Repository, under Metadata, you can find and use your metadata.
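
To make the parse settings and schema above more concrete, the following is a minimal Java sketch of what they describe: each line is split on the Field Separator and the two fields are mapped to directorID and directorName. It is not the code Talend Studio generates. The class and method names (DirectorsSchemaSketch, parseLine, parseAll) and the sample rows are hypothetical, a semicolon separator is assumed, and the Length values set in the schema are not modeled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class DirectorsSchemaSketch {

    /** One row matching the two-column schema (directorID, directorName). */
    static final class Director {
        final String directorID;
        final String directorName;

        Director(String directorID, String directorName) {
            this.directorID = directorID;
            this.directorName = directorName;
        }
    }

    /** Splits one line on the field separator and maps the fields to the schema columns. */
    static Director parseLine(String line, String fieldSeparator) {
        // Pattern.quote treats the separator as literal text; -1 keeps trailing empty fields.
        String[] fields = line.split(Pattern.quote(fieldSeparator), -1);
        if (fields.length != 2) {
            throw new IllegalArgumentException(
                    "Expected 2 fields but found " + fields.length + ": " + line);
        }
        return new Director(fields[0].trim(), fields[1].trim());
    }

    /** Parses every non-blank line into a Director row. */
    static List<Director> parseAll(List<String> lines, String fieldSeparator) {
        List<Director> rows = new ArrayList<>();
        for (String line : lines) {
            if (!line.isBlank()) {
                rows.add(parseLine(line, fieldSeparator));
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        // Hypothetical sample rows; the real content of directors.txt may differ.
        List<String> sample = List.of("1;Jane Example", "2;John Placeholder");
        for (Director d : parseAll(sample, ";")) {
            System.out.println(d.directorID + " -> " + d.directorName);
        }
    }
}
```

If the preview in the wizard shows everything in a single column, the equivalent failure in this sketch is the "Expected 2 fields" error, which usually means the Field Separator does not match the file.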

Configuring a tMap component to join two data sources

The tMap component allows you to transform and route data from single or multiple sources to single or multiple destinations. In this case, discover how to join two data sources.
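
As a conceptual aid, the join can be pictured as a lookup: the directors are loaded into a map keyed by ID, and each movie row from the main flow is enriched with the matching director name. The sketch below is an illustration under stated assumptions, not the code Talend Studio generates. It assumes that the movies data carries a directorID column to join on, and the class names, field names, and sample values are hypothetical. It keeps movies that have no matching director (a left outer join, one of the join models tMap offers).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class LookupJoinSketch {

    /** Main-flow row. Using directorID as the join key is an assumption. */
    static final class Movie {
        final String movieID;
        final String title;
        final String directorID;

        Movie(String movieID, String title, String directorID) {
            this.movieID = movieID;
            this.title = title;
            this.directorID = directorID;
        }
    }

    /** Output row combining columns from both sources. */
    static final class JoinedRow {
        final String movieID;
        final String title;
        final String directorName;

        JoinedRow(String movieID, String title, String directorName) {
            this.movieID = movieID;
            this.title = title;
            this.directorName = directorName;
        }
    }

    /** Joins each movie to its director name through a map lookup. */
    static List<JoinedRow> join(List<Movie> movies, Map<String, String> directorNameById) {
        List<JoinedRow> out = new ArrayList<>();
        for (Movie m : movies) {
            // Left outer join: a movie without a match is kept with a null directorName.
            String name = directorNameById.get(m.directorID);
            out.add(new JoinedRow(m.movieID, m.title, name));
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical sample data for illustration only.
        Map<String, String> directors = new HashMap<>();
        directors.put("10", "Jane Example");

        List<Movie> movies = List.of(
                new Movie("1", "First Film", "10"),
                new Movie("2", "Second Film", "99"));

        for (JoinedRow r : join(movies, directors)) {
            System.out.println(r.movieID + ";" + r.title + ";" + r.directorName);
        }
    }
}
```

Loading the smaller source into a map and streaming the larger one past it keeps each lookup cheap, which is why the sketch treats directors as the lookup side and movies as the main flow.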

About this task

For the sake of demonstration, this tutorial uses two different metadata definitions: movies 0.1 and directors 0.1. To follow this tutorial, you can:
  1. Click the Downloads tab and save metadata_movies_directors.zip.
  2. In the Repository, expand Metadata, right-click File delimited, then click Import items.
  3. Select the Select archive file option, then click Browse and select metadata_movies_directors.zip.
  4. Select movies 0.1 and directors 0.1.
  5. Click Finish.

Alternatively, you can create both metadata definitions yourself (see Creating a metadata definition for the tMap component).

Procedure

  1. Drag and drop the movies 0.1 and directors 0.1 metadata onto the Designer.
    1. In both cases, select a tFileInputDelimited component.
  2. Add a tMap component.
  3. Right-click the movies component.
    1. Select Row > Main.
    2. Click the tMap component to link the two.
  4. Repeat step 3 and its substeps for the directors component.
  5. Double-click the tMap component.
    You are brought to the tMap component configuration window.
  6. On the right side of the screen, click Add output table.
  7. Enter a name for your output table.

    Example

    joinedOutput
  8. Click OK.
  9. In input table row1, select the columns movieID, title, releaseYear, and url, then drag and drop them into output table joinedOutput.