Blending data - 7.3

Talend Data Preparation Getting Started Guide

Version: 7.3
Language: English
Product: Talend Big Data, Talend Big Data Platform, Talend Data Fabric, Talend Data Integration, Talend Data Management Platform, Talend Data Services Platform, Talend ESB, Talend MDM Platform, Talend Real-Time Big Data Platform
Module: Talend Data Preparation
Content: Data Quality and Preparation > Cleansing data
Last publication date: 2023-01-05

The Lookup feature allows you to take data from an existing dataset and add it to your preparation.

This example assumes that:

  • You have retrieved the states.csv file from the Downloads tab of the documentation page.
  • You have added states.csv to your list of datasets in Talend Data Preparation. For more information about how to import a dataset, see Opening a dataset from a local file.

In this example, you want to add geographical information about your customers using a reference file, the States dataset. This dataset lists the US state codes and their corresponding regions. You will use its data dynamically to complement your preparation, adding each customer's subscription region based on their state code.

To blend the data from another dataset in your preparation, proceed as follows:

Procedure

  1. Click the header of the State column to select its content.
  2. Click the Lookup icon in the upper part of the screen.
    The Add data from lookup panel opens at the bottom of the screen.

  3. Click the + icon to select the dataset you want to add.
    The list of previously imported datasets opens. In your case, only States is available.
  4. Select the check box next to States and then click Add.
    The States dataset opens in the bottom part of the screen. It contains only two columns, one of which, State, also exists in your current preparation.
  5. Select the State column in both your preparation and the dataset, so that both are highlighted in blue.
    Your preparation and the dataset can only be linked if they share a column containing the same information, in this case the US state codes.
  6. In the States dataset, select the Add to Dataset check box under the Region column header to add it to your current preparation.
  7. Hover over the Confirm button to preview the changes.
  8. Click the Confirm button to apply the changes and add the Region column to your preparation.

Results

Your data now includes the subscription region of each customer, extracted from the States reference file.
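
Conceptually, the lookup performed in this procedure resembles a left join on the State column. The following Python sketch illustrates the idea outside of Talend Data Preparation; it is not how the product implements the feature. The sample rows, the Name column, the region values, and the `lookup_region` helper are all hypothetical, and leaving the Region empty for an unmatched state code is an assumption made for this illustration only.

```python
import csv
import io

# Hypothetical sample data: the actual contents of the customer
# preparation and of states.csv are not reproduced here.
CUSTOMERS_CSV = """Name,State
Alice,CA
Bob,NY
Carol,ZZ
"""

STATES_CSV = """State,Region
CA,West
NY,Northeast
"""


def lookup_region(customers, states, key="State", added="Region"):
    """Add the `added` column from `states` to every row of `customers`.

    Rows are matched on the shared `key` column. Every customer row is
    kept; rows without a match get an empty string (an assumption of
    this sketch, not documented product behavior).
    """
    region_by_state = {row[key]: row[added] for row in states}
    return [
        {**row, added: region_by_state.get(row[key], "")}
        for row in customers
    ]


customers = list(csv.DictReader(io.StringIO(CUSTOMERS_CSV)))
states = list(csv.DictReader(io.StringIO(STATES_CSV)))
blended = lookup_region(customers, states)

for row in blended:
    print(row)
```

As in step 5 of the procedure, the two datasets can only be linked because both contain a State column. The dictionary built from the reference data plays the role of the lookup dataset, and the added Region key corresponds to the column selected in step 6.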