Denormalizing on multiple columns - 7.3

Processing (Integration)

Version: 7.3
Language: English
Product: Talend Big Data, Talend Big Data Platform, Talend Data Fabric, Talend Data Integration, Talend Data Management Platform, Talend Data Services Platform, Talend ESB, Talend MDM Platform, Talend Real-Time Big Data Platform
Module: Talend Studio
Last publication date: 2024-02-21

Procedure

  1. Drop the following components from the Palette onto the design workspace: tFileInputDelimited, tDenormalize, and tLogRow.
  2. Connect all the components using Row > Main connections.
  3. On the tFileInputDelimited Basic settings panel, set the path to the file to be denormalized.
  4. Define the Row and Field separators, the Header and other information if required.
  5. The file schema consists of four columns: Name, FirstName, HomeCity, and WorkCity.
  6. In the tDenormalize component Basic settings, select the columns that contain the repetition. These are the columns whose values occur multiple times in the document. In this use case, FirstName, HomeCity and WorkCity are the columns against which the denormalization is performed.
  7. Add as many lines to the table as you need using the plus button, then select the relevant column in the drop-down list of each line.
  8. In the Delimiter column, define between double quotes the separator used to delimit the concatenated values. For the FirstName column, type in "#"; for HomeCity, type in "§"; and for WorkCity, type in "¤".
  9. Save your Job and press F6 to execute it.
    The console shows the denormalized values of each column concatenated using the delimiter defined for that column.
  10. Back in the tDenormalize component Basic settings, in the To denormalize table, select the Merge same value check box to remove the duplicate occurrences.
  11. Save your Job again and press F6 to execute it.

Results

This time, the console shows the results with no duplicate instances.
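
For readers who want to reason about the two runs outside Talend Studio, the following is a minimal Python sketch of the denormalization logic, not Talend's own implementation. It assumes that rows are grouped on the remaining column (Name), that values are joined in input order, and that Merge same value keeps only the first occurrence of each value. The function name `denormalize` and the sample rows are hypothetical and used for illustration only.

```python
def denormalize(rows, key, delimiters, merge_same_value=False):
    """Group rows on `key` and join the values of each column listed in
    `delimiters` with that column's delimiter (assumed behavior)."""
    grouped = {}  # dicts preserve insertion order, so groups keep input order
    for row in rows:
        bucket = grouped.setdefault(row[key], {col: [] for col in delimiters})
        for col in delimiters:
            value = row[col]
            # With merging enabled, skip values already collected for this group.
            if merge_same_value and value in bucket[col]:
                continue
            bucket[col].append(value)
    return [
        {key: group_key,
         **{col: delimiters[col].join(values) for col, values in cols.items()}}
        for group_key, cols in grouped.items()
    ]


# Hypothetical sample data; the real input file is not shown in this scenario.
rows = [
    {"Name": "Smith", "FirstName": "John", "HomeCity": "Paris", "WorkCity": "Lyon"},
    {"Name": "Smith", "FirstName": "Anna", "HomeCity": "Paris", "WorkCity": "Nice"},
    {"Name": "Jones", "FirstName": "Mark", "HomeCity": "Lille", "WorkCity": "Lille"},
]

# Delimiters as typed in the Delimiter column of the To denormalize table.
delimiters = {"FirstName": "#", "HomeCity": "§", "WorkCity": "¤"}

for merge in (False, True):
    print(f"Merge same value = {merge}")
    for record in denormalize(rows, "Name", delimiters, merge_same_value=merge):
        print("  ", record)
```

With this sample data, the first pass keeps the repeated home city for Smith (`Paris§Paris`), while the pass with merging enabled reduces it to a single `Paris`.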