Scenario 2: Denormalizing on multiple columns - 6.1

Talend Components Reference Guide

This scenario illustrates a Job that denormalizes multiple columns from a delimited file.

  • Drop the following components: tFileInputDelimited, tDenormalize, tLogRow from the Palette to the design workspace.

  • Connect all components using a Row main connection.

  • On the tFileInputDelimited Basic settings panel, set the path to the file to be denormalized.

  • Define the Row and Field separators, the Header and other information if required.

  • The file schema is made up of four columns: Name, FirstName, HomeTown, and WorkTown.

  • In the tDenormalize component Basic settings, select the columns that contain the repetition. These are the columns that are meant to occur multiple times in the document. In this use case, FirstName, HomeTown and WorkTown are the columns against which the denormalization is performed.

  • Add as many lines to the table as you need using the plus button. Then select the relevant columns in the drop-down list.

  • In the Delimiter column, enter the separator, between double quotes, that will separate the concatenated values. For the FirstName column, type in "#"; for HomeTown, type in "§"; and for WorkTown, type in "¤".

  • Save your Job and press F6 to execute it.

  • The result shows the denormalized values concatenated using a comma.

  • Go back to the tDenormalize component's Basic settings and, in the To denormalize table, select the Merge same value check box to remove the duplicate occurrences.

  • Save your Job again and press F6 to execute it.

This time, the console shows the results with no duplicate instances.
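
The effect of the two runs can be approximated outside the Studio with a short Python sketch. It is not Talend's implementation: the function name denormalize, the sample rows, and the choice of Name as the single grouping column are assumptions made for illustration. Only the column names and the delimiters "#", "§" and "¤" come from the steps above.

```python
# Illustrative sketch only: approximates what the Job does, not how
# tDenormalize is implemented. Sample rows are hypothetical.

# Delimiters per denormalized column, as typed in the Delimiter column.
DELIMITERS = {"FirstName": "#", "HomeTown": "§", "WorkTown": "¤"}

# Hypothetical input rows following the four-column schema.
ROWS = [
    {"Name": "Smith", "FirstName": "John", "HomeTown": "Boston", "WorkTown": "Boston"},
    {"Name": "Smith", "FirstName": "Anna", "HomeTown": "Boston", "WorkTown": "Salem"},
    {"Name": "Jones", "FirstName": "Mark", "HomeTown": "Austin", "WorkTown": "Austin"},
]


def denormalize(rows, key, delimiters, merge_same_value=False):
    """Group rows on `key` and join the other columns with their delimiter.

    With merge_same_value=True, a value already collected for a column
    within the same group is skipped, so duplicates appear only once.
    """
    grouped = {}  # key value -> {column: [values in input order]}
    for row in rows:
        bucket = grouped.setdefault(row[key], {col: [] for col in delimiters})
        for col in delimiters:
            value = row[col]
            if merge_same_value and value in bucket[col]:
                continue  # drop the duplicate occurrence
            bucket[col].append(value)

    result = []
    for key_value, columns in grouped.items():
        out = {key: key_value}
        for col, values in columns.items():
            out[col] = delimiters[col].join(values)
        result.append(out)
    return result


if __name__ == "__main__":
    for merge in (False, True):
        print(f"Merge same value = {merge}")
        for line in denormalize(ROWS, "Name", DELIMITERS, merge_same_value=merge):
            print(line)
```

With these sample rows, the first pass gives the Smith group a HomeTown value of "Boston§Boston", and the second pass, with merging enabled, reduces it to "Boston". The sketch groups on one column for brevity; a real Job may group on several columns at once.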