Denormalizing on one column - Cloud - 8.0

Procedure

  1. Drop the following components from the Palette to the design workspace: tFileInputDelimited, tDenormalize, tLogRow.
  2. Connect the components using Row > Main connections.
  3. In the Component view of tFileInputDelimited, set the file path to the file to be denormalized.
  4. Define the Header, Row Separator and Field Separator parameters.
    The input file schema is made of two columns, Parents and Children. In this example, the file has one header row and uses a semicolon as the field separator:
    Parents;Children
    Peter;John
    William;Mary
    Kate;Jack
    Chris;Liz
    Peter;Michael
    Kate;Caroline
  5. In the Basic settings of tDenormalize, define the column that contains the multiple values to be grouped.
    In this example, the column to denormalize is Children.
  6. Set the Delimiter used to separate the grouped values.
  7. Select the Merge same value check box if you know that some of the values to be grouped are strictly identical.
  8. Save your Job and press F6 to execute it.

Results

All values from the Children column are grouped by the value of the Parents column, so each parent appears on a single row.

|Parents|Children     |
|=------+------------=|
|Kate   |Jack;Caroline|
|Chris  |Liz          |
|Peter  |John;Michael |
|William|Mary         |
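
The grouping performed by this Job can be approximated outside Talend Studio with a short Python sketch. It is only an illustration of the logic, not the code that tDenormalize generates: the function name `denormalize` and its parameters `key`, `column`, `delimiter` and `merge_same_value` are hypothetical, and the sketch assumes a single grouping column. Rows come out in order of first appearance, which may differ from the tLogRow output above.

```python
import csv
import io

# Sample input from the procedure: one header row, ";" as field separator.
SAMPLE = """Parents;Children
Peter;John
William;Mary
Kate;Jack
Chris;Liz
Peter;Michael
Kate;Caroline
"""


def denormalize(rows, key, column, delimiter=";", merge_same_value=False):
    """Group the values of `column` by `key` and join each group with `delimiter`."""
    groups = {}  # dicts preserve insertion order, so keys stay in first-seen order
    for row in rows:
        values = groups.setdefault(row[key], [])
        # With merge_same_value, skip a value already collected for this key.
        if merge_same_value and row[column] in values:
            continue
        values.append(row[column])
    return [{key: k, column: delimiter.join(v)} for k, v in groups.items()]


# The first line is read as the header; fields are split on ";".
rows = list(csv.DictReader(io.StringIO(SAMPLE), delimiter=";"))
result = denormalize(rows, key="Parents", column="Children")

for row in result:
    print(f"{row['Parents']}|{row['Children']}")
```

Passing `merge_same_value=True` mimics the intent of the Merge same value check box: a value that occurs twice for the same parent is kept only once in the joined string.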