Scenario 1: Denormalizing on one column - 6.3

Talend Open Studio for Big Data Components Reference Guide

This scenario illustrates a Job denormalizing one column in a delimited file.

  • Drop the following components from the Palette onto the design workspace: tFileInputDelimited, tDenormalize, and tLogRow.

  • Connect the components using Row > Main connections.

  • In the Component view of tFileInputDelimited, set the file path of the file to be denormalized.

  • Define the Header, Row Separator and Field Separator parameters.

  • The input file schema is made of two columns, Fathers and Children.

  • In the Basic settings of tDenormalize, define the column that contains multiple values to be grouped.

  • In this use case, the column to denormalize is Children.

  • Set the Delimiter that separates the grouped values. Note that only one column can be denormalized.

  • Select the Merge same value check box if you know that some of the values to be grouped are strictly identical.

  • Save your Job and press F6 to execute it.

All values from the Children column (set as the column to denormalize) are grouped by the value of their Fathers column. The grouped values are separated by a comma.
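
The grouping this Job performs can be approximated outside the Studio with a short Python sketch. This is an illustration of the idea only, not the code Talend generates: the sample rows, the semicolon field separator, and the function names read_delimited and denormalize are all hypothetical, and the output order (first occurrence of each father) is an assumption rather than documented tDenormalize behavior.

```python
import csv
import io

# Hypothetical sample input; the actual file contents are not shown in this scenario.
SAMPLE = "Fathers;Children\nPaul;Mark\nPaul;Anna\nJohn;Lea\nPaul;Mark\n"


def read_delimited(text, field_separator=";", header=1):
    """Parse delimited text, skip `header` lines, and return (father, child) tuples."""
    reader = csv.reader(io.StringIO(text), delimiter=field_separator)
    rows = list(reader)[header:]
    return [(row[0], row[1]) for row in rows if row]


def denormalize(rows, delimiter=",", merge_same_value=False):
    """Group the second field by the first and join grouped values with `delimiter`."""
    groups = {}  # dicts preserve insertion order, so keys come out in first-seen order
    for key, value in rows:
        values = groups.setdefault(key, [])
        if merge_same_value and value in values:
            continue  # skip a value strictly identical to one already in the group
        values.append(value)
    return [(key, delimiter.join(values)) for key, values in groups.items()]


rows = read_delimited(SAMPLE)
for father, children in denormalize(rows, merge_same_value=True):
    print(f"{father}|{children}")
```

With merge_same_value left at False, the repeated Paul;Mark row would appear twice in Paul's grouped value, which mirrors the purpose of the Merge same value check box as described above.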