Creating a simple table analysis (Column Set Analysis) - 7.3

Talend Data Fabric Studio User Guide

You can analyze the content of a set of columns. This set can represent only some of the columns in the defined table or the table as a whole.

A column set analysis focuses on the column set as a whole (full records), not on separate columns as is the case with column analysis. The statistics presented in the analysis results (row count, distinct count, unique count and duplicate count) are computed on the combined values of each record across the whole data set; the values within each column are not analyzed separately.
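To illustrate what statistics on full records mean, here is a minimal Python sketch. It is not Talend code: the function name `column_set_statistics` and the sample data are hypothetical. It also assumes that the unique count is the number of records occurring exactly once and that the duplicate count is the number of distinct records occurring more than once, which this section does not spell out.

```python
from collections import Counter

def column_set_statistics(rows, columns):
    """Compute simple statistics on full records (hypothetical helper)."""
    # Project each row onto the selected columns; the tuple is the "full record".
    records = [tuple(row[c] for c in columns) for row in rows]
    occurrences = Counter(records)
    return {
        "row_count": len(records),
        "distinct_count": len(occurrences),
        # Assumption: unique = records that occur exactly once.
        "unique_count": sum(1 for n in occurrences.values() if n == 1),
        # Assumption: duplicate = distinct records that occur more than once.
        "duplicate_count": sum(1 for n in occurrences.values() if n > 1),
    }

# Made-up sample data for illustration only.
rows = [
    {"first_name": "Ann", "city": "Paris"},
    {"first_name": "Ann", "city": "Lyon"},
    {"first_name": "Bob", "city": "Paris"},
    {"first_name": "Ann", "city": "Paris"},
]
stats = column_set_statistics(rows, ["first_name", "city"])
print(stats)
```

In this sample, each column taken separately has only two distinct values, but the column set has three distinct records, because the combination of values is what gets compared.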

With the Java engine, you can also apply patterns to each column. The analysis results then give the number of records that match all the selected patterns together. For further information, see Adding patterns to the analyzed columns.
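The idea of records matching all the selected patterns together can be sketched as follows. This is an illustrative Python example, not how the Java engine is implemented: the function name `count_records_matching_all`, the sample rows and the regular expressions are all hypothetical, and it assumes that a value must match its pattern in full.

```python
import re

def count_records_matching_all(rows, patterns):
    """Count rows whose values match every column pattern (hypothetical helper)."""
    compiled = {col: re.compile(p) for col, p in patterns.items()}
    return sum(
        1
        for row in rows
        # A record is counted only if every patterned column matches.
        # Assumption: the whole value must match, hence fullmatch.
        if all(rx.fullmatch(str(row[col])) for col, rx in compiled.items())
    )

# Made-up sample data and patterns for illustration only.
rows = [
    {"zip": "75001", "email": "ann@example.com"},
    {"zip": "7500", "email": "bob@example.com"},
    {"zip": "69001", "email": "not-an-email"},
]
patterns = {"zip": r"\d{5}", "email": r"[^@\s]+@[^@\s]+\.[^@\s]+"}
matching = count_records_matching_all(rows, patterns)
print(matching)
```

Only the first record is counted here: the second fails the `zip` pattern and the third fails the `email` pattern, even though each of them matches one of the two patterns.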

Note: When you use the Java engine to run a column set analysis on large data sets or on data with many problems, it is advisable to define a maximum memory size threshold for executing the analysis; otherwise, you may end up with a Java heap error. For more information, see Defining the maximum memory size threshold.