Creating a simple table analysis (Column Set Analysis) - Cloud - 8.0

Talend Studio User Guide

Version: Cloud 8.0
Language: English
Product: Talend Big Data, Talend Big Data Platform, Talend Cloud, Talend Data Fabric, Talend Data Integration, Talend Data Management Platform, Talend Data Services Platform, Talend ESB, Talend MDM Platform, Talend Real-Time Big Data Platform
Module: Talend Studio
Content: Design and Development
Last publication date: 2024-02-29
Available in: Big Data Platform, Cloud API Services Platform, Cloud Big Data Platform, Cloud Data Fabric, Cloud Data Management Platform, Data Fabric, Data Management Platform, Data Services Platform, MDM Platform, Real-Time Big Data Platform

You can analyze the content of a set of columns. This set can consist of only some of the columns in the defined table or of all the columns in the table.

The analysis of a set of columns focuses on the column set as a whole (full records) and not on separate columns, as is the case with the column analysis. The statistics presented in the analysis results (row count, distinct count, unique count and duplicate count) are measured against the combined values of each record across the whole dataset. They do not analyze the values of each column separately.
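The difference between record-level and column-level statistics can be illustrated with a minimal Python sketch. It is not how Talend Studio computes its indicators internally. The function name `column_set_indicators` and the sample data are hypothetical, and the sketch assumes that unique count means records occurring exactly once and that duplicate count means distinct records occurring more than once.

```python
from collections import Counter

def column_set_indicators(rows, columns):
    """Compute record-level indicators over the selected columns.

    Assumed definitions (an illustration, not the product's code):
      - distinct count: number of different records
      - unique count: records that occur exactly once
      - duplicate count: distinct records that occur more than once
    """
    # Each record is the combination of values across the selected columns.
    records = [tuple(row[c] for c in columns) for row in rows]
    counts = Counter(records)
    return {
        "row_count": len(records),
        "distinct_count": len(counts),
        "unique_count": sum(1 for n in counts.values() if n == 1),
        "duplicate_count": sum(1 for n in counts.values() if n > 1),
    }

# Hypothetical sample table.
rows = [
    {"first": "Ann", "last": "Lee", "city": "Paris"},
    {"first": "Ann", "last": "Lee", "city": "Lyon"},
    {"first": "Ann", "last": "Lee", "city": "Paris"},
    {"first": "Bob", "last": "Ray", "city": "Paris"},
]

name_stats = column_set_indicators(rows, ["first", "last"])
full_stats = column_set_indicators(rows, ["first", "last", "city"])
print(name_stats)
print(full_stats)
```

Under these assumptions, the same table gives different results depending on the columns in the set: adding the `city` column splits the repeated name into two different records, so the distinct count rises from 2 to 3.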

With the Java engine, you can also apply patterns to each column. The analysis result then gives the number of records that match all the selected patterns together.
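The "all patterns together" rule can be sketched in Python as follows. This only illustrates the counting logic, under the assumption that a record counts as a match only when every column with a pattern matches its own pattern in full. The function name `count_records_matching_all`, the column names and the regular expressions are hypothetical and are not patterns shipped with the product.

```python
import re

def count_records_matching_all(rows, patterns):
    """Count records in which every patterned column matches its pattern.

    `patterns` maps a column name to a regular expression. A record is
    counted only if all of its patterned columns match (assumed rule).
    """
    compiled = {col: re.compile(rx) for col, rx in patterns.items()}
    return sum(
        1
        for row in rows
        if all(rx.fullmatch(str(row[col])) for col, rx in compiled.items())
    )

# Hypothetical patterns: a loose e-mail shape and a five-digit postal code.
patterns = {
    "email": r"[^@\s]+@[^@\s]+\.[a-z]{2,}",
    "zip": r"\d{5}",
}

rows = [
    {"email": "ann@example.com", "zip": "75001"},  # both columns match
    {"email": "bob@example", "zip": "69002"},      # e-mail does not match
    {"email": "cy@example.org", "zip": "ABC"},     # postal code does not match
]

matching = count_records_matching_all(rows, patterns)
print(matching)
```

In this sample only the first record is counted, because each of the other two fails one of its patterns even though its other column is valid.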

Note: When you use the Java engine to run a column set analysis on large data sets or on data with many problems, it is advisable to define a maximum memory size threshold for executing the analysis. Otherwise, the analysis may fail with a Java heap error. For more information, see Defining the maximum memory size threshold.