Advanced statistics - Cloud - 7.3

Talend Studio User Guide

Last publication date
2024-02-13
Available in:

  • Big Data Platform
  • Cloud API Services Platform
  • Cloud Big Data Platform
  • Cloud Data Fabric
  • Cloud Data Management Platform
  • Data Fabric
  • Data Management Platform
  • Data Services Platform
  • MDM Platform
  • Real-Time Big Data Platform

Advanced statistics indicators determine the most probable and the most frequent values and build frequency tables. The main advanced statistics include the following indicators:

  • Mode: computes the most probable value, that is, the value that occurs most often. For numerical or continuous data, you can set bins in the parameters of this indicator. The mode is different from the "average" and the "median", and it is well suited to categorical attributes.
  • Value Frequency: counts the records for each distinct value and returns the most frequent values.
  • All other Value Frequency indicators aggregate date and numerical data before counting (with respect to "date", "week", "month", "quarter", "year" and "bin").
  • Value Low Frequency: counts the records for each distinct value and returns the least frequent values.
  • All other Value Low Frequency indicators aggregate date and numerical data before counting (with respect to "date", "week", "month", "quarter", "year" and "bin"), where "bin" is the aggregation of numerical data by intervals.
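The indicators listed above can be illustrated with a minimal Python sketch. It is not how Talend Studio computes them: the function names (mode, value_frequency, value_low_frequency, bin_label), the sample data, and the bin width of 10 are all hypothetical and chosen only to show the idea of counting distinct values, optionally after aggregating them by bin or by month.

```python
from collections import Counter
from datetime import date


def mode(values):
    """Return the value that occurs most often (hypothetical helper)."""
    return Counter(values).most_common(1)[0][0]


def value_frequency(values, top=3):
    """Return the `top` most frequent distinct values with their counts."""
    return Counter(values).most_common(top)


def value_low_frequency(values, bottom=3):
    """Return the `bottom` least frequent distinct values with their counts."""
    counts = Counter(values)
    return sorted(counts.items(), key=lambda item: item[1])[:bottom]


def bin_label(number, width):
    """Map a number to the lower bound of its interval of size `width`."""
    return (number // width) * width


# Categorical data: the mode and the frequency tables work on raw values.
colors = ["red", "blue", "red", "green", "red", "blue"]
print(mode(colors))                           # red
print(value_frequency(colors, top=2))         # [('red', 3), ('blue', 2)]
print(value_low_frequency(colors, bottom=2))  # [('green', 1), ('blue', 2)]

# Numerical data: aggregate by bin (assumed width 10) before counting.
ages = [21, 25, 34, 38, 39, 52]
age_bins = [bin_label(age, 10) for age in ages]
print(value_frequency(age_bins))              # [(30, 3), (20, 2), (50, 1)]

# Date data: aggregate by (year, month) before counting.
dates = [date(2024, 1, 5), date(2024, 1, 20), date(2024, 2, 13)]
months = [(d.year, d.month) for d in dates]
print(value_frequency(months))                # [((2024, 1), 2), ((2024, 2), 1)]
```

Aggregating first and counting afterwards keeps the frequency logic identical for every variant: only the key computed from each record (the raw value, the bin, or the month) changes.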

The following table shows the indicators that you can select in any database:

Data type: Number, Text, Date, Others
Analysis engine type (for each data type): Java, SQL
Mode
Value (Low) Frequency
Date (Low) Frequency * *
Week (Low) Frequency * *
Month (Low) Frequency * *
Quarter (Low) Frequency * *
Year (Low) Frequency * *
Bin (Low) Frequency
* Except for the time data type