Advanced statistics - 7.1

Talend Real-time Big Data Platform Studio User Guide

Version: 7.1
Language: English (United States)
Product: Talend Real-Time Big Data Platform
Module: Talend Studio
Content: Design and Development

Advanced statistics indicators determine the most probable and the most frequent values and build frequency tables. The main advanced statistics indicators are the following:

  • Mode: computes the most probable value, that is, the value that occurs most often. For numerical or continuous data, you can set bins in the parameters of this indicator. The mode is different from the "average" and the "median", and it is well suited to categorical attributes.
  • Value Frequency: computes the number of occurrences of each distinct value and reports the most frequent values.
  • All other Value Frequency indicators aggregate date and numerical data (with respect to "date", "week", "month", "quarter", "year" and "bin") before counting.
  • Value Low Frequency: computes the number of occurrences of each distinct value and reports the least frequent values.
  • All other Value Low Frequency indicators aggregate date and numerical data (with respect to "date", "week", "month", "quarter", "year" and "bin") before counting, where "bin" is the aggregation of numerical data by intervals.
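
The indicators listed above can be illustrated with a minimal Python sketch. This is not how Talend Studio computes them. It only shows the underlying idea on in-memory lists. The function names (mode, value_frequency, value_low_frequency, to_bin, to_month), the top and bottom limits, and the fixed-width binning rule are all assumptions made for this example.

```python
from collections import Counter
from datetime import date


def value_frequency(values, top=10):
    """Return up to `top` (value, count) pairs, most frequent first."""
    return Counter(values).most_common(top)


def value_low_frequency(values, bottom=10):
    """Return up to `bottom` (value, count) pairs, least frequent first."""
    counts = Counter(values)
    return sorted(counts.items(), key=lambda pair: pair[1])[:bottom]


def mode(values):
    """Return the value that occurs most often."""
    return Counter(values).most_common(1)[0][0]


def to_bin(number, width):
    """Map a number to the lower bound of its fixed-width interval (hypothetical rule)."""
    return (number // width) * width


def to_month(day):
    """Aggregate a date to its (year, month) pair."""
    return (day.year, day.month)


# Categorical data: counts are taken on the raw values.
colors = ["red", "blue", "red", "green", "red", "blue"]
color_mode = mode(colors)
color_top = value_frequency(colors, top=2)
color_low = value_low_frequency(colors, bottom=2)

# Numerical data: values are grouped into bins of width 10 before counting.
ages = [21, 25, 29, 34, 38, 41]
age_bins = [to_bin(age, 10) for age in ages]
age_bin_mode = mode(age_bins)
age_bin_frequency = value_frequency(age_bins)

# Date data: values are aggregated by month before counting.
orders = [date(2019, 1, 5), date(2019, 1, 20), date(2019, 2, 3)]
month_frequency = value_frequency([to_month(day) for day in orders])
```

In this sketch, the aggregated variants differ from the plain ones only in the key that is counted: each date or number is first mapped to its month or bin, and the frequency table is then built over those keys.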