Setting system or user-defined indicators - Cloud - 8.0

Talend Studio User Guide

Last publication date: 2024-02-29
Available in:
• Big Data Platform
• Cloud API Services Platform
• Cloud Big Data Platform
• Cloud Data Fabric
• Cloud Data Management Platform
• Data Fabric
• Data Management Platform
• Data Services Platform
• MDM Platform
• Real-Time Big Data Platform

Before you begin

A column analysis is open in the analysis editor in the Profiling perspective of Talend Studio.

Procedure

  1. From the Data preview section in the analysis editor, click Select indicators to open the Indicator Selection dialog box.
  2. From the Indicator Selection dialog box:
    Note:

    Pattern Frequency Statistics indicators serve no purpose on a Date column in a database when the analysis runs with the SQL engine. These indicators return no data quality issues because all dates are displayed in a single format.

    If you attach the Date Pattern Frequency indicator to a date column in your analysis, you can generate a date regular expression from the analysis results.

  3. Click OK.
    The selected indicators are attached to the analyzed columns in the Analyzed Columns section.
    The analysis in this example computes the following:
    • Simple statistics on all columns,
    • The characteristics of textual fields, using text statistics indicators, and the counts of the most frequent distinct values,
    • Frequent and rare patterns in the email column, using pattern frequency statistics indicators, so that you can identify quality issues more easily,
    • The range, the interquartile range, the mean and the median of the numeric data in the total_sales column, using summary statistics indicators,
    • The frequency of the digits 1 through 9 in the sales figures, using fraud detection indicators, to detect fraud.
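
To give a rough idea of what these indicators measure, the following is a minimal Python sketch, not Talend Studio's implementation. The function names are hypothetical, the pattern notation (A for uppercase letters, a for lowercase letters, 9 for digits) is an assumption, and the digit count assumes that the fraud detection indicator looks at the leading digit of each value, as in a Benford-style check. Quartiles are computed with the Python standard library's default method, so the interquartile range may differ slightly from what the Studio reports.

```python
import statistics
from collections import Counter


def pattern_of(value: str) -> str:
    """Map a string to a character pattern (assumed notation: A, a, 9)."""
    out = []
    for ch in value:
        if ch.isdigit():
            out.append("9")
        elif ch.isalpha():
            out.append("A" if ch.isupper() else "a")
        else:
            out.append(ch)  # keep punctuation such as '@' and '.' unchanged
    return "".join(out)


def pattern_frequency(values):
    """Count how often each character pattern occurs in a column."""
    return Counter(pattern_of(v) for v in values)


def summary_statistics(numbers):
    """Return the range, interquartile range, mean and median of a column."""
    q1, _q2, q3 = statistics.quantiles(numbers, n=4)
    return {
        "range": max(numbers) - min(numbers),
        "iqr": q3 - q1,
        "mean": statistics.mean(numbers),
        "median": statistics.median(numbers),
    }


def leading_digit_frequency(numbers):
    """Count the leading non-zero digit (1 through 9) of each value."""
    counts = Counter()
    for n in numbers:
        digits = str(abs(n)).lstrip("0.")  # drop leading zeros and the point
        if digits and digits[0].isdigit():
            counts[int(digits[0])] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample data standing in for the email and total_sales columns.
    emails = ["john.doe@example.com", "JANE@example.com", "bob.ray@example.com"]
    total_sales = [120, 19, 345, 150, 1020]
    print(pattern_frequency(emails).most_common())
    print(summary_statistics(total_sales))
    print(sorted(leading_digit_frequency(total_sales).items()))
```

In such a sketch, a pattern that occurs only once or twice in the email column points to records worth inspecting, and a leading-digit distribution that is far from the expected one is a reason to review the sales figures.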