Available in:
Big Data Platform
Cloud API Services Platform
Cloud Big Data Platform
Cloud Data Fabric
Cloud Data Management Platform
Data Fabric
Data Management Platform
Data Services Platform
MDM Platform
Real-Time Big Data Platform
Before you begin
A column analysis is open in the analysis editor in the Profiling perspective of Talend Studio.
Procedure
- From the Data preview view in the analysis editor, click Select indicators to open the Indicator Selection dialog box.
- In the Indicator Selection dialog box, select the indicators to compute for each column.
Note:
Pattern Frequency Statistics indicators are not useful on a database column of Date type when the analysis runs with the SQL engine. All dates are displayed in one single format, so these indicators return no data quality issues.
If you attach the Date Pattern Frequency indicator to a date column in your analysis, you can generate a date regular expression from the analysis results.
- Click OK.
The selected indicators are attached to the analyzed columns in the Analyzed Columns view.
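The note about date columns in the procedure above can be illustrated with a minimal Python sketch of a pattern frequency count. It is not Talend Studio code: the function names `to_pattern`, `pattern_frequency` and `pattern_to_regex` are hypothetical, and the symbol mapping (digit to 9, lowercase letter to a, uppercase letter to A) is an assumption about how such patterns are typically built. The sketch shows why dates rendered in one single format collapse into a single pattern, while dates stored as free text reveal several.

```python
import re
from collections import Counter


def to_pattern(value: str) -> str:
    """Map each character to a pattern symbol (assumed mapping, for illustration)."""
    symbols = []
    for ch in value:
        if ch.isdigit():
            symbols.append("9")
        elif ch.isalpha():
            symbols.append("A" if ch.isupper() else "a")
        else:
            symbols.append(ch)  # keep separators such as "-", "/" or spaces
    return "".join(symbols)


def pattern_frequency(values):
    """Count how often each pattern occurs in a list of strings."""
    return Counter(to_pattern(v) for v in values)


def pattern_to_regex(pattern: str) -> str:
    """Turn a pattern into a regular expression string (hypothetical helper)."""
    classes = {"9": r"\d", "A": "[A-Z]", "a": "[a-z]"}
    return "".join(classes.get(ch, re.escape(ch)) for ch in pattern)


# Dates stored as free text: several patterns appear, so rare ones stand out.
as_text = ["2023-01-15", "15/01/2023", "2023-02-01", "Jan 5, 2023"]
text_patterns = pattern_frequency(as_text)

# Dates rendered in one single format: only one pattern is left to report.
as_dates = ["2023-01-15", "2023-01-15", "2023-02-01", "2023-01-05"]
date_patterns = pattern_frequency(as_dates)

# A regular expression derived from the dominant pattern.
date_regex = pattern_to_regex("9999-99-99")
```

With the sample values above, `text_patterns` contains three distinct patterns and `date_patterns` only one, which is why the indicator reports nothing useful in the second case.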
The analysis in this example computes the following:
- Simple statistics on all columns,
- The characteristics of textual fields, using text statistics indicators, and
the number of most frequent values for each distinct record in the
indicators,
- Frequent and rare patterns in the email column, using pattern
frequency statistics indicators, so that you can identify quality issues
more easily,
- The range, the interquartile range, and the mean and median values of the
numeric data in the total_sales column, using summary
statistics indicators,
- The frequency of the leading digits 1 through 9 in the sales figures to detect fraud,
using fraud detection indicators.
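As a rough illustration of the last two items, the following Python sketch computes the same kind of summary statistics and a leading-digit frequency on a small list of numbers. It is not how Talend Studio implements these indicators: the function names are hypothetical, the quartile method is one of several possible choices, and the comparison with Benford's law assumes that the fraud detection indicator follows the usual approach of counting the first significant digit of each value.

```python
import math
import statistics
from collections import Counter


def summary_statistics(values):
    """Range, interquartile range, mean and median of a list of numbers."""
    # "inclusive" is one of the quartile methods offered by the statistics module;
    # other tools may use a different definition and give a slightly different IQR.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "range": max(values) - min(values),
        "iqr": q3 - q1,
        "mean": statistics.mean(values),
        "median": statistics.median(values),
    }


def leading_digit_frequency(values):
    """Relative frequency of the first significant digit (1 through 9)."""
    counts = Counter()
    for v in values:
        digits = str(abs(v)).lstrip("0.")  # drop leading zeros and the decimal point
        if digits and digits[0].isdigit():
            counts[int(digits[0])] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {d: counts[d] / total for d in range(1, 10)}


def benford_expected(digit: int) -> float:
    """Expected frequency of a leading digit under Benford's law."""
    return math.log10(1 + 1 / digit)


# Hypothetical sample values standing in for the total_sales column.
sales = [120, 1500, 23, 0.0042, 0]
observed = leading_digit_frequency(sales)
stats = summary_statistics([1, 2, 3, 4, 5])
```

A large gap between `observed[d]` and `benford_expected(d)` across the digits is the kind of signal such an indicator is meant to surface. On a sample as small as the one above, the comparison is not meaningful.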