Setting system indicators - 7.3

Talend Real-Time Big Data Platform Getting Started Guide

Version: 7.3
Language: English
Operating system: Real-Time Big Data Platform
Product: Talend Real-Time Big Data Platform
Module: Talend Administration Center, Talend DQ Portal, Talend Installer, Talend Runtime, Talend Studio
Content: Data Quality and Preparation > Cleansing data; Data Quality and Preparation > Profiling data; Design and Development; Installation and Upgrade
Last publication date: 2023-07-24

This column analysis uses out-of-box indicators to provide simple statistics such as row, blank and duplicate counts on the Email and Phone columns.

Before you begin

  • You have opened the Profiling perspective in the Studio.

  • You have created a column analysis and defined the connection to the database.

Procedure

  1. In the Data Preview section in the analysis editor, click Select indicators to open the Indicator Selection dialog box.
  2. Expand Simple Statistics and select Row Count, Blank Count and Duplicate Count. Click OK to close the dialog box.

    The row, blank and duplicate counts on the Email and Phone columns show how consistent the data is.

    The selected indicators are added to the columns in the Analyzed Columns section.

  3. Click the icon next to each of the Duplicate Count and Blank Count indicators and set the Upper threshold field to 0.

    With these thresholds defined on the Email and Phone columns, the counts of duplicate and blank values are written in red in the analysis results, which makes them easy to spot.
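
To make the three indicators and the upper threshold more concrete, here is a minimal Python sketch of the kind of check the procedure above sets up. It is an illustration only, not how the Studio computes its results. The function names (simple_statistics, exceeds_upper_threshold) and the sample data are hypothetical, and the sketch assumes that a blank is a non-null string containing only white space and that the duplicate count is the number of distinct values occurring more than once.

```python
from collections import Counter


def simple_statistics(values):
    """Return row, blank and duplicate counts for one column of values."""
    row_count = len(values)
    # Assumption: a value is blank when it is a string that is empty
    # or contains only white space.
    blank_count = sum(1 for v in values if isinstance(v, str) and v.strip() == "")
    # Assumption: the duplicate count is the number of distinct values
    # that occur more than once in the column.
    occurrences = Counter(values)
    duplicate_count = sum(1 for n in occurrences.values() if n > 1)
    return {
        "Row Count": row_count,
        "Blank Count": blank_count,
        "Duplicate Count": duplicate_count,
    }


def exceeds_upper_threshold(stats, thresholds):
    """Return the indicator names whose value is above its upper threshold."""
    return [name for name, upper in thresholds.items() if stats[name] > upper]


# Hypothetical sample standing in for the Email column.
emails = ["a@example.com", "", "b@example.com", "a@example.com", "  "]

stats = simple_statistics(emails)
# An upper threshold of 0 flags any blank or duplicate value at all.
flagged = exceeds_upper_threshold(stats, {"Blank Count": 0, "Duplicate Count": 0})

print(stats)
print(flagged)
```

With an upper threshold of 0, any blank or duplicated value in the sample is enough to flag the corresponding indicator, which mirrors the counts that the analysis results write in red.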