Identifying anomalies in data - 6.5

Talend Open Studio for Data Quality Getting Started Guide

author
Talend Documentation Team

This use case explains how to use the Profiling perspective of the studio to analyze customer email addresses and phone numbers. It applies out-of-the-box indicators and patterns to these columns and shows which values match the patterns and which do not.

You can then use the Data Explorer perspective to browse the non-matching data.

Profiling the customer data involves the following steps:

Procedure

  1. Create a column analysis on customer email addresses and phone numbers. For further information, see Defining a column analysis.
  2. From the analysis editor, connect to the database that holds the customer data. For further information, see Creating the database connection.
  3. Add indicators to provide simple statistics on the data, such as row, blank, and duplicate counts. For further information, see Setting system indicators.
  4. Add standard patterns against which to match email addresses and phone numbers. For further information, see Setting patterns.
  5. Execute the analysis to show results in tables and charts. For further information, see Showing analysis results.
  6. Access a view of the analyzed data to see invalid records. For further information, see Browsing non-match data.
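
To make the idea behind the steps above concrete, the following Python sketch computes the same kind of statistics outside the studio: row, blank, and duplicate counts, plus the values that match or fail a pattern. It is an illustration only, not how the studio implements its indicators. The function name `profile_column`, the two regular expressions, and the sample values are assumptions made for this example; they are not the patterns shipped with the studio. The sketch also assumes that the duplicate count is the number of distinct values that occur more than once.

```python
import re
from collections import Counter

# Illustrative patterns only (assumptions); not the studio's built-in patterns.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+(\.[\w-]+)+")
PHONE_PATTERN = re.compile(r"\+?[0-9][0-9 .-]{6,14}")


def profile_column(values, pattern):
    """Return simple statistics and pattern-match results for one column."""
    row_count = len(values)
    # Treat None, empty strings, and whitespace-only strings as blank.
    non_blank = [v for v in values if v is not None and v.strip()]
    blank_count = row_count - len(non_blank)
    # Assumption: count distinct values that occur more than once.
    occurrences = Counter(non_blank)
    duplicate_count = sum(1 for n in occurrences.values() if n > 1)
    # Split non-blank values by whether the whole value matches the pattern.
    matching = [v for v in non_blank if pattern.fullmatch(v)]
    non_matching = [v for v in non_blank if not pattern.fullmatch(v)]
    return {
        "row_count": row_count,
        "blank_count": blank_count,
        "duplicate_count": duplicate_count,
        "matching": matching,
        "non_matching": non_matching,
    }


# Hypothetical sample data standing in for the customer columns.
emails = ["ann@example.com", "bob@example", "", "ann@example.com", None]
phones = ["+1 555-0100", "12ab", "555 0100 22"]

email_profile = profile_column(emails, EMAIL_PATTERN)
phone_profile = profile_column(phones, PHONE_PATTERN)
```

The `non_matching` list plays the role of the invalid records browsed in the last step: these are the values to inspect and correct.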