Setting patterns - 7.3

Talend Real-Time Big Data Platform Getting Started Guide

Version: 7.3
Language: English
Operating system: Real-Time Big Data Platform
Product: Talend Real-Time Big Data Platform
Module: Talend Administration Center, Talend DQ Portal, Talend Installer, Talend Runtime, Talend Studio
Content: Data Quality and Preparation > Cleansing data; Data Quality and Preparation > Profiling data; Design and Development; Installation and Upgrade
Last publication date: 2023-07-24

This column analysis uses predefined patterns to match the content of the Email and Phone columns against standard email and US phone patterns respectively. It characterizes the content, structure and quality of the email addresses and phone numbers, and gives the percentage of the data that matches the standard formats and the percentage that does not.
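
As a rough illustration of what a pattern-matching indicator computes, the following Python sketch counts how many values in a column fully match a regular expression and reports the matching and non-matching percentages. The two regular expressions are simplified stand-ins written for this example; they are not the predefined Email Address and US phone numbers patterns shipped with the Studio. The function name and the sample data are hypothetical, as is the choice to count missing values as non-matching.

```python
import re

# Simplified, illustrative patterns (assumptions for this sketch only;
# NOT the predefined patterns available in the Pattern Selector).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
US_PHONE_RE = re.compile(r"(\(\d{3}\) ?|\d{3}[-. ])\d{3}[-. ]\d{4}")


def match_percentages(values, pattern):
    """Return (matching %, non-matching %) for a non-empty list of values.

    Assumption: a missing value (None) is counted as non-matching.
    """
    if not values:
        raise ValueError("values must not be empty")
    matched = sum(
        1 for v in values if v is not None and pattern.fullmatch(v)
    )
    matching_pct = 100.0 * matched / len(values)
    return matching_pct, 100.0 - matching_pct


# Hypothetical sample column contents.
emails = ["ann@example.com", "bob@example", "carol@example.org", "not an email"]
phones = ["(555) 123-4567", "555-123-4567", "5551234", "555.123.4567"]

email_result = match_percentages(emails, EMAIL_RE)
phone_result = match_percentages(phones, US_PHONE_RE)
print("Email:", email_result)
print("Phone:", phone_result)
```

In the Studio, the equivalent counts are computed by the analysis itself when it runs against the database; the sketch only shows the arithmetic behind the two percentages.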

Before you begin

  • You have opened the Profiling perspective in the Studio.

  • You have created a column analysis and defined the connection to the database.

Procedure

  1. In the Data Preview section in the analysis editor, click the icon next to the Email column to open the Pattern Selector dialog box.
  2. Expand Regex > internet, select the Email Address check box and click OK to close the dialog box.

    The pattern is added to the column in the Analyzed Columns section.

  3. Click the icon next to the Phone column to open the Pattern Selector dialog box.
  4. Expand Regex > phone, select the US phone numbers check box and click OK to close the dialog box.

    The pattern is added to the column in the Analyzed Columns section.

  5. Click the icon next to the Email Address and US phone numbers patterns and set 98.0 in the Lower threshold (%) fields.

    If the percentage of records that match a pattern is lower than 98%, the value is displayed in red in the analysis results.
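
The effect of the lower threshold can be sketched as a simple comparison. The Python function below is a hypothetical helper, not part of any Talend API: it assumes the matching percentage is computed as matching rows divided by total rows, and that a result is flagged only when it is strictly lower than the threshold, as described above.

```python
def below_lower_threshold(match_count, row_count, lower_threshold_pct=98.0):
    """Return True when the matching percentage falls below the threshold.

    Hypothetical helper for illustration: a True result corresponds to a
    value that would be highlighted in red in the analysis results.
    """
    if row_count <= 0:
        raise ValueError("row_count must be positive")
    match_pct = 100.0 * match_count / row_count
    # Strict comparison: exactly 98% is not flagged (assumption based on
    # the "lower than 98%" wording).
    return match_pct < lower_threshold_pct


print(below_lower_threshold(97, 100))   # 97% of rows match
print(below_lower_threshold(98, 100))   # 98% of rows match
```

With the threshold set to 98.0, a column where 97 of 100 rows match would be flagged, while a column where 98 of 100 rows match would not.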