Setting patterns

Talend Big Data Platform Getting Started Guide

Author: Talend Documentation Team
Version: 6.4
Product: Talend Big Data Platform
Task: Design and Development; Installation and Upgrade; Data Quality and Preparation > Cleansing data; Data Quality and Preparation > Profiling data

This column analysis uses predefined patterns to match the content of the Email and Phone columns against standard email and US phone patterns respectively. It profiles the content, structure and quality of the email addresses and phone numbers, and gives the percentage of the data that matches the standard formats and the percentage that does not.

Before you begin

Procedure

  1. In the Data Preview section in the analysis editor, click the icon next to the Email column to open the [Pattern Selector] dialog box.
  2. Expand Regex > internet, select the Email Address check box and click OK to close the dialog box.

    The pattern is added to the column in the Analyzed Columns section.

  3. Click the icon next to the Phone column to open the [Pattern Selector] dialog box.
  4. Expand Regex > phone, select the US phone numbers check box and click OK to close the dialog box.

    The pattern is added to the column in the Analyzed Columns section.

  5. Click the icon next to the Email Address and US phone numbers patterns and set 98.0 in the Lower threshold (%) fields.

    If the percentage of records that match a pattern is lower than 98%, the result is written in red in the analysis results.
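
The logic behind the pattern check and the lower threshold can be sketched outside the Studio as follows. This is a minimal illustration in Python, not Talend's implementation: the two regular expressions are simplified assumptions and may differ from the predefined Email Address and US phone numbers patterns, and the names match_percentage and below_threshold are hypothetical helpers introduced only for this example.

```python
import re

# Illustrative patterns only (assumption): the predefined Talend
# "Email Address" and "US phone numbers" regexes may differ from these.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
US_PHONE_RE = re.compile(r"(\+?1[ .-]?)?(\(\d{3}\)|\d{3})[ .-]?\d{3}[ .-]?\d{4}")


def match_percentage(values, pattern):
    """Return the percentage of values that fully match the pattern."""
    if not values:
        return 0.0
    matched = sum(1 for v in values if v is not None and pattern.fullmatch(v))
    return 100.0 * matched / len(values)


def below_threshold(percentage, lower_threshold=98.0):
    """Return True when the match percentage is under the lower threshold."""
    return percentage < lower_threshold


# Sample column contents (made up for the example).
emails = ["ann@example.com", "bob@example.org", "not-an-email", "carol@example.net"]
phones = ["(555) 123-4567", "555-123-4567", "5551234567", "+1 555 123 4567"]

email_pct = match_percentage(emails, EMAIL_RE)
phone_pct = match_percentage(phones, US_PHONE_RE)

for name, pct in (("Email", email_pct), ("Phone", phone_pct)):
    flag = "below threshold" if below_threshold(pct) else "ok"
    print(f"{name}: {pct:.1f}% matching ({flag})")
```

With this sample data, one of the four email values does not match, so the Email column falls under the 98% threshold and would be the one highlighted, while every Phone value matches.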