Setting patterns - 7.0

Data Quality Job and Analysis Examples

Author: Talend Documentation Team
Version: 7.0
Products: Talend Big Data Platform, Talend Data Fabric, Talend Data Management Platform, Talend Data Services Platform, Talend MDM Platform, Talend Open Studio for Data Quality, Talend Open Studio for MDM, Talend Real-Time Big Data Platform
Task: Data Quality and Preparation
Platform: Talend Studio

Next, match the content of the email column against a standard email format and the content of the postal column against a standard US zip code format.

The analysis reports the percentage of records that match each standard format and the percentage that do not, which describes the content, structure and quality of the email addresses and zip codes.

Procedure

  1. In the Analyzed Columns view, click the icon next to email.
  2. In the Pattern Selector dialog box, expand Regex and browse to Email Address in the internet folder, and then click OK.
  3. Click the option icon next to the Email Address indicator and set 98.0 in the Lower threshold (%) field.
    If fewer than 98% of the records match the pattern, the value is shown in red in the analysis results.
  4. Repeat these steps to add the US Zipcode Validation pattern from the address folder to the postal column.

    For further information on pattern types and their usage when analyzing data, see the Talend Studio User Guide.
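
To illustrate what the pattern indicators and the lower threshold compute, here is a minimal Python sketch. It is not how Talend Studio implements the analysis. The two regular expressions are simplified assumptions, not the Email Address and US Zipcode Validation patterns shipped with the Studio, and the names match_percentage and below_threshold, as well as the sample data, are hypothetical.

```python
import re

# Simplified, illustrative patterns. These are assumptions and NOT the
# expressions that ship with Talend Studio.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
US_ZIP_RE = re.compile(r"\d{5}(-\d{4})?")


def match_percentage(values, pattern):
    """Return the percentage of values that fully match the pattern."""
    if not values:
        return 0.0
    matched = sum(1 for v in values if v is not None and pattern.fullmatch(v))
    return 100.0 * matched / len(values)


def below_threshold(percentage, lower_threshold=98.0):
    """Return True when the match percentage is under the lower threshold."""
    return percentage < lower_threshold


# Hypothetical sample columns.
emails = ["ann@example.com", "bob@example", "carol@example.org", "dave@example.net"]
postal = ["90210", "12345-6789", "1234", "ABCDE"]

email_pct = match_percentage(emails, EMAIL_RE)
zip_pct = match_percentage(postal, US_ZIP_RE)

for name, pct in (("email", email_pct), ("postal", zip_pct)):
    flag = "below threshold" if below_threshold(pct) else "ok"
    print(f"{name}: {pct:.1f}% matching ({flag})")
```

With this sample data, three of the four email values and two of the four postal values match, so both columns fall under the 98% lower threshold and would be the kind of result that the analysis highlights in red.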