Removing non-matching values - 7.3

Data Quality Job and Analysis Examples

Version: 7.3
Language: English
Product: Talend Big Data Platform, Talend Data Fabric, Talend Data Management Platform, Talend Data Services Platform, Talend MDM Platform, Talend Open Studio for Data Quality, Talend Real-Time Big Data Platform
Module: Talend Studio
Content: Data Quality and Preparation
Last publication date: 2023-03-01

The email pattern applied to the email column showed that some records do not follow the standard email format. You can generate a ready-to-use Job to retrieve the non-matching rows from the column.

Procedure

  1. In the Profiling perspective, click the Analysis Results tab at the bottom of the editor.
  2. In the Pattern Matching results of the email column, right-click the chart bar or the numerical results and select Generate Job.

    The Integration perspective opens showing the generated Job.

    This Job uses the Extract, Transform, Load (ETL) process to write the email rows to two separate output files: one for the valid rows that match the pattern and one for the invalid rows that do not.

  3. Save the Job and press F6 to execute it.

Results

The valid and invalid rows of the email column are written to the defined output files.
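
To make the split concrete, here is a minimal Python sketch of the same idea outside Talend Studio. It is not the code that the generated Job contains. The regular expression is a simplified assumption rather than the email pattern shipped with the Studio, and the function names, column name, and file names are hypothetical.

```python
import csv
import re

# Assumption: a simplified email pattern, not the one used by Talend Studio.
EMAIL_PATTERN = re.compile(r"^[\w.+-]+@[\w-]+(\.[\w-]+)+$")


def split_by_pattern(rows, column="email", pattern=EMAIL_PATTERN):
    """Return (matching_rows, non_matching_rows) for the given column."""
    valid, invalid = [], []
    for row in rows:
        value = row.get(column) or ""  # treat a missing or null value as non-matching
        if pattern.fullmatch(value):
            valid.append(row)
        else:
            invalid.append(row)
    return valid, invalid


def write_rows(path, rows, fieldnames):
    """Write the rows to a CSV file with a header line."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)


if __name__ == "__main__":
    # Hypothetical sample rows standing in for the profiled column.
    sample = [
        {"id": "1", "email": "ada@example.com"},
        {"id": "2", "email": "not-an-email"},
        {"id": "3", "email": "grace@example"},
    ]
    valid, invalid = split_by_pattern(sample)
    write_rows("valid_emails.csv", valid, ["id", "email"])
    write_rows("invalid_emails.csv", invalid, ["id", "email"])
```

With this sample, only the first row matches the assumed pattern, so the other two rows go to the file for invalid rows.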

You can replace the output files with other Talend components, for example to write the valid and invalid email rows to databases instead.
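
As a rough illustration of a database target, the following Python sketch routes rows into two SQLite tables instead of files. It only mirrors the idea and says nothing about how a Talend database component behaves. The table names, the function name, and the simplified regular expression are assumptions made for this example.

```python
import re
import sqlite3

# Assumption: a simplified email pattern, not the one used by Talend Studio.
EMAIL_PATTERN = re.compile(r"^[\w.+-]+@[\w-]+(\.[\w-]+)+$")


def load_into_database(conn, rows):
    """Insert each (id, email) pair into valid_emails or invalid_emails."""
    conn.execute("CREATE TABLE IF NOT EXISTS valid_emails (id TEXT, email TEXT)")
    conn.execute("CREATE TABLE IF NOT EXISTS invalid_emails (id TEXT, email TEXT)")
    for row_id, email in rows:
        # The table name comes from a fixed pair of literals, never from input data.
        if EMAIL_PATTERN.fullmatch(email or ""):
            table = "valid_emails"
        else:
            table = "invalid_emails"
        conn.execute(f"INSERT INTO {table} (id, email) VALUES (?, ?)", (row_id, email))
    conn.commit()


if __name__ == "__main__":
    # An in-memory database keeps the sketch self-contained.
    connection = sqlite3.connect(":memory:")
    load_into_database(
        connection,
        [("1", "ada@example.com"), ("2", "not-an-email")],
    )
    print(connection.execute("SELECT id, email FROM invalid_emails").fetchall())
```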

For more information on using the Profiling perspective to identify and remove corrupt, incomplete, or inaccurate data, see Data cleansing in the Talend Studio User Guide.