Executing the analysis and displaying the profiling results - 7.0

Data Quality Job and Analysis Examples

author
Talend Documentation Team
EnrichVersion
7.0
EnrichProdName
Talend Big Data Platform
Talend Data Fabric
Talend Data Management Platform
Talend Data Services Platform
Talend MDM Platform
Talend Open Studio for Data Quality
Talend Open Studio for MDM
Talend Real-Time Big Data Platform
task
Data Quality and Preparation
EnrichPlatform
Talend Studio

Procedure

  1. Save the column analysis in the analysis editor and then press F6 to execute it.
    A group of graphics is displayed in the Graphics panel to the right of the analysis editor, showing the results of the column analysis, including those for pattern matching.
  2. Click the Analysis Results tab at the bottom of the analysis editor to access a more detailed view of the results.
    These results show the generated graphics for the analyzed columns, accompanied by tables that detail the statistical and pattern matching results.

Results

The pattern matching results show that about 10% of the email records do not match the standard email pattern. The simple statistic results show that about 8% of the email records are blank and that about 5% are duplicates. The pattern frequency results give the number of records for each of the most frequent patterns. Together, these results show that the data is not consistent and that you need to correct and cleanse the email data before starting your campaign.
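To make these indicators more concrete, the following is a minimal sketch of how blank, duplicate, and pattern matching counts could be computed outside the Studio. It is not the Talend implementation: the `profile_emails` function, the simplified email regular expression, and the sample values are all assumptions made for illustration. Here a duplicate is counted as a distinct value that occurs more than once, and a blank record is treated as not matching the pattern.

```python
import re
from collections import Counter

# Simplified email pattern, assumed for illustration only.
# It is not the pattern definition shipped with the Studio.
EMAIL_RE = re.compile(r"^[\w.+-]+@[\w-]+(\.[\w-]+)+$")


def profile_emails(values):
    """Return simple statistics and pattern matching counts for a column."""
    total = len(values)

    # A record is blank when it is None, empty, or whitespace only.
    blank = sum(1 for v in values if v is None or v.strip() == "")

    # Count occurrences of each non-blank value (case-insensitive).
    counts = Counter(v.strip().lower() for v in values if v and v.strip())

    # Assumption: a duplicate is a distinct value that occurs more than once.
    duplicate = sum(1 for c in counts.values() if c > 1)

    # Blank records cannot match the pattern, so they count as non-matching.
    matching = sum(1 for v in values if v and EMAIL_RE.match(v.strip()))

    return {
        "total": total,
        "blank": blank,
        "duplicate": duplicate,
        "matching": matching,
        "not_matching": total - matching,
    }


# Hypothetical sample data.
emails = [
    "a@example.com",
    "b@example.com",
    "a@example.com",
    "",
    "not-an-email",
    None,
    "c@example.org",
    "d@example",
]
stats = profile_emails(emails)
for name, count in stats.items():
    print(f"{name}: {count} ({count / stats['total']:.0%})")
```

Dividing each count by the total row count gives percentages comparable to those reported in the Analysis Results view.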

The results for the postal column look like the following:

The result sets for the postal column give the count of the records that match and the count of those that do not match a standard US zip code format. The result sets also give the blank and duplicate counts and the number of records for each of the most frequent patterns. These results show that the data is not very consistent.
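As a rough illustration of what a pattern frequency table contains, the sketch below maps each postal value to a generic pattern and counts how often each pattern occurs. The character mapping (digits to `9`, lowercase letters to `a`, uppercase letters to `A`), the `to_pattern` and `pattern_frequency` helpers, the zip code regular expression, and the sample values are assumptions for this example, not the Studio's own code.

```python
import re
from collections import Counter

# Assumed US zip code format: five digits, optionally followed by "-" and four digits.
US_ZIP_RE = re.compile(r"^\d{5}(-\d{4})?$")


def to_pattern(value):
    """Replace digits with 9 and letters with a or A; keep other characters."""
    out = []
    for ch in value:
        if ch.isdigit():
            out.append("9")
        elif ch.isalpha():
            out.append("A" if ch.isupper() else "a")
        else:
            out.append(ch)
    return "".join(out)


def pattern_frequency(values):
    """Count the records that share each distinct pattern."""
    return Counter(to_pattern(v) for v in values)


# Hypothetical sample data.
postal = ["12345", "12345-6789", "1234", "AB123", "90210"]

freq = pattern_frequency(postal)
matching = sum(1 for v in postal if US_ZIP_RE.match(v))

# List the patterns from most to least frequent.
for pattern, count in freq.most_common():
    print(pattern, count)
print("matching:", matching, "not matching:", len(postal) - matching)
```

Patterns other than `99999` and `99999-9999` in such a table point directly at the records that need correction.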

As a result, some percentage of the customers cannot be contacted by either email or US mail. These results show clearly that your data is not very consistent and that it needs to be corrected.