Exploring the analysis results

Talend Data Management Platform Studio User Guide

Version: 6.2
Product: Talend Data Management Platform
Task: Data Quality and Preparation; Design and Development
Platform: Talend Studio

Prerequisite(s): A numerical correlation analysis is defined and executed in the Profiling perspective of the studio.

In the Analysis Results view of the analysis editor:

  • Click Graphics, Simple Statistics or Data to show the generated graphic, the number of analyzed records or the actual analyzed data, respectively.

In the Graphics view, the data plotted in the bubble chart is color-coded, and the legend shows which color corresponds to which data.

The closer a bubble is to the left axis, the less confident you can be in the average of the numeric column. For the selected bubble in the above example, the company name is missing and there are only two data records, so the bubble sits near the left axis. An age average computed from only two records is not reliable. When you look for data quality issues, such bubbles could indicate problematic values.

Bubbles near the top or the bottom of the chart may also suggest data quality issues: in the above example, an age average that is too high or too low.
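As a rough illustration of the reasoning above, the following Python sketch groups records by a nominal column and computes, for each group, the record count and the average of the numeric column. It assumes that the horizontal position of a bubble reflects the record count and the vertical position reflects the average, which is inferred from the description above rather than taken from the studio itself. The sample rows and the `bubble_points` function are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample rows: (company_name, age). None stands for a missing name.
rows = [
    ("Acme", 34), ("Acme", 41), ("Acme", 38), ("Acme", 45),
    ("Globex", 29), ("Globex", 52), ("Globex", 47),
    (None, 19), (None, 91),
]

def bubble_points(rows):
    """Group ages by company and return {company: (record_count, average_age)}."""
    groups = defaultdict(list)
    for company, age in rows:
        groups[company].append(age)
    return {company: (len(ages), mean(ages)) for company, ages in groups.items()}

points = bubble_points(rows)

# List the groups with the fewest records first: under the assumption above,
# these are the bubbles closest to the left axis, whose averages deserve the
# least confidence.
for company, (count, avg) in sorted(points.items(), key=lambda kv: kv[1][0]):
    print(f"{company!s:<8} records={count} average_age={avg:.1f}")
```

In this made-up data, the group with the missing company name rests on only two records, so its average says little about the typical age in that group.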

From the generated graphic, you can:

  • clear the check box of the value(s) you want to hide in the bubble chart,

  • place the pointer on any of the bubbles to see the correlated data values at that position,

  • right-click any of the bubbles and select:

    Option                 To...

    Show in full screen    open the generated graphic in full screen

    View rows              access a list of all analyzed rows in the selected column

    The figure below illustrates an example of the SQL editor listing the correlated data values at the selected position.

    From the SQL editor, you can save the executed query by clicking the save icon on the editor toolbar. The saved query is then listed under Libraries > Source Files in the DQ Repository tree view. For more information, see Saving the queries executed on indicators.
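The query that the studio generates for View rows is not reproduced here. As a hedged sketch of the idea only, the following Python snippet uses an in-memory SQLite database to retrieve the rows behind a bubble whose company name is missing. The `customers` table, its columns and its data are hypothetical, and the SQL is an illustration, not the statement the studio actually runs.

```python
import sqlite3

# Hypothetical in-memory table standing in for the analyzed data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (company_name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [("Acme", 34), ("Acme", 41), (None, 19), (None, 91), ("Globex", 29)],
)

# Rows behind the selected bubble: here, the records with no company name.
selected = conn.execute(
    "SELECT company_name, age FROM customers "
    "WHERE company_name IS NULL ORDER BY age"
).fetchall()
print(selected)
conn.close()
```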

The Simple Statistics view shows how many of the analyzed records fall into each category: the number of rows, the number of distinct values, the number of unique values and the number of duplicates.
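The four counts can be sketched in Python as follows. The definitions used here are assumptions stated for illustration: a distinct value is any value that occurs at least once, a unique value occurs exactly once, and a duplicate is a value that occurs more than once. The `simple_statistics` function and the sample values are hypothetical.

```python
from collections import Counter

def simple_statistics(values):
    """Return row, distinct, unique and duplicate counts for a list of values.

    Assumed definitions: unique = values seen exactly once,
    duplicate = values seen more than once, distinct = unique + duplicate.
    """
    counts = Counter(values)
    return {
        "row_count": len(values),
        "distinct_count": len(counts),
        "unique_count": sum(1 for n in counts.values() if n == 1),
        "duplicate_count": sum(1 for n in counts.values() if n > 1),
    }

stats = simple_statistics(
    ["Acme", "Acme", "Globex", "Initech", "Globex", "Umbrella"]
)
print(stats)
```

Under these assumed definitions, the unique count and the duplicate count always add up to the distinct count.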

The Data view displays the actual analyzed data.

You can sort the data listed in the result table by simply clicking any column header in the table.