Creating analyses from table or column names - 6.2

Talend Data Fabric Studio User Guide

In the studio, you can use shortcuts to create one or more column analyses. All you need to do is start from the table name or the column name under the relevant DB Connection folder in the DQ Repository tree view.

However, the options available for creating column analyses differ depending on whether you start from the table name or from the column name.

To create a column analysis directly from the relevant table name in the DB Connection, do the following:

  1. In the DQ Repository tree view, expand Metadata > DB Connections.

  2. Browse to the table that holds the column(s) you want to analyze and right-click it.

  3. From the contextual menu, select:

    Item

    To...

    Semantic-aware Analysis

    analyze the selected table based on information gathered in the semantic repository.

    For further information, see Semantic-aware Analysis.

    Match Analysis

    open the match analysis editor where you can define match rules and select the columns on which you want to use the match rules.

    For more information, see Analyzing duplicates.

    Table Analysis

    analyze the selected table using SQL business rules.

    Column Analysis

    analyze all the columns included in the selected table using the Simple Statistics indicators.

    For more information on the Simple Statistics indicators, see Simple statistics.

    Pattern Frequency Analysis

    analyze all the columns included in the selected table using the Pattern Frequency Statistics along with the Row Count and the Null Count indicators.

    For more information on the Pattern Frequency Statistics, see Pattern frequency statistics.

The above steps replace the procedures outlined in Defining the columns to be analyzed and setting indicators. You can then proceed with the steps outlined in Finalizing and executing the column analysis.
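
To give a rough idea of what the Pattern Frequency Analysis option reports, the following Python sketch derives a pattern for each value in a column and counts how often each pattern occurs, together with a row count and a null count. It is an illustration only, not the studio's implementation: the function names to_pattern and pattern_frequency, the sample data, and the character mapping (lowercase letters to "a", uppercase letters to "A", digits to "9", everything else kept as-is) are assumptions made for this example.

```python
from collections import Counter


def to_pattern(value):
    """Map a string to a pattern.

    Assumed convention for this sketch: digits become "9", uppercase
    letters become "A", other letters become "a", and any other
    character (spaces, punctuation) is kept unchanged.
    """
    chars = []
    for ch in value:
        if ch.isdigit():
            chars.append("9")
        elif ch.isalpha():
            chars.append("A" if ch.isupper() else "a")
        else:
            chars.append(ch)
    return "".join(chars)


def pattern_frequency(values):
    """Return the row count, null count and pattern frequencies of one column."""
    non_null = [v for v in values if v is not None]
    return {
        "row_count": len(values),
        "null_count": len(values) - len(non_null),
        "patterns": Counter(to_pattern(str(v)) for v in non_null),
    }


# Hypothetical column content, for example a postal code column.
postal_codes = ["75001", "AB12 3CD", None, "92150", "ab12 3cd"]
result = pattern_frequency(postal_codes)

print(result["row_count"], result["null_count"])
print(result["patterns"].most_common())
```

In this sample, the two five-digit values share one pattern, while the two remaining non-null values yield distinct patterns because the sketch distinguishes uppercase from lowercase letters.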

To create a column analysis directly from the column name in the DB Connection, do the following:

  1. In the DQ Repository tree view, expand Metadata > DB Connections.

  2. Browse to the column(s) you want to analyze and right-click it/them.

  3. From the contextual menu, select:

    Item

    To...

    Analyze

    create an analysis for the selected column. You must later set the indicators you want to use to analyze the column.

    For more information on setting indicators, see How to set indicators on columns. For more information on completing the column analysis, see Finalizing and executing the column analysis.

    Nominal Value Analysis

    create a column analysis on nominal data, preconfigured with indicators appropriate for nominal data, namely the Value Frequency, Simple Statistics and Text Statistics indicators.

    Simple Analysis

    analyze the selected column using the Simple Statistics indicators.

    For more information on the Simple Statistics indicators, see Simple statistics.

    Pattern Frequency Analysis

    analyze the selected column using the Pattern Frequency Statistics along with the Row Count and the Null Count indicators.

    For more information on the Pattern Frequency Statistics, see Pattern frequency statistics.

    Analyze Column Set

    perform an analysis on the content of a set of columns. This analysis focuses on a column set (full records) and not on separate columns, as is the case with the column analysis.

    For more information, see Creating a simple table analysis (Column Set Analysis).

    Analyze Correlation

    perform column correlation analyses between nominal and interval columns or nominal and date columns in database tables.

    For more information, see Numerical correlation analyses.

    Semantic-aware Analysis

    analyze the selected column(s) based on information gathered in the semantic repository.

    For further information, see Semantic-aware Analysis.

    Analyze matches

    open the match analysis editor where you can define match rules and select the columns on which you want to use the match rules.

    For more information, see Analyzing duplicates.

The above steps replace one or both of the procedures outlined in Defining the columns to be analyzed and setting indicators. You can then proceed with the same steps outlined in Finalizing and executing the column analysis.
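
The distinction drawn under Analyze Column Set, between examining separate columns and examining a column set (full records), can be illustrated with a short Python sketch. This is a conceptual example under stated assumptions, not the studio's implementation: the sample rows, the column names, and the helper function duplicated are hypothetical.

```python
from collections import Counter

# Hypothetical rows of a table with two columns: first_name, city.
rows = [
    ("Anna", "Paris"),
    ("Anna", "Lyon"),
    ("Marc", "Paris"),
    ("Anna", "Paris"),
]


def duplicated(values):
    """Return the values that occur more than once, with their counts."""
    return {value: count for value, count in Counter(values).items() if count > 1}


# Column-level view: each column is examined on its own.
first_names = [row[0] for row in rows]
cities = [row[1] for row in rows]
print(duplicated(first_names))
print(duplicated(cities))

# Column-set view: the combination of values (the full record) is the unit.
print(duplicated(rows))
```

Looked at column by column, "Anna" and "Paris" each repeat several times. Looked at as a column set, only the full record ("Anna", "Paris") is repeated, which is the kind of result a record-level analysis is meant to surface.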