Analyzing the quality of a field - Cloud

Talend Cloud Data Inventory User Guide

Version: Cloud
Language: English
Product: Talend Cloud
Module: Talend Data Inventory
Content: Administration and Monitoring > Managing connections; Data Governance; Data Quality and Preparation > Enriching data; Data Quality and Preparation > Identifying data; Data Quality and Preparation > Managing datasets
Last publication date: 2024-01-25

Before you begin

You applied at least one data quality rule to a field.

About this task

In this example, you are using the previously applied data quality rules.

Procedure

  1. Log in as a dataset manager or administrator with the Rules - View permission.
    To have the Rules - View permission, you must be assigned the Rule - Manager or Rule - Viewer role.
  2. Open a dataset in the Sample view.
  3. Select a field to which a rule is applied.
    Fields with a rule display the data quality rule icon in their header.
    Data quality rule icon in the field.
  4. In the right panel, you can see the invalid, not applicable, and valid values.
    Quality bar for a data quality rule in the Quality tab.
    Color code for the quality bar
    Color        Description
    Red          The values are invalid. They fulfill the condition but not the validation expression, or the rule cannot be executed on those values, for example when the rule must compare a string with a number.
    Light green  The values are not applicable. They do not fulfill the condition and no alternative validation expression has been defined.
    Green        The values are valid. They fulfill all rule statements.
  5. Hover over each color to display the total number and percentage of values.
    Number and percentage of valid values for a data quality rule. Number and percentage of invalid values for a data quality rule.
    Following the example:
    • In the delivery_country field:
      • 193 values are valid. It means that the order status is In Process and the country is correct against the semantic type Country.
      • 1,170 values are not applicable. It means that the order status is not In Process.
      • 137 values are invalid. It means that the order status is In Process but the country is incorrect against the semantic type Country.
    • In the customer_tin field:
      • 589 values are valid. It means that the customer is identified as a company and the TIN is filled in.
      • 744 values are not applicable. It means that the customer is not identified as a company.
      • 167 values are invalid. It means that the customer is identified as a company but the TIN is not filled in.
  6. For more information on each column, hover over the quality bar of the column.
    Quality bar from the column header.
    The quality bar is composed of the results of the column format and data quality rules.
    You can see up to three colors:
    Color code for the quality bars
    Color  Description
    Red    The values are invalid according to the column format or a data quality rule.
    Gray   The cells are empty.
    Green  The values are valid according to the column format and data quality rules. The non-applicable values (1) from data quality rules are marked as green.

    (1) The values are not applicable when they do not fulfill the condition of the data quality rule, and no alternative validation expression has been defined.
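
The three statuses used in this procedure can be illustrated with a short Python sketch. It is not Talend code and does not reflect how Talend Cloud Data Inventory evaluates rules internally. It only models the behavior described above: a rule has a condition and a validation expression, a value that does not fulfill the condition is not applicable, and a value on which the rule cannot be executed is invalid. The function names, the `customer_type` and `qty` fields, and the sample rows are assumptions made for the illustration. The counts for `delivery_country` are the ones from the example in step 5, and the rounding to one decimal is also an assumption.

```python
VALID, NOT_APPLICABLE, INVALID = "valid", "not applicable", "invalid"


def classify(row, condition, validation):
    """Return the status of one row for a single rule (illustrative only)."""
    try:
        if not condition(row):
            # Condition not fulfilled and no alternative validation expression.
            return NOT_APPLICABLE
        return VALID if validation(row) else INVALID
    except (TypeError, ValueError):
        # The rule cannot be executed on this value, for example when it
        # must compare a string with a number: counted as invalid.
        return INVALID


def percentages(counts):
    """Convert status counts into percentages of the total, one decimal."""
    total = sum(counts.values())
    if total == 0:
        return {status: 0.0 for status in counts}
    return {status: round(100 * n / total, 1) for status, n in counts.items()}


# Hypothetical rule on customer_tin: if the customer is a company,
# the TIN must be filled in. "customer_type" is an assumed field name.
def is_company(row):
    return row["customer_type"] == "company"


def tin_is_filled(row):
    return row["customer_tin"] != ""


rows = [
    {"customer_type": "company", "customer_tin": "FR123"},
    {"customer_type": "company", "customer_tin": ""},
    {"customer_type": "individual", "customer_tin": ""},
]
statuses = [classify(row, is_company, tin_is_filled) for row in rows]
print(statuses)  # ['valid', 'invalid', 'not applicable']

# A rule that cannot be executed: comparing a string with a number.
print(classify({"qty": "ten"}, lambda r: True, lambda r: r["qty"] > 5))  # invalid

# Counts taken from the delivery_country example in step 5.
delivery_country = {VALID: 193, NOT_APPLICABLE: 1170, INVALID: 137}
print(percentages(delivery_country))
# {'valid': 12.9, 'not applicable': 78.0, 'invalid': 9.1}
```

The percentages shown when you hover over the quality bar may be rounded differently in the application. The sketch only shows how the three counts relate to the total number of values in the field.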