What is a data quality rule? - Cloud

Talend Cloud Data Stewardship User Guide


A data quality rule is a set of business requirements that helps you detect anomalies in campaigns.

It defines the values your data must comply with. You can add a condition so that the data quality rule applies only to some of the data.

A data quality rule works as a template:
  1. You create the data quality rule as a standalone object. When you define the rule, you can use variables and specific values.

    Because data quality rules are generic, variables allow you to adapt the rule to each data model by associating the variables with the attributes of that data model.

    Specific values allow you to use the same value in every data model to which you apply the rule.

  2. You apply the data quality rule and adapt it to a data model.

    You associate the variables of the data quality rule with the attributes of the data model.

  3. You create a campaign using the data model.

    Important: Data quality rules can only be used in Resolution and Merging campaigns. The quality bar that shows the results of the data quality rule is not available in other campaign types.

  4. The data quality rule validates your data by categorizing the values:
    • The values are valid. They fulfill all rule statements.
    • The values are not applicable. They do not fulfill the condition and no alternative validation expression has been defined.
    • The values are invalid. They fulfill the condition but not the validation expression, or the rule cannot be executed on those values, for example when the rule must compare a string with a number. For more information on the errors, click the red vertical bar next to the value.
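
For illustration only, the template workflow and the three categories above can be sketched in Python. This is not Talend code and does not use any Talend API: the DataQualityRule class, its fields (validation, condition, otherwise), the apply_to method, the attribute names, and the example rule are all hypothetical. The sketch also assumes that a missing attribute counts as a rule that cannot be executed.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional

Values = Dict[str, Any]   # rule variable name -> value
Record = Dict[str, Any]   # data model attribute name -> value


@dataclass(frozen=True)
class DataQualityRule:
    """Hypothetical stand-in for a rule template (not a Talend API)."""
    name: str
    validation: Callable[[Values], bool]                   # values must satisfy this
    condition: Optional[Callable[[Values], bool]] = None   # limits which data the rule applies to
    otherwise: Optional[Callable[[Values], bool]] = None   # alternative validation expression

    def apply_to(self, mapping: Dict[str, str]) -> Callable[[Record], str]:
        """Associate rule variables with data model attributes (step 2)."""

        def classify(record: Record) -> str:
            try:
                values = {var: record[attr] for var, attr in mapping.items()}
                if self.condition is None or self.condition(values):
                    ok = self.validation(values)
                elif self.otherwise is not None:
                    ok = self.otherwise(values)
                else:
                    # Condition not fulfilled and no alternative expression defined.
                    return "not applicable"
                return "valid" if ok else "invalid"
            except (KeyError, TypeError, ValueError):
                # The rule cannot be executed on these values, for example
                # a string compared with a number (assumption: a missing
                # attribute is treated the same way).
                return "invalid"

        return classify


# Step 1: a generic rule. "country" and "age" are variables;
# "FR" and 18 are specific values shared by every data model.
adult_in_france = DataQualityRule(
    name="adult_in_france",
    condition=lambda v: v["country"] == "FR",
    validation=lambda v: v["age"] >= 18,
)

# Step 2: the same rule adapted to two hypothetical data models.
customers = adult_in_france.apply_to({"country": "country_code", "age": "customer_age"})
employees = adult_in_france.apply_to({"country": "location", "age": "age_years"})

# Step 4: categorizing values.
print(customers({"country_code": "FR", "customer_age": 42}))     # valid
print(customers({"country_code": "US", "customer_age": 12}))     # not applicable
print(customers({"country_code": "FR", "customer_age": 12}))     # invalid
print(customers({"country_code": "FR", "customer_age": "n/a"}))  # invalid (string vs. number)
```

In this sketch the rule itself never names a data model attribute. Only the mapping passed to apply_to does, which is what lets one rule definition serve several data models.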

You can apply the same data quality rule to as many data models as necessary.