
What is a data quality rule?

A data quality rule is a set of business requirements that helps you detect anomalies in datasets.

It defines which values your data must comply with. You can add a condition so that the data quality rule applies only to some of the data.

A data quality rule works as a template:
  1. You create the data quality rule as a standalone object. When you define the rule, you can use variables and specific values.

    Because data quality rules are generic, variables let you adapt the rule to each dataset by associating them with the fields of that dataset.

    Specific values let you use the same value in all datasets to which you apply the rule.

  2. You apply the data quality rule and adapt it to a field.

    You associate the variables of the data quality rule with the fields. You can apply a rule to a field to validate data from other fields.

  3. The data quality rule validates your data by categorizing the values:
    • The values are valid. They fulfill all rule statements.
    • The values are not applicable. They do not fulfill the condition and no alternative validation expression has been defined.
    • The values are invalid. They fulfill the condition but not the validation expression, or the rule cannot be executed on those values, for example when the rule must compare a string with a number.
You can apply the same data quality rule to as many fields as necessary.
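
The template behavior described above can be sketched in a few lines of Python. This is only an illustration of the logic, not the product's implementation or API: the `Rule` class, the `apply_rule` function, the field names, and the threshold of 100 are all hypothetical. The rule is written against variables (`country`, `amount`), a binding associates each variable with a field of one particular dataset, and every row ends up in one of the three categories.

```python
from dataclasses import dataclass
from typing import Any, Callable, Mapping, Optional

Values = Mapping[str, Any]


@dataclass
class Rule:
    """Hypothetical stand-in for a data quality rule (not a real API)."""
    validation: Callable[[Values], bool]            # validation expression
    condition: Optional[Callable[[Values], bool]] = None  # optional condition


def apply_rule(rule: Rule, binding: Mapping[str, str], row: Values) -> str:
    """Categorize one row as 'valid', 'invalid' or 'not applicable'.

    `binding` maps each rule variable to a field name of the dataset.
    """
    # Resolve the rule variables to the values of the associated fields.
    values = {variable: row[field] for variable, field in binding.items()}
    try:
        # The condition is not fulfilled: the rule does not apply to this row.
        if rule.condition is not None and not rule.condition(values):
            return "not applicable"
        return "valid" if rule.validation(values) else "invalid"
    except TypeError:
        # The rule cannot be executed, e.g. a string compared with a number.
        return "invalid"


# Generic rule: "when country is FR, amount must be at least 100".
# 'FR' and 100 play the role of specific values shared by all datasets.
rule = Rule(
    condition=lambda v: v["country"] == "FR",
    validation=lambda v: v["amount"] >= 100,
)

# Adapting the rule to one dataset: variables are associated with its fields.
binding = {"country": "billing_country", "amount": "total"}

rows = [
    {"billing_country": "FR", "total": 250},
    {"billing_country": "FR", "total": 40},
    {"billing_country": "US", "total": 40},
    {"billing_country": "FR", "total": "n/a"},
]

results = [apply_rule(rule, binding, row) for row in rows]
print(results)
```

Under these assumptions, the four rows are categorized as valid, invalid, not applicable, and invalid respectively. Applying the same rule to another dataset only requires a different binding, for example `{"country": "ship_to", "amount": "order_value"}`, while the rule itself stays unchanged.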

Data quality rules affect the quality of your dataset and the Talend Trust Score™.

For more information on the effects on the quality of one dataset, see:

For more information on the effects on the global quality in the Data console tab, see:
