Creating a data quality rule in basic mode - 8.0

Talend Data Stewardship User Guide

Version: 8.0
Language: English
Product: Talend Big Data, Talend Big Data Platform, Talend Data Fabric, Talend Data Integration, Talend Data Management Platform, Talend Data Services Platform, Talend ESB, Talend MDM Platform, Talend Real-Time Big Data Platform
Module: Talend Data Stewardship
Content: Administration and Monitoring > Managing users; Data Governance > Assigning tasks; Data Governance > Managing campaigns; Data Governance > Managing data models; Data Quality and Preparation > Handling tasks; Data Quality and Preparation > Managing semantic types
Last publication date: 2024-02-22

About this task

In this example, you work at a university. You have noticed that some datasets were mixed up, and you want to check that the correct scholarship programs were granted to the correct students.

The data quality rule validates that if students have US citizenship and their status code is 2632, then they have been granted a scholarship program that takes effect on September 1st, 2021 and whose code ends with 10AB or 10AC.
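To make the logic of this rule concrete, here is a minimal Python sketch of how one record could be evaluated. It is not how Talend Data Stewardship implements rules. The field names ProgramCode and EffectiveDate are hypothetical, and the sketch assumes the effective date is stored as a date value. Records that do not meet the condition are treated as not applicable, which matches leaving the Else part empty.

```python
from datetime import date


def check_scholarship(record: dict) -> str:
    """Return 'valid', 'invalid', or 'not applicable' for one record."""
    # If part: both subconditions are joined by And.
    # StatusCode is compared as text so that 2632 and "2632" both match.
    applies = (
        str(record["StatusCode"]) == "2632"
        and record["Citizenship"] == "US"
    )
    if not applies:
        # No Else part is defined, so the record is outside the rule's scope.
        return "not applicable"

    # Then part (assumed grouping): (code ends with 10AB Or 10AC) And date matches.
    # ProgramCode and EffectiveDate are hypothetical field names.
    code_ok = record["ProgramCode"].endswith(("10AB", "10AC"))
    date_ok = record["EffectiveDate"] == date(2021, 9, 1)
    return "valid" if code_ok and date_ok else "invalid"


student = {
    "StatusCode": 2632,
    "Citizenship": "US",
    "ProgramCode": "SCH10AB",
    "EffectiveDate": date(2021, 9, 1),
}
print(check_scholarship(student))  # valid
```

A record with a different citizenship or status code returns 'not applicable' rather than 'invalid', because the rule only makes a claim about students who meet the condition.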

Here is a sample of the dataset:

Procedure

  1. Log in as a rule manager.
  2. In the left panel, click Data quality rules > Add rule.
  3. Enter the name: ScholarshipProgram.
  4. Enter the description: Checking the status code 2632.
    The description is optional. It helps you find a rule when the rule names are similar.
  5. In the If part, click Add a row:
    1. Select Variable and enter the name StatusCode.
      The supported characters are [a-z], [A-Z], [0-9] and special characters: _ . @ $ #.
      Note: Data quality rules are templates. You will associate the variables with attributes when applying the rule to a data model.
    2. Select the operator is equal to.
      For more information on the operators, see the list and examples.
    3. Select Value and enter 2632.
    4. Add a row to add the subcondition: Citizenship is US.
    5. Select the logical operator And.
  6. In the Then part, add three rows:
    1. To group the first two rows, hover over the rows and select the check boxes on the right.
    2. Click Group in the actions bar.
      For more information on the actions, see Managing the rows.
    3. Select the logical operator And.
    4. Define all fields to validate that the students have been granted a scholarship program taking effect on September 1st, 2021 and whose code ends with 10AB or 10AC.
    The Else part allows you to define an alternative outcome for when the If condition is not fulfilled.

    For this example, leave the Else part empty. The values that do not fulfill the condition will be categorized as non-applicable values.

    The data quality rule is defined as follows:
  7. Click Save.
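
If you prepare variable names outside the application, you can check them against the supported characters listed in step 5 before entering them. The following Python sketch is one way to do that. The helper name is hypothetical, and the pattern only mirrors the character list given in this procedure. It does not reproduce any other validation the application may perform, such as a length limit.

```python
import re

# Characters listed as supported in this procedure:
# [a-z], [A-Z], [0-9] and the special characters _ . @ $ #
_VARIABLE_NAME = re.compile(r"[A-Za-z0-9_.@$#]+")


def is_supported_variable_name(name: str) -> bool:
    """Return True if every character of name is in the supported set."""
    # fullmatch requires the whole string to match, so an empty name fails.
    return _VARIABLE_NAME.fullmatch(name) is not None


print(is_supported_variable_name("StatusCode"))   # True
print(is_supported_variable_name("Status Code"))  # False, contains a space
```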

What to do next

You can now apply the data quality rule to data models.