Centralizing a Validation Rule - 6.5

Talend Big Data Studio User Guide


A validation rule is a basic or integrity rule that you can apply to metadata items to check the validity of your data. It can be a basic check for correct values or a referential integrity check. Both apply to database tables or individual columns, file metadata, or any other relevant metadata item.

All your business and validation rules can be centralized in the Repository metadata, where you can modify, activate, deactivate and delete them as needed.

Validation rules can be defined either from the Validation Rules metadata entry or directly from the metadata schema or columns you want to check. They are then used in your Job designs at the component level. Data that does not pass the validation check can be retrieved through a reject link for further processing, if necessary.

To see how to use a validation rule in a Job design, see Validation rules Job example at https://help.talend.com.

Defining the general properties

To create a validation rule, complete the following:

  1. In the Repository tree view, expand Metadata, right-click Validation Rules, and select Create validation rule from the contextual menu.

    Or

    In the Repository tree view, expand Metadata and then expand the metadata item you want to check. Right-click either the schema of that metadata item or a column of that schema, and select Add validation rule... from the contextual menu.

    For more information about metadata compatible with validation rules, see Selecting the trigger and type of validation.

    The validation rule wizard displays.

  2. Fill in the general information of the metadata, such as Name, Purpose and Description. The Status field is a custom field that you can define. For more information, see Status settings.

  3. Click Next to proceed to the next step.

Selecting the schema to validate

In this step, select the schema or the column(s) you want to check.

  1. In the tree view on the left of the window, select the metadata item you want to check.

  2. In the panel on the right, select the column(s) on which you want to perform the validity check.

    Note

    At least one column must be selected.

  3. Click Next to proceed to the next step.

Selecting the trigger and type of validation

  1. In this step, you can select the action that will trigger the rule:

    • On select,

    • On insert,

    • On update,

    • On delete.

    Note

    Some of the rule trigger options may be disabled depending on the type of metadata you are checking. For example, if the metadata is a file, the On update and On delete triggers are not applicable.

    Please refer to the following table for a complete list of supported (enabled) options:

    Metadata item       On select   On insert   On update   On delete
    Database Table      Y           Y           Y           Y
    Database View       Y           -           -           -
    Database Synonym    Y           -           -           -
    SAP                 Y           -           -           -
    File Delimited      Y           Y           -           -
    File Positional     Y           Y           -           -
    File RegEx          Y           Y           -           -
    File XML            Y           Y           -           -
    File Excel          Y           Y           -           -
    File LDIF           Y           Y           -           -
    LDAP                Y           Y           Y           Y
    Salesforce          Y           Y           Y           Y
    Generic Schema      -           -           -           -
    HL7                 Y           -           -           -
    Talend MDM          Y           Y           Y           Y
    WSDL                Y           -           -           -

    Validation rules are not supported for metadata items that do not appear in the table above.

    When you select the On select trigger, apply the validation rule to the input components of your Job designs. When you select the On insert, On update or On delete trigger, apply the validation rule to the output components.

    You can also select the type of validation you want to perform:

    • a referential integrity validation rule that will check your data against a reference data,

    • a basic restriction validation rule that will check the validity of the values of the selected field(s) with basic criteria,

    • a custom code validation rule allowing you to specify your own Java or SQL based criteria.

  2. Choose to create a referential rule, a basic rule, or a custom rule.

    Referential rule

    To create a referential integrity check validation rule:

    1. In the Trigger time settings area, select the option corresponding to the action that will trigger the validation. In this example, the On insert and On update options are selected, so data is checked whenever an insert or update action is performed.

    2. In the Rule type settings area, select the type of validation you want to apply: Reference Check, Basic Value Check or Custom Check. To check data by reference, select Reference Check.

    3. Click Next.

    4. In this step, select the database schema that will be used as reference.

    5. Click Next.

    6. In the Source Column list, select the column name you want to check and drag it to the Target column against which you want to compare it.

    7. Click Next to define how to handle rejected data.
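
As a rough illustration of what a reference check verifies, the sketch below tests whether a source column value exists among the values of a reference column. It is not code generated by Talend Studio. The class name `ReferenceCheck`, the method `passes` and the sample state codes are hypothetical names chosen for this example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only: not Talend-generated code.
class ReferenceCheck {

    // A source value passes when it is present in the reference column values.
    static boolean passes(String sourceValue, Set<String> referenceValues) {
        return referenceValues.contains(sourceValue);
    }

    public static void main(String[] args) {
        // Assumed sample reference data standing in for the Target column.
        Set<String> referenceStates = new HashSet<>(Arrays.asList("CA", "NY", "TX"));

        System.out.println(passes("CA", referenceStates)); // value found in the reference
        System.out.println(passes("ZZ", referenceStates)); // value missing from the reference
    }
}
```

A row whose source value is missing from the reference data would be the kind of row that is treated as rejected in the next step.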

    Basic rule

    To create a basic check validation rule:

    1. In the Trigger time settings area, select the option corresponding to the action that will trigger the validation. In this example, the On select option is selected, so the check is performed when data is read.

    2. In the Rule type settings area, select the type of validation you want to apply: Reference Check, Basic Value Check or Custom Check. To make a basic check of data, select Basic Value Check.

    3. Click Next to go to the next step.

    4. Click the plus button at the bottom of the Conditions table to add as many conditions as required, and select And or Or to combine them. In this example, the goal is to ignore empty Phone number fields, so two conditions are added: retrieve data that is not empty and data that is not null.

    5. Click Next to define how to handle rejected data.
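
The two conditions in the basic rule example, combined with And, can be pictured as the following Java predicate. This is a minimal sketch and not what Talend Studio generates. The class `BasicPhoneCheck` and its methods are hypothetical, and the phone column is assumed to be a String.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: not Talend-generated code.
class BasicPhoneCheck {

    // Both conditions must hold (And): the value is not null and not empty.
    static boolean passes(String phone) {
        return phone != null && !phone.isEmpty();
    }

    // Keeps only the phone values that satisfy both conditions.
    static List<String> filter(List<String> phones) {
        List<String> kept = new ArrayList<>();
        for (String phone : phones) {
            if (passes(phone)) {
                kept.add(phone);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Assumed sample values: one valid, one empty, one null.
        System.out.println(filter(Arrays.asList("555-0100", "", null)));
    }
}
```

The null test comes first so that `isEmpty()` is never called on a null value.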

    Custom rule

    To create a custom validation rule:

    1. In the Trigger time settings area, select the option corresponding to the action that will trigger the validation. In this example, the On select option is selected, so the check is performed when data is read.

    2. In the Rule type settings area, select the type of validation you want to apply: Reference Check, Basic Value Check or Custom Check. To make a custom check of data, select Custom Check.

    3. Click Next.

    4. In this step, type your Java condition directly in the text box, or click Expression Editor to open the [Expression Builder], which helps you create your Java condition. Use input_row.columnname, where columnname is the name of a column of your schema, to refer to the input column. In this example, data passes the check if the value of the idState column is greater than 0 and less than 51. For more information about the Expression Builder, see Working with expressions.

    5. Click Next to define how to handle rejected data.
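
The idState condition from the custom rule example can be written as the Java expression `input_row.idState > 0 && input_row.idState < 51`. The sketch below wraps that expression so it can be run on its own. The `Row` class is a hypothetical stand-in for the row object that the Studio provides, and the column type `int` is an assumption.

```java
// Hypothetical illustration only: Row stands in for the Studio's row object.
class CustomStateCheck {

    // Assumed shape of an input row with a single int column named idState.
    static class Row {
        int idState;

        Row(int idState) {
            this.idState = idState;
        }
    }

    // The parameter is named input_row to match the input_row.columnname
    // convention used in the wizard; the body is the condition itself.
    static boolean passes(Row input_row) {
        return input_row.idState > 0 && input_row.idState < 51;
    }

    public static void main(String[] args) {
        System.out.println(passes(new Row(1)));  // lower bound of the accepted range
        System.out.println(passes(new Row(50))); // upper bound of the accepted range
        System.out.println(passes(new Row(51))); // just outside the accepted range
    }
}
```

Both comparisons are strict, so 0 and 51 themselves do not pass the check.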

Handling rejected data

In this step:

  1. Select Disallow the operation so that data failing the condition is not output.

  2. Select Make rejected data available on REJECT link in job design to retrieve the rejected data in another output.

  3. Click Finish to create the validation rule.

    Once created, the validation rule displays:

    • in the Repository tree view, under the Metadata > Validation Rules node,

    • under the Validation Rules node of the table you check.
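
The reject handling described in this section can be pictured as splitting rows into two outputs. The sketch below is only a conceptual model under stated assumptions, not how Talend Studio implements the REJECT link. The class `RejectRouting`, the `Result` holder and the `keepRejects` flag are hypothetical names, with `keepRejects` standing for the Make rejected data available on REJECT link in job design option.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical illustration only: a conceptual model of main and reject outputs.
class RejectRouting {

    // Holds the rows sent to the main output and to the reject output.
    static class Result<T> {
        final List<T> accepted = new ArrayList<>();
        final List<T> rejected = new ArrayList<>();
    }

    // Rows that fail the rule never reach the main output.
    // They are kept in the reject output only when keepRejects is true.
    static <T> Result<T> route(List<T> rows, Predicate<T> rule, boolean keepRejects) {
        Result<T> result = new Result<>();
        for (T row : rows) {
            if (rule.test(row)) {
                result.accepted.add(row);
            } else if (keepRejects) {
                result.rejected.add(row);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed sample idState values checked against the range 1 to 50.
        Result<Integer> result =
                route(Arrays.asList(0, 10, 51), v -> v > 0 && v < 51, true);
        System.out.println("main output:   " + result.accepted);
        System.out.println("reject output: " + result.rejected);
    }
}
```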