Using a reference validation rule - 6.1

Talend Data Fabric Studio User Guide

EnrichVersion
6.1
EnrichProdName
Talend Data Fabric
task
Data Governance
Data Quality and Preparation
Design and Development
EnrichPlatform
Talend Studio

After setting up a validation rule, you can apply it on your Job designs. For example, let's apply a referential integrity check validation rule when updating a database with new data. But before uploading those new data we want to make sure they match the ones already existing in the database. This scenario is based on the reference check validation rule created in Designing a Job. For more information about this rule, see Referential rule.

Setting up the Job environment

  1. From the Palette, drop these components onto the design workspace: a database input component and a database output component, here tMysqlInput and tMysqlOutput, to upload the data, a tLogRow component to display the rejected data in the console and a tJava component to display the number of lines processed in the console.

  2. Connect the input and output database components using a Row > Main link, and connect the tMysqlInput to the tJava components using a Trigger > OnSubjobOk link.

    Note

    You will be able to create the reject link between the tMysqlOutput and tLogRow components only when you will have applied the validation rule to the tMysqlOutput component.

Configuring the components

  1. Double-click the tMysqlInput component to display its Basic settings.

  2. Select Repository as Property type and click the three-dot button next to the field to retrieve the connection properties that corresponds to the metadata you want to check.

  3. Select Repository from the Schema drop down list and click the three-dot button next to the field to retrieve the schema that corresponds to your database table.

  4. Click the three-dot button next to the Table field to select the table to check.

  5. Click Guess Query to automatically retrieve the query corresponding to the table schema.

  6. Double-click the tMysqlOutput component to display its Basic settings.

  7. Select Repository as Property type and click the three-dot button next to the field to retrieve the connection properties that corresponds to the database table in which you want to load the new data.

  8. Click the three-dot button next to the Table field to select the table in which you will load the data.

  9. In the Action on table list, select Default and in the Action on data list, select the action corresponding to the one(s) defined in the validation rule you apply on the Job. Here, as we selected On insert and On update in the referential check validation rule we use, so select Update or insert to trigger the rule.

  10. If the schema of the input and output components did not synchronize automatically, click Sync columns and the schema of the input flow will automatically be retrieved.

Applying the validation rule and viewing the Job execution result

  1. Click the Validation Rules tab in the Component view of tMysqlOutput.

  2. Select the Use an existing validation rule check box to apply the validation rule to the component.

  3. In the Validation Rule Type list, select Repository and click the three-dot button to select the validation rule from the [Repository Content] window.

  4. Right-click tMysqlOutput, select Row> Rejects in the menu and drag to tLogRow to create a reject link between the two components.

    If you have enabled the Reject link option for this validation rule you can retrieve the rejected data to the reject flow.

  5. Double-click the tJava component to display its Basic settings.

  6. In the Code field, type in the code that will display the number of updated, inserted and rejected lines processed:

    System.out.println("Updates: "+((Integer)globalMap.get("tMysqlOutput_1_NB_LINE_UPDATED"))+"\nInserts:
    "+((Integer)globalMap.get("tMysqlOutput_1_NB_LINE_INSERTED"))+"\nRejects:
    "+((Integer)globalMap.get("tLogRow_1_NB_LINE")));

  7. Save your Job and press F6 to execute it.

Valid data is inserted or updated in the database table and the console displays the rows rejected by the validation rule, along with the number of updates, inserts and rejects processed in the Job.