Scenario 1: Extracting client data according to business rules stored in an external file - 6.1

Talend Components Reference Guide

This scenario describes a four-component Job that reads client data and retrieves only the clients that match business rules stored in an external Drools file.

Prerequisites

For this example, you must have a Drools file in .xls or .drl format that holds the business rules you will use in the Job.

In this example, the business rules are defined in an Excel file as follows:

  • The Import field (cell C:2) must follow this format: <projectname>.<lowercase jobname>_0_1.<jobname>.*. For example, dq_project.business_rule_0_1.Business_Rule.* means that the name of the project in the studio is dq_project and the name of the Job is Business_Rule.

    Make sure to define in this field the exact project and Job names you have in your studio.

  • The RuleAGE rule retrieves all clients whose age is between 30 and 39 and writes them to the first output flow.

  • The RuleREGION rule retrieves all clients who live in the EMEA region and writes them to the second output flow.

    In the output schema of the tRules component, make sure to use the exact names of the rules defined in the Excel Drools file.
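The Import field format described above can be illustrated with a small sketch. The class and method names below are hypothetical helpers, not part of Talend, and the sketch assumes that the project name is used exactly as typed and that the `_0_1` segment is kept literal, as in the example in the text.

```java
import java.util.Locale;

// Hypothetical helper, not part of Talend: builds the value expected
// in the Import field (cell C:2) of the Excel Drools file.
final class DroolsImportField {

    // Pattern taken from the text: <projectname>.<lowercase jobname>_0_1.<jobname>.*
    // Assumption: the project name is used as-is and "_0_1" is kept literal.
    static String importFor(String projectName, String jobName) {
        return projectName + "." + jobName.toLowerCase(Locale.ROOT)
                + "_0_1." + jobName + ".*";
    }

    public static void main(String[] args) {
        // prints: dq_project.business_rule_0_1.Business_Rule.*
        System.out.println(importFor("dq_project", "Business_Rule"));
    }
}
```

Comparing the printed value with the content of cell C:2 is a quick way to spot a mismatch in the project or Job name.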

Designing the Job and configuring the input data

  1. Drop the following components from the Palette to the design workspace: tFileInputExcel, tRules and two tLogRow components.

  2. Double-click tFileInputExcel to display its Basic settings view and define the component properties.

  3. In the Property type list, select Built-in and fill in the fields that follow.

  4. Click the three-dot button next to the File Name/Stream field and browse to the source file to set its path and name. The source file used in this example is called client and it holds client data.

    If needed, right-click tFileInputExcel and select Data viewer to have a preview of the input data.

  5. Select the All sheets check box to retrieve the data from all sheets of the Excel file.

  6. From the Schema list, select Built-in and click Edit schema to open a dialog box where you can define the schema of the input file.

    In this example, the source file holds four columns: id, name, age and region.

  7. Click OK to validate your changes and close the dialog box.

Defining the business rules

Setting the rule schemas

  1. In the design workspace, double-click tRules to display its Basic settings view and define the component properties.

  2. In the Property type list, select Repository if you have stored the file that holds the business rules in the Metadata > Rules management node of the Repository tree view. If not, select Built-in and browse to the Drools file.

    For more information on how to create and store business rules, see Talend Studio User Guide.

  3. In the Outputs table, click the plus button to add two rows that represent two different output flows, each using one of the two business rules defined in the Drools file.

  4. Click in the first row of the Schema column to display a three-dot button. Click the button to open a dialog box and set a name for the output schema.

  5. Enter the exact name of the first rule as it is written in the Drools file, RuleAGE in this example, and then click OK.

    A dialog box opens.

  6. Define your output schema. In this example, we want to reuse the input schema as the output schema. Click OK to close the dialog box.

  7. Do the same in the second line of the Schema column.

    Enter the exact name of the second rule as it is written in the Drools file, RuleREGION in this example, to use it as the name of the second output schema, and then reuse the input schema in the dialog box that opens.

    An error message is displayed when you try to execute the Job if the names of the output schemas in your Job do not exactly match the names of the rules in the Drools file.
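The naming constraint above can be checked before running the Job. The following is a minimal sketch with a hypothetical helper class; it assumes that "exact" means a case-sensitive string match, which the text implies but does not spell out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical pre-flight check, not part of Talend.
final class SchemaNameCheck {

    // Returns the output schema names that have no exactly matching rule name.
    // Assumption: the comparison is case-sensitive.
    static List<String> unmatched(List<String> schemaNames, List<String> ruleNames) {
        List<String> missing = new ArrayList<>();
        for (String schemaName : schemaNames) {
            if (!ruleNames.contains(schemaName)) {
                missing.add(schemaName);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        List<String> rules = Arrays.asList("RuleAGE", "RuleREGION");
        // "RuleRegion" differs in case from "RuleREGION", so it is reported.
        // prints: [RuleRegion]
        System.out.println(unmatched(Arrays.asList("RuleAGE", "RuleRegion"), rules));
    }
}
```

An empty result means every output schema name has a matching rule name.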

Selecting the rules

  1. Click in the first line of the Rule column to display a three-dot button. Click the button to open the [Rule] dialog box.

  2. Select the option check box that corresponds to your needs:

    • View Rules: to open the business rule file in read-only mode, or

    • Select a rule from repository: to select the relevant predefined rule from the business rules file stored in the Repository tree view.

  3. In the Rule list, select the rule you want to apply to the first output schema, RuleAGE, and then click OK.

    The selected rule is displayed in the Rule column.

    In this example, we want to apply RuleAGE to the first output schema and RuleREGION to the second output schema.

  4. Do the same to select the rule for the second output schema, RuleREGION, and then click OK.

  5. In the design workspace, double-click each of the tLogRow components to define its properties. For more information, see tLogRow.

  6. Save your Job and press F6 to execute it.

    The Run console displays the two output flows: the first output flow lists all clients whose age is between 30 and 39, and the second output flow lists all clients who live in the EMEA region.
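To make the expected result concrete, here is a plain-Java sketch of what the two rules select. It is only a stand-in for illustration, not how tRules or Drools evaluates rules. The `Client` type, the column types, the sample rows and the inclusive age bounds are all assumptions. Each rule is evaluated independently here, so a client can appear in both lists; the text does not state whether tRules behaves the same way.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Plain-Java stand-in for the two business rules (illustration only).
final class RuleFlowsSketch {

    // Hypothetical row type mirroring the four input columns; types are assumed.
    static final class Client {
        final int id;
        final String name;
        final int age;
        final String region;

        Client(int id, String name, int age, String region) {
            this.id = id;
            this.name = name;
            this.age = age;
            this.region = region;
        }
    }

    // Stand-in for RuleAGE: age between 30 and 39, bounds assumed inclusive.
    static boolean ruleAge(Client c) {
        return c.age >= 30 && c.age <= 39;
    }

    // Stand-in for RuleREGION: client lives in the EMEA region.
    static boolean ruleRegion(Client c) {
        return "EMEA".equals(c.region);
    }

    // Collects the rows that satisfy one rule, preserving input order.
    static List<Client> flow(List<Client> input, Predicate<Client> rule) {
        List<Client> out = new ArrayList<>();
        for (Client c : input) {
            if (rule.test(c)) {
                out.add(c);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up sample rows, not taken from the client file.
        List<Client> clients = Arrays.asList(
                new Client(1, "Ana", 34, "EMEA"),
                new Client(2, "Ben", 45, "EMEA"),
                new Client(3, "Chloe", 31, "APAC"),
                new Client(4, "Dev", 52, "AMER"));

        // prints: RuleAGE: Ana, then RuleAGE: Chloe
        for (Client c : flow(clients, RuleFlowsSketch::ruleAge)) {
            System.out.println("RuleAGE: " + c.name);
        }
        // prints: RuleREGION: Ana, then RuleREGION: Ben
        for (Client c : flow(clients, RuleFlowsSketch::ruleRegion)) {
            System.out.println("RuleREGION: " + c.name);
        }
    }
}
```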