Scenario 2: Extracting zip codes using DRL rules you create from the Studio - 6.1

Talend Components Reference Guide

This scenario describes a Job built from a tFixedFlowInput, a tRules and two tLogRow components. In it, you create business rules in DRL format from the Studio, then use these rules to retrieve the zip codes of two specific cities that you define in the rules.

Creating the DRL rule template

  1. In the Repository tree view, expand Metadata > Rules management.

  2. Right-click Embedded Rules and select Create Rule.

  3. In the wizard that opens, enter a name for the rule template, fill in its settings as needed and click Next.

  4. Select the Create option and from the Type of rule resource list, select New DRL.

  5. Click Finish.

    A rule template is created and opened in a rule editor in the workspace.

This rule template is embedded in a tRules component. You can define one or several DRL rules in the template from inside the tRules component.

Designing the Job and configuring the input data

  1. Drop a tFixedFlowInput and two tLogRow components from the Palette to the design workspace.

  2. From the Embedded Rules node in the Repository tree view, drop the rule template you created.

    A tRules component with the embedded rule template is displayed on the workspace.

  3. Link tFixedFlowInput to tRules using a Row > Main link.

  4. Double-click tFixedFlowInput to display its Basic settings view and define the component properties.

  5. Click the [...] button next to Edit schema to open the schema editor.

  6. Add two columns using the [+] button, name them zipCode and CityName, and click OK.

    When you define the DRL rules, you will use the zipCode column to match the city zip codes and the CityName column to output the name of the city that matches the zip code.

    Note

    Make sure that the name of the column you use to match the zip code starts with a lowercase letter; otherwise, you will get an error when you try to run the Job.

  7. In the Mode area, select Use Inline Content (delimited file).

  8. Set the row and field separators, and in the Content table, type in the delimited data on which to apply the DRL rules.
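
To make the role of the two separators concrete, here is a minimal Java sketch of how inline delimited content could be split into rows and then into the zipCode and CityName fields. It is an illustration only, not the code that tFixedFlowInput generates. The class name InlineContentSketch, the parse method, the separators ("\n" and ";") and the sample zip codes are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Talend: shows what row and field separators do.
class InlineContentSketch {

    /** Splits the content into rows, then splits each row into its fields. */
    static List<String[]> parse(String content, String rowSeparator, String fieldSeparator) {
        List<String[]> rows = new ArrayList<>();
        // Pattern.quote makes the separators literal text instead of regular expressions.
        for (String row : content.split(Pattern.quote(rowSeparator))) {
            if (row.isEmpty()) {
                continue; // ignore blank rows
            }
            // The -1 limit keeps trailing empty fields, such as an empty CityName.
            rows.add(row.split(Pattern.quote(fieldSeparator), -1));
        }
        return rows;
    }

    public static void main(String[] args) {
        // Assumed sample data: zipCode;CityName, with CityName left empty
        // so that the rules can fill it in later.
        String content = "75001;\n92150;\n69001;";
        for (String[] fields : parse(content, "\n", ";")) {
            System.out.println("zipCode=" + fields[0] + ", CityName=" + fields[1]);
        }
    }
}
```

With these assumed separators, each line of the Content table becomes one row and each semicolon-delimited value becomes one column of the schema defined in the previous steps.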

Defining the DRL rules

Setting the rule schemas

  1. In the design workspace, double-click tRules to display its Basic settings view and define the component properties.

    The Property Type is automatically set to Repository as you have already stored the rule template in the Studio.

  2. Click the [...] button to open a dialog box that lists the DRL rules stored locally in the repository.

  3. Select the rule template in which you want to define the rule schemas, ZipCodeRuleSet in this example, and then click OK.

  4. Use the [+] button to add two rows to the Outputs table, click in the Schema column and then click the [...] button.

  5. In the dialog box that opens, name the first output schema Paris and click OK.

  6. In the schema editor that opens, define your output schema: copy zipCode and CityName from the input flow to the output flow and click OK.

  7. Do the same to create a second output schema named Suresnes, and copy the two input columns to its output flow.

    Each of the two output schemas will use one of the two DRL rules you will define in the rule template.

  8. Right-click tRules and select Row > Paris to link the component to the first tLogRow.

  9. Do the same and link tRules to the second tLogRow using the Row > Suresnes link.

Creating the DRL rules

  1. In the Outputs table, click in the Rule column and then click the [...] button of the Paris schema.

  2. In the dialog box that opens, select one of the following options:

    Select                           To...
    Edit Rules                       open the rule in the rule editor in the workspace.
    Create a rule with guide         open a dialog box where you can define a rule in the rule template.
    select a rule from repository    select a predefined rule from the rule template created and stored in the Repository tree view.

    In this example, select the Create a rule with guide option.

  3. In the dialog box that opens, use the Drools syntax to set the condition of the "Paris" rule as follows: zipCode matches "75\\d{3}", and then click OK.

    The new "Paris" rule is generated and displayed in the Rule column. This rule retrieves, for the Paris schema, all zip codes that consist of 75 followed by three digits.

  4. Click in the Rule column and then click the [...] button of the "Paris" rule.

    The rule template is opened in the rule editor in the workspace.

  5. In the "Paris" rule, add the code output.CityName = "Paris" to output Paris as the city name in the first output flow.

  6. Repeat the above steps to create a "Suresnes" rule and set its condition as follows: zipCode == "92150".

    The new rule is displayed in the Rule column. This rule retrieves from the Suresnes schema all zip codes that are equal to 92150.

  7. In the "Suresnes" rule, add the code output.CityName = "Suresnes" to output Suresnes as the city name in the second output flow.

  8. In the design workspace, double-click each of the tLogRow components to define its properties.

    For more information, see tLogRow.

  9. Save your Job and press F6 to execute it.

    The Run console displays two output flows with zip codes and city names.

    In the first output flow, the "Paris" rule retrieves all zip codes that start with 75 and writes the city name as Paris.

    In the second output flow, the "Suresnes" rule retrieves all zip codes that are equal to 92150 and writes the city name as Suresnes.
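
The effect of the two conditions can be reproduced in plain Java, as in the sketch below. It is not how tRules or the Drools engine evaluates the rules internally. It assumes that the Drools matches operator behaves like Java's String.matches, meaning that the whole value must match the regular expression. The class name ZipCodeRulesSketch, the cityFor method and the sample zip codes are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical illustration of the "Paris" and "Suresnes" conditions.
class ZipCodeRulesSketch {

    // Same pattern as the "Paris" condition: 75 followed by exactly three digits.
    // As in the DRL string, the backslash is doubled inside the Java string literal.
    private static final Pattern PARIS = Pattern.compile("75\\d{3}");

    /** Returns the city name for a zip code, or null when neither condition applies. */
    static String cityFor(String zipCode) {
        // matches() succeeds only if the entire zip code fits the pattern.
        if (PARIS.matcher(zipCode).matches()) {
            return "Paris";
        }
        // Counterpart of the condition zipCode == "92150".
        if ("92150".equals(zipCode)) {
            return "Suresnes";
        }
        return null;
    }

    public static void main(String[] args) {
        // Assumed sample input, not taken from the scenario.
        String[] samples = {"75001", "92150", "69001", "750011"};
        for (String zipCode : samples) {
            System.out.println(zipCode + " -> " + cityFor(zipCode));
        }
    }
}
```

Under that assumption, a six-digit value such as 750011 satisfies neither condition, because the pattern allows exactly three digits after 75.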