Using context variables to filter different data at execution time - Cloud

Talend Cloud Pipeline Designer User Guide

Version: Cloud
Language: English
Product: Talend Cloud
Module: Talend Pipeline Designer
Last publication date: 2024-02-09

In this scenario, a context variable is added to override the value used to filter user data at execution time.

A pipeline named 'Filter user data with context variables' shows a Test dataset as the pipeline source, a Filter processor with context variables, and another Test dataset as the pipeline destination.

Before you begin

  • You have previously created a connection to the system storing your source data, here a Test connection.

  • You have previously added the dataset holding your source data.

    Here, the dataset holds user information such as names, company, email, and account balance. For more information, see Creating a test dataset.

  • You have also created the destination test dataset that will store the log output.

Procedure

  1. Click Add pipeline on the Pipelines page. Your new pipeline opens.
  2. Give the pipeline a meaningful name.

    Example

    Filter user data with context variables
  3. Click ADD SOURCE to open the panel allowing you to select your source data, here user data.
  4. Select your dataset and click Select in order to add it to the pipeline.
    Rename it if needed.
  5. Click Plus and add a Filter processor to the pipeline. The Configuration panel opens.
  6. Give the processor a meaningful name, for example filter on balances >= $3,000.
  7. In the Filter area:
    1. Select .balance in the Input area, as you want to filter the records corresponding to the user account balances.
    2. Select None in the Optionally select a function to apply list, select >= in the Operator list, and type $3,000 in the Value field, as you want to keep only the users with an account balance greater than or equal to 3,000 dollars.
  8. Click Save to save your configuration.

    You can see that the records are filtered and only 4 records meet the criteria you have defined:

    The preview panel shows the input data before the filtering operation, and the output data after the filtering operation.
  9. Click the ADD DESTINATION item on the pipeline to open the panel allowing you to select the dataset that will hold your filtered data.
  10. Give the Destination a meaningful name, for example log output.
  11. In the Configuration tab of the Destination dataset, enable the Log records to STDOUT option in order to print the read records in the pipeline execution logs.
  12. (Optional) If you execute your pipeline at this stage, the logs show that the 4 records you saw in the data preview passed the filter you defined:
    The Logs panel indicates that 7 records have been read, and 4 records have been produced during the pipeline execution.
  13. Go back to the Configuration tab of the Filter processor to add and assign a variable:
    In the Configuration panel of the Filter processor, the X icon that allows you to add context variables is highlighted.
    1. Click the icon next to the Value field to open the [Assign a variable] window.
    2. Click Add variable.
    3. Give a name to your variable, balance_amount for example.
    4. Enter the variable value that will overwrite the default value, $1,000 here.
    5. Enter a description if needed and click Add.
    6. Now that your variable is created, you are redirected to the [Assign a variable] window that lists all context variables. Select yours and click Assign.
      In the 'Assign a variable' window, the new variable is selected and the 'Assign' button is enabled.
      Your variable and its value are assigned to the Value field of the filter, which means the $1,000 value will overwrite the $3,000 value you have defined previously.
    7. Click Save to save your configuration.
  14. On the top toolbar of Talend Cloud Pipeline Designer, click the Run button to open the panel allowing you to select your run profile.
  15. Select your run profile in the list (for more information, see Run profiles), then click Run to run your pipeline.
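
The Filter settings used in this procedure (input, operator, and value) can be pictured as a simple predicate applied to each record. The following Python sketch is only an illustration and is not how Talend Cloud Pipeline Designer is implemented: FilterRule, parse_amount, and the sample records are hypothetical names and data, and the sketch assumes that balances are compared numerically once the currency symbol and thousands separators are removed.

```python
import operator
from dataclasses import dataclass
from typing import Any, Callable, Dict

# Operators this sketch understands; the product's Operator list may offer others.
OPERATORS: Dict[str, Callable[[float, float], bool]] = {
    ">=": operator.ge,
    ">": operator.gt,
    "<=": operator.le,
    "<": operator.lt,
    "==": operator.eq,
}


def parse_amount(text: str) -> float:
    """Turn a string such as '$3,000' into 3000.0 (assumed numeric comparison)."""
    return float(text.replace("$", "").replace(",", "").strip())


@dataclass
class FilterRule:
    """Hypothetical stand-in for one row of the Filter area."""
    input_path: str  # for example ".balance"
    op: str          # for example ">="
    value: str       # for example "$3,000"

    def matches(self, record: Dict[str, Any]) -> bool:
        field = self.input_path.lstrip(".")
        compare = OPERATORS[self.op]
        return compare(parse_amount(str(record[field])), parse_amount(self.value))


rule = FilterRule(input_path=".balance", op=">=", value="$3,000")

# Made-up records, not taken from the test dataset used in this scenario.
print(rule.matches({"name": "Example User", "balance": "$3,500.00"}))   # True
print(rule.matches({"name": "Another User", "balance": "$2,999.99"}))  # False
```

Changing only the value of the rule, which is what the context variable does at execution time, changes which records pass without touching the rest of the configuration.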

Results

Your pipeline is being executed, and the data is filtered according to the context variable you assigned to the filter value. In the pipeline execution logs, you can see:
  • the context variable value used at execution time
    In the Logs panel, the information related to the context variables used at runtime is highlighted.
  • the number of produced records: in this case, 7 records meet the criteria, which means that all 7 user records have an account balance greater than or equal to 1,000 dollars
    The Logs panel indicates that 7 records have been read, and 7 records have been produced during the pipeline execution.
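
The difference between the two executions (4 produced records with the default value, 7 with the context variable) can be reproduced with a short Python sketch. It is a simplified model under stated assumptions, not the product's execution engine: resolve_threshold and run_filter are hypothetical helpers, and the balances are invented values chosen only so that the counts match this scenario.

```python
from typing import Dict, List, Optional, Tuple

DEFAULT_THRESHOLD = 3000.0  # the value typed into the Filter processor


def resolve_threshold(context: Optional[Dict[str, float]] = None) -> float:
    """Return the context variable value if one is assigned, else the default."""
    if context and "balance_amount" in context:
        return context["balance_amount"]
    return DEFAULT_THRESHOLD


def run_filter(balances: List[float], threshold: float) -> Tuple[int, int]:
    """Return (records read, records produced) for a '>=' filter."""
    produced = [b for b in balances if b >= threshold]
    return len(balances), len(produced)


# Hypothetical balances, chosen only so the counts match the scenario:
# 7 records, 4 of them at or above 3,000, all of them at or above 1,000.
balances = [1250.0, 3100.5, 2890.0, 4500.0, 1000.0, 3000.0, 5120.75]

# Without a context variable, the default value applies.
print(run_filter(balances, resolve_threshold()))  # (7, 4)

# With balance_amount assigned, its value replaces the default at execution time.
print(run_filter(balances, resolve_threshold({"balance_amount": 1000.0})))  # (7, 7)
```

The number of records read stays the same in both runs; only the number of produced records changes, because the threshold is the only setting that the context variable replaces.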