Filtering a list of customers based on their registration date and revenue - Cloud

Talend Cloud Pipeline Designer Processors Guide

author: Talend Documentation Team
EnrichVersion: Cloud
EnrichProdName: Talend Cloud
task: Design and Development > Designing Pipelines
EnrichPlatform: Talend Pipeline Designer

Before you begin

  • You have previously created a Connection to the system storing your source data.

    Here, a connection to a database.

  • You have previously added the Dataset holding your source data.

    Here, a list of customers that includes a registration date field. The sample file is attached to this document: download filter-python-customers.json from the Downloads tab in the left panel of this page.

  • You have also created the Connection and the related Dataset that will hold the processed data.

    Here, the files are stored on HDFS.

Procedure

  1. Click ADD PIPELINE on the PIPELINES page. Your new Pipeline opens.
  2. Give the Pipeline a meaningful name.
    Filter on Registration and Revenue
  3. Click ADD SOURCE to open the panel where you select your source data, here a list of customers stored in a database.
  4. Select your Dataset and click SELECT DATASET in order to add it to the Pipeline.
    Rename it if needed.
  5. Add a Filter processor to the Pipeline. The Configuration panel opens.
  6. Give a meaningful name to the processor.
    customers registered in 2000
  7. In the Filter area:
    1. Select .RegistrationDate in the Field path list, as you want to filter customers based on this value.
    2. Select NONE in the Apply a function first list, as you do not want to apply a function while filtering records.
    3. Select CONTAINS in the Operator list and type 2000 in the Value field, as you want to keep the customers whose registration date contains the year 2000.

      You can use the avpath syntax in this area.

  8. Click SAVE to save your configuration.
  9. Add another Filter processor to the Pipeline. The Configuration panel opens.
  10. Give a meaningful name to the processor.
    customers with revenue > 90000
  11. In the Filter area:
    1. Select .Revenue in the Field path list, as you want to filter customers based on this value.
    2. Select NONE in the Apply a function first list, as you do not want to apply a function while filtering records.
    3. Select > in the Operator list and type 90000 in the Value field, as you want to keep the customers with a revenue greater than 90000.
  12. Click SAVE to save your configuration.
  13. Click the button next to the first Filter processor to add a Destination, then select the Dataset that will hold your rejected data.
  14. Give a meaningful name to the Destination.
    other registration date
  15. Click the ADD DESTINATION item after the second Filter processor and select the Dataset that will hold your filtered data, here files stored on HDFS.
    Rename it if needed.
  16. Click the button next to the second Filter processor to add a Destination, then select the Dataset that will hold your rejected data.
  17. Give a meaningful name to the Destination.
    other customers
  18. (Optional) Click the top preview icon after the last Filter processor to preview your data after the filtering operation.
  19. On the top toolbar of Talend Cloud Pipeline Designer, select your Run Profile in the list (for more information, see Execution profiles).
  20. Click the run icon to run your Pipeline.

Results

Your Pipeline is being executed. The data is filtered according to the conditions you have defined, and the output is sent to the target systems you have selected.
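
The logic of the two chained Filter processors and their reject outputs can be sketched in plain Python. This is only an illustration of the conditions configured in the procedure, not how Talend Cloud Pipeline Designer runs the Pipeline. The sample records, the Name field, the date format, and the split helper are assumptions made for the example; only the RegistrationDate and Revenue field names and the two conditions come from the procedure.

```python
# Hypothetical sample records. The real data comes from
# filter-python-customers.json, whose exact layout is not shown here.
customers = [
    {"Name": "A", "RegistrationDate": "2000-03-14", "Revenue": 120000},
    {"Name": "B", "RegistrationDate": "2000-11-02", "Revenue": 45000},
    {"Name": "C", "RegistrationDate": "1998-07-21", "Revenue": 300000},
]


def split(records, predicate):
    """Return (kept, rejected), mirroring a filter's main and reject outputs.

    Hypothetical helper written for this sketch only.
    """
    kept, rejected = [], []
    for record in records:
        (kept if predicate(record) else rejected).append(record)
    return kept, rejected


# First filter: RegistrationDate CONTAINS "2000" (substring match).
registered_2000, other_registration_date = split(
    customers, lambda r: "2000" in str(r["RegistrationDate"])
)

# Second filter: Revenue > 90000, applied to the output of the first filter.
filtered, other_customers = split(
    registered_2000, lambda r: r["Revenue"] > 90000
)

print("filtered:", [r["Name"] for r in filtered])
print("other registration date:", [r["Name"] for r in other_registration_date])
print("other customers:", [r["Name"] for r in other_customers])
```

In this sketch, the filtered list corresponds to the main Destination, while other_registration_date and other_customers correspond to the two Destinations that receive rejected records. Because the second filter only sees records that passed the first one, a customer registered in another year never reaches the revenue condition.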