Before you begin
-
You have created a connection to the system that stores
your source data.
In this example, a Test connection is used.
-
You have added the dataset that holds your source
data.
Download and extract the file: numbers-airlines.zip. It contains a dataset
about airlines, including the number of incidents, accidents, and
fatalities.
-
You have also created the connection and the related dataset
that will hold the processed data.
In this example, a Test dataset is used.
Procedure
-
Click Add
pipeline on the Pipelines page. Your new pipeline opens.
-
Give the pipeline a meaningful name.
Example
Compare number of air crashes
and filter airlines
-
Click ADD SOURCE to open
the panel where you select your source data, in this example data about
airlines and air crashes.
-
Select your dataset and click
Select in order to add it to the pipeline.
Rename it if needed.
-
Add a Number processor to the pipeline. The
configuration panel opens.
-
Give a meaningful name to the processor.
Example
compare number of fatal
accidents
-
In the Configuration
area:
-
Select Compare
numbers in the Function
name list.
-
Select .fatal_accidents_85_99
in the Fields to process list, as
you want to compare the value of this field (the number of fatal
accidents) with a custom value.
-
Enable the Create new column option and name it
comparison.
-
Select greater or equals than in the
Compare mode list, select
Value in the Use with
list, and enter 2 in the
Value field. This compares the number of fatal
accidents of each airline with 2 and shows which airlines
have had at least two fatal accidents.
-
Click Save to
save your configuration.
You can preview your data before and after the comparison.
The values are compared, and the new comparison field shows which
airlines have had at least two fatal accidents (true)
and which have had fewer than two fatal accidents
(false).
-
Add a Filter processor to the pipeline. The
configuration panel opens.
-
Give a meaningful name to the processor.
Example
airlines with at least 2 fatal
accidents
-
In the Filters area:
-
Select .comparison in the Input
list, as you want to filter airlines based on this value.
-
Select None in the Optionally select a
function to apply list, as you do not want to apply a function
while filtering records.
-
Select == in the Operator list
and type true in the Value
list, as you want to keep only the airlines that meet the requirement of
"2 or more fatal accidents".
-
Click Save to
save your configuration.
Look at the preview of the processor to compare your data
before and after the operation.
-
Click ADD DESTINATION and select the dataset that will hold
your processed data.
Rename it if needed.
-
On the top toolbar of Talend Cloud Pipeline Designer,
click the Run button to open the panel allowing you to select
your run profile.
-
Select your run profile in the list (for more information, see Run profiles), then click Run to
run your pipeline.
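The Number processor configuration used in this procedure (Compare numbers, greater or equals than, Value set to 2, result written to a new comparison column) can be illustrated outside the product with a minimal Python sketch. This is not how Talend Cloud Pipeline Designer implements the processor. The function name compare_numbers and the sample rows are hypothetical, and only the field name fatal_accidents_85_99 and the threshold come from the steps above.

```python
# Hypothetical sample rows; the real values come from numbers-airlines.zip.
SAMPLE = [
    {"airline": "Airline A", "fatal_accidents_85_99": 0},
    {"airline": "Airline B", "fatal_accidents_85_99": 2},
    {"airline": "Airline C", "fatal_accidents_85_99": 5},
]


def compare_numbers(records, field, value, new_column="comparison"):
    """Return copies of the records with a boolean column: record[field] >= value."""
    return [{**record, new_column: record[field] >= value} for record in records]


# Same settings as in the procedure: field fatal_accidents_85_99, value 2.
compared = compare_numbers(SAMPLE, "fatal_accidents_85_99", 2)
for row in compared:
    print(row["airline"], row["comparison"])
```

The sketch returns new records instead of modifying the input, which mirrors the preview in the configuration panel where the data before and after the comparison can be viewed side by side.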
Results
Your pipeline is being executed. The data is compared and filtered according
to the conditions you have defined, and you can see that 19 airlines in this
dataset have had at least two fatal accidents. The output is sent to the target
system you have specified.
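The Filter processor settings from the procedure (Input .comparison, no function, Operator ==, Value true) amount to keeping the records whose comparison flag is true and counting them. The following Python sketch shows that logic under stated assumptions: the function name filter_records and the sample rows are hypothetical, so the count it prints is for the sample only and is not the figure of 19 reported for the real dataset.

```python
import operator

# Only the "==" operator from the procedure is modelled in this sketch.
OPERATORS = {"==": operator.eq}

# Hypothetical records as they might look after the comparison step.
COMPARED = [
    {"airline": "Airline A", "comparison": False},
    {"airline": "Airline B", "comparison": True},
    {"airline": "Airline C", "comparison": True},
]


def filter_records(records, field, op, value):
    """Keep the records for which `record[field] <op> value` holds."""
    matches = OPERATORS[op]
    return [record for record in records if matches(record[field], value)]


# Same settings as in the procedure: comparison == true.
kept = filter_records(COMPARED, "comparison", "==", True)
print(len(kept), "airlines with at least two fatal accidents in the sample")
```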