Before you begin
-
You have previously added the dataset holding your source
data.
Download and extract the file: split-leads.zip. It contains a dataset with a
list of customer leads including first names, last names, emails, addresses,
etc.
-
You have also created the connection and the related dataset
that will hold the processed data.
In this example, it is a file stored in a Test Connection.
Procedure
-
Click Add pipeline on the Pipelines page. Your new pipeline opens.
-
Click ADD SOURCE to open
the panel allowing you to select your source data, here a list of customer leads
entered manually as a test dataset.
-
Select your dataset and click
Select in order to add it to the pipeline.
Rename it if needed.
-
Click the + icon and add a Field Selector processor to the pipeline.
The configuration panel opens.
-
Give a meaningful name to the processor.
Example
select main info
-
In the Selectors area:
-
Select .first_name in the Input
list and enter firstname in the
Output list, as you want to select and rename the
first_name field.
-
Click the + sign to add a new element and select
.last_name in the Input list
and enter lastname in the Output
list, as you want to select and rename the
last_name field.
-
Click the + sign to add a new element and select
.email in the Input list and
enter email in the Output list,
as you want to select the email field.
-
Click Save to
save your configuration.
(Optional) Look at the preview of the processor to compare your data before and
after the restructuring operation.
-
Click the + icon and add a Split processor to the pipeline. The
configuration panel opens.
-
Give a meaningful name to the processor.
Example
split emails
-
Configure the processor:
-
Select Extract email parts in
the Function name list, as you want to split the
customers' emails into their local and domain parts.
-
Select .email in the Fields to process field.
-
Click Save to save your configuration.
-
Click the + icon and add another Split processor to the pipeline. The
configuration panel opens.
-
Give a meaningful name to the processor.
Example
validate companies
-
Configure the processor:
-
Select Extract values by
semantic type in the Function name list, as you want to validate the domain
part of the customers' emails against the Company semantic type.
-
Select .email_domain in the Fields
to process field.
-
Select Company in the Semantic type list.
-
Click Save to save your configuration.
-
(Optional) Look at the preview of the Split processor to see your data after the
extract operation.
-
Click ADD DESTINATION
and select the dataset that will hold your reorganized data.
Rename it if needed.
-
On the top toolbar of Talend Cloud Pipeline Designer,
click the Run button to open the panel allowing you to select
your run profile.
-
Select your run profile in the list (for more information, see Run profiles), then click Run to
run your pipeline.
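To make the effect of the first two processors more concrete, here is a minimal Python sketch of what the Field Selector and the Extract email parts function do to a single record. It is an illustration only, not how Talend Cloud Pipeline Designer implements these processors. The function names, the sample record, and the email_local output field name are assumptions made for this example; only email_domain appears in the procedure above.

```python
def select_fields(record, selectors):
    """Keep only the selected input fields, renamed to their output names."""
    return {output: record[source] for source, output in selectors.items()}


def split_email(record, field="email"):
    """Add <field>_local and <field>_domain, split on the last '@'."""
    value = record[field]
    local, sep, domain = value.rpartition("@")
    if not sep:
        # No '@' found: treat the whole value as the local part.
        local, domain = value, ""
    result = dict(record)
    result[f"{field}_local"] = local    # assumed output field name
    result[f"{field}_domain"] = domain  # field name used later in the procedure
    return result


# Input -> Output mapping from the Selectors area.
SELECTORS = {
    "first_name": "firstname",
    "last_name": "lastname",
    "email": "email",
}

# Made-up sample lead, not taken from split-leads.zip.
lead = {
    "first_name": "Jane",
    "last_name": "Doe",
    "email": "jane.doe@example.com",
    "address": "1 Main St",
}

selected = select_fields(lead, SELECTORS)
processed = split_email(selected)
print(processed)
```

The address field is dropped because it is not listed in the selectors, which mirrors the restructuring you can observe in the processor preview.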
Results
Your pipeline is executed: the leads data is processed, the customer
companies are validated against the Company semantic type, and the output
flow is sent to the target system you have indicated.
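The validation against a semantic type can be pictured as a dictionary lookup. The Python sketch below is a rough approximation under that assumption: KNOWN_COMPANIES is a hypothetical stand-in for the Company semantic type, and extract_company is an illustrative helper, not a product function. The actual matching rules of the Extract values by semantic type function may differ.

```python
# Hypothetical stand-in for the Company semantic type dictionary.
KNOWN_COMPANIES = {"acme", "globex"}


def extract_company(domain, known=KNOWN_COMPANIES):
    """Return the first domain label that matches a known company, else None."""
    for label in domain.lower().split("."):
        if label in known:
            return label
    return None


print(extract_company("mail.acme.com"))
print(extract_company("example.org"))
```

A domain that contains no known company name yields no value, which corresponds to a record that does not pass the company validation.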