How to create a pipeline from scratch.
Procedure
- On the Home page, click Pipelines > Add pipeline.
- On the top toolbar, enter a name for your pipeline.
- To add a source, click the ADD SOURCE placeholder on the canvas.
- Select the dataset you want to use in your pipeline:
- If you have already created a dataset, select it from the list in the [Select a source] panel and click Select.
- If not, add a new dataset by clicking Add dataset as described in Creating a dataset from scratch.
- Click the icon to select one or more processing components according to your needs: filtering, cleansing, aggregating, etc. From the [Add a processor] panel, you can either select a processor in the main list or enter its name or description in the text box.
- To add a destination, click the ADD DESTINATION placeholder on the canvas. A destination is a target component that consumes your data and sends it to the system of your choice.
- Select the destination dataset:
- If you have already created a dataset, select it from the list in the [Select a destination] panel and click Select.
- If not, add a new dataset by clicking Add dataset as described in Creating a dataset from scratch.
Example of a pipeline that consumes data from an S3 input, processes and filters it, and sends the selected data to another S3 destination.
Note that, before executing your pipeline, you can preview your data at each step of the design process.
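The procedure above is carried out entirely in the user interface, but the resulting structure (one source, an ordered list of processors, one destination, with a preview available at each step) can be pictured with a short conceptual sketch. Everything in it is hypothetical and for illustration only: the `Pipeline` class, the `preview` and `run` methods, the two processor functions, and the sample records are assumptions, not part of the product or of any API it exposes.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical types: a record is a plain dict, a processor maps records to records.
Record = Dict[str, object]
Processor = Callable[[List[Record]], List[Record]]


@dataclass
class Pipeline:
    """Conceptual model only: source -> processors -> destination."""
    name: str
    source: List[Record]
    processors: List[Processor] = field(default_factory=list)
    destination: List[Record] = field(default_factory=list)

    def preview(self, step: int) -> List[Record]:
        """Return the records as they look after the first `step` processors."""
        records = list(self.source)
        for processor in self.processors[:step]:
            records = processor(records)
        return records

    def run(self) -> List[Record]:
        """Apply every processor and append the result to the destination."""
        self.destination.extend(self.preview(len(self.processors)))
        return self.destination


def normalize_names(records: List[Record]) -> List[Record]:
    # Cleansing step: trim whitespace and fix capitalization.
    return [{**r, "name": str(r["name"]).strip().title()} for r in records]


def keep_active(records: List[Record]) -> List[Record]:
    # Filtering step: keep only records flagged as active.
    return [r for r in records if r["active"]]


# Made-up sample data standing in for a source dataset.
source_dataset = [
    {"name": " alice ", "active": True},
    {"name": "BOB", "active": False},
    {"name": "carol", "active": True},
]

pipeline = Pipeline(
    name="demo",
    source=source_dataset,
    processors=[normalize_names, keep_active],
)

print(pipeline.preview(0))  # data as read from the source
print(pipeline.preview(1))  # data after the cleansing step
print(pipeline.run())       # data written to the destination
```

In this sketch, calling `preview` with a step number plays the role of the per-step data preview: it shows the records at that point without writing anything to the destination.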