Talend Data Preparation concepts - 2.3

Talend Data Preparation User Guide

author: Talend Documentation Team
EnrichVersion: 6.5, 2.3
EnrichProdName: Talend Big Data, Talend Big Data Platform, Talend Data Fabric, Talend Data Integration, Talend Data Management Platform, Talend Data Services Platform, Talend ESB, Talend MDM Platform, Talend Real-Time Big Data Platform
task: Data Quality and Preparation > Cleansing data
EnrichPlatform: Talend Data Preparation
These definitions will help you understand the main concepts in Talend Data Preparation.
Dataset
A dataset holds the raw data that serves as the starting material for one or more preparations. It is presented as a table to which you can apply recipe steps without affecting the original data. A dataset can be reused across preparations.
Function
A function is an action applied to a row or a column in a dataset, such as removing empty rows. Because functions are applied as part of a preparation, they do not modify the original data. Applied functions are recorded, in sequence, in recipes.
Preparation
A preparation links a dataset and a recipe together: it takes one dataset and applies a recipe to produce the final outcome that you want to achieve with your data. You can export this outcome as a file or connect it to data targets. The original dataset is never modified.
Recipe
A recipe is literally defined as "a set of directions with a list of ingredients for making or preparing something". In Talend Data Preparation, the ingredients are the raw data, called datasets, and the directions are the set of functions applied to the dataset. Visually, the recipe is the top-down sequence of functions in the left collapsible panel. A recipe is linked to the dataset through a preparation. Every update to the recipe is automatically saved in the preparation.
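The relationship between the four concepts can be sketched in a few lines of Python. This is a conceptual model only, not the Talend Data Preparation API: the class names Dataset and Preparation, the methods add_step and run, and the functions remove_empty_rows and uppercase_column are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical model for illustration; none of these names come from Talend.
Row = Dict[str, str]
Function = Callable[[List[Row]], List[Row]]


def remove_empty_rows(rows: List[Row]) -> List[Row]:
    """A function acting on rows: drop rows whose values are all blank."""
    return [row for row in rows if any(value.strip() for value in row.values())]


def uppercase_column(column: str) -> Function:
    """Build a function acting on one column: convert its values to upper case."""
    def apply(rows: List[Row]) -> List[Row]:
        return [{**row, column: row[column].upper()} for row in rows]
    return apply


@dataclass(frozen=True)
class Dataset:
    """The raw data. It can be shared by several preparations."""
    name: str
    rows: Tuple[Row, ...]


@dataclass
class Preparation:
    """Links one dataset to a recipe, the ordered list of applied functions."""
    dataset: Dataset
    recipe: List[Function] = field(default_factory=list)

    def add_step(self, function: Function) -> None:
        # Each applied function is recorded, in sequence, in the recipe.
        self.recipe.append(function)

    def run(self) -> List[Row]:
        # Work on a copy so the original dataset is never modified.
        rows = [dict(row) for row in self.dataset.rows]
        for step in self.recipe:
            rows = step(rows)
        return rows


customers = Dataset(
    name="customers",
    rows=(
        {"name": "ada", "city": "london"},
        {"name": "", "city": ""},
        {"name": "grace", "city": "new york"},
    ),
)

preparation = Preparation(customers)
preparation.add_step(remove_empty_rows)
preparation.add_step(uppercase_column("name"))
result = preparation.run()
```

In this sketch, result holds the two non-empty rows with upper-case names, while customers.rows still contains all three original rows. A second Preparation could be built on the same customers dataset with a different recipe, which mirrors how a dataset can be reused across preparations.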