Dataset

  • In Talend Data Inventory: A dataset is a collection of data. For example, it can be a database table, a file name, a Kafka topic, or an HDFS file path. You can also create test datasets that you enter manually and store in a test connection, or import local files as datasets. Several datasets can be connected to the same system (one-to-many connectivity), and they are stored in reusable connections.
  • In Talend Data Preparation and Talend Cloud Pipeline Designer: A dataset holds the raw data that serves as the starting material for preparations. It is presented as a table to which you can apply recipe steps without affecting the original data. The same dataset can be reused across several preparations.
