Creating a dataset - Cloud

Talend Cloud Pipeline Designer User Guide

How to create a dataset from scratch.

Procedure

  1. Go to Datasets > ADD DATASET.
  2. In the Add a new dataset panel, name your dataset and select the connection in which you want to create it.
    If the connection you need does not exist yet, you can create it directly from the connection drop-down list.
  3. Add a description if needed, and fill in the required properties of the dataset.
    • For S3 and HDFS file storage connections, an AUTO DETECT button allows you to automatically detect and fill in the format of your data (CSV, Excel, Avro or Parquet).

    • The query and table database types are not compatible, because a query type database dataset cannot be used as a Destination dataset. If you try to change the database configuration to another type after saving it, a check is triggered on your pipeline to verify whether this operation is possible.

  4. (Optional) Click VIEW SAMPLE to see a preview of the first 50 records of your dataset sample.
  5. Click VALIDATE to save your dataset.
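
This guide does not describe how the AUTO DETECT button in step 3 identifies a format. As a rough illustration only, format detection is commonly done by inspecting the first bytes of a file. The sketch below assumes that approach; the function name guess_format is hypothetical and is not part of Talend Cloud.

```python
# Illustrative sketch only: this is NOT Talend's implementation.
def guess_format(header: bytes) -> str:
    """Guess a file format from its leading bytes (hypothetical helper)."""
    if header.startswith(b"PAR1"):
        return "Parquet"  # Parquet files begin with the magic bytes PAR1
    if header.startswith(b"Obj\x01"):
        return "Avro"     # Avro object container files begin with Obj followed by 0x01
    if header.startswith(b"PK\x03\x04"):
        return "Excel"    # .xlsx is a ZIP archive (assumption: any ZIP is treated as Excel)
    if header.startswith(b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"):
        return "Excel"    # legacy .xls uses the OLE2 compound-file signature
    return "CSV"          # assumption: anything else is treated as delimited text


print(guess_format(b"PAR1\x15\x04"))
print(guess_format(b"id,name\n1,Ada\n"))
```

A real detector would also need to confirm the guess, for example by parsing a few records, because a ZIP signature alone does not prove that a file is an Excel workbook.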

Results

The new dataset is added to the list on the Datasets page and is ready to be used.
Once the dataset is created, you can open its detailed view to display a sample of your data in different formats:
  • Grid: from this view you can display the first 10 000 records of your data in tabular form
  • Hierarchy: from this view you can display the first 10 000 records of your data in a tree-like structure
  • Raw: from this view you can display an untouched and unfiltered version of the first 10 000 records of your data
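
To make the record limit concrete, here is a minimal sketch of capping a sample at a fixed number of records and rendering one record as a tree-like structure. It only illustrates the idea and does not reflect how Talend Cloud builds its views. The names take_sample and as_hierarchy and the sample data are hypothetical.

```python
import json
from itertools import islice

SAMPLE_LIMIT = 10_000  # record cap stated for the detailed view


def take_sample(records, limit=SAMPLE_LIMIT):
    """Return at most `limit` records from any iterable, reading no further."""
    return list(islice(records, limit))


def as_hierarchy(record):
    """Render one record as indented JSON, a rough stand-in for a tree view."""
    return json.dumps(record, indent=2)


# Hypothetical source with more records than the cap.
rows = ({"id": i, "address": {"city": "Paris"}} for i in range(25_000))
sample = take_sample(rows)

print(len(sample))              # number of records kept
print(as_hierarchy(sample[0]))  # nested fields shown on indented lines
```

Because islice stops reading once the limit is reached, the remaining records are never loaded, which is the usual reason a preview stays fast on large datasets.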