Creating a test dataset - Cloud

Talend Cloud Pipeline Designer User Guide

Version: Cloud
Language: English
Product: Talend Cloud
Module: Talend Pipeline Designer
Last publication date: 2024-02-09

How to create a dataset from a schema and values that you enter manually.

Test datasets supply a fixed set of values without requiring a real-life record store, which makes them a simple way to try out the product.

Procedure

  1. Go to Datasets > Add dataset.
  2. In the Add a new dataset panel, enter a name for your Test dataset.
  3. Select the Test connection that you previously created and to which you want to add your data.
  4. Select the format of your data:
    • CSV: the schema field names must follow these rules:
      • they must begin with a character in [A-Za-z_]
      • they can only contain characters in [A-Za-z0-9_]
      • they must be separated by semicolons
      Example: First_Name;Last_Name;Phone1;Phone2;Address;State;Company
      Note: If you do not specify a format, a generic one is created by default.
    • JSON: your values must follow a specific format, applied consistently: a sequence of records entered one after another, optionally separated by a line feed. A record does not need to fit on a single line. As a result, the content of the text area is not a typical JSON document: the records are not wrapped in square brackets.

      Example:

        {
          "Id": 3146717,
          "PosTime": 1525097499899,
          "Latitude": 48.8585,
          "Longitude": 2.4921,
          "Operator": "Air France"
        }
        {
          "Id": 3757865,
          "PosTime": 1525097474634,
          "Latitude": 48.5018,
          "Longitude": 2.2246,
          "Operator": "Lufthansa"
        }
    • AVRO: you must also enter the schema of your Avro records, which is a JSON document with a specific syntax described in the Apache Avro documentation.
  5. In the Values area, type in or paste your data.
    The size of your data cannot exceed 32 kilobytes.
    New dataset configuration page with manually-entered JSON values.
  6. (Optional) Click View sample to check that your data is valid.
  7. Click Validate to save your dataset.
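
The CSV schema rules in step 4 can be checked locally before you paste a schema line into the panel. The following is a minimal sketch, not part of the product: the function name invalid_schema_fields is hypothetical, and the check assumes that the rules apply to each semicolon-separated field name exactly as listed above.

```python
import re

# Rules from step 4: a field name begins with [A-Za-z_] and then
# contains only [A-Za-z0-9_]; field names are separated by semicolons.
FIELD_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def invalid_schema_fields(schema_line: str) -> list[str]:
    """Return the field names in a schema line that break the naming rules."""
    fields = schema_line.strip().split(";")
    # An empty name (for example from ";;") also fails the pattern.
    return [name for name in fields if not FIELD_NAME.fullmatch(name)]


example = "First_Name;Last_Name;Phone1;Phone2;Address;State;Company"
print(invalid_schema_fields(example))                       # prints []
print(invalid_schema_fields("First Name;2nd_Phone;State"))  # prints ['First Name', '2nd_Phone']
```

An empty result means that every field name matches the documented pattern. It does not guarantee that the product accepts the schema.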

Results

You are redirected to the dataset Overview panel, which displays information and metadata about the dataset.

To visualize and understand the content of the dataset, open the Sample panel. You can then check that your data is valid.
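
JSON values can also be checked before they are pasted into the Values area. The sketch below parses records written one after another, as the JSON format in the procedure requires, and rejects oversized input. It rests on assumptions that the guide does not confirm: the 32 kilobyte limit is read as 32 * 1024 bytes of UTF-8 text, each record is expected to be a JSON object, and the name parse_records is hypothetical.

```python
import json

MAX_BYTES = 32 * 1024  # assumption: "32 kilobytes" means 32 * 1024 UTF-8 bytes


def parse_records(text: str) -> list[dict]:
    """Parse JSON objects written one after another, with optional whitespace between them."""
    if len(text.encode("utf-8")) > MAX_BYTES:
        raise ValueError("data exceeds the assumed 32 KB limit")
    decoder = json.JSONDecoder()
    records = []
    pos = 0
    while pos < len(text):
        # Skip whitespace, including line feeds, between records.
        if text[pos].isspace():
            pos += 1
            continue
        # raw_decode returns the parsed value and the index where it ended.
        record, pos = decoder.raw_decode(text, pos)
        if not isinstance(record, dict):
            # Assumption: a top-level array or scalar is not a valid record.
            raise ValueError("each record must be a JSON object")
        records.append(record)
    return records


sample = """
{ "Id": 3146717, "PosTime": 1525097499899, "Latitude": 48.8585,
  "Longitude": 2.4921, "Operator": "Air France" }
{ "Id": 3757865, "PosTime": 1525097474634, "Latitude": 48.5018,
  "Longitude": 2.2246, "Operator": "Lufthansa" }
"""
print(len(parse_records(sample)))  # prints 2
```

Malformed input raises json.JSONDecodeError, which is a subclass of ValueError, so a single except ValueError clause covers both failure cases.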

Dataset sample panel: table view of the dataset JSON values.
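
For the AVRO format mentioned in the procedure, the schema is a JSON document. As an illustration only, the sketch below builds a possible Avro record schema for the flight records used in the JSON example and prints it as JSON. The record name FlightPosition and the chosen field types are assumptions, not values taken from the product.

```python
import json

# Hypothetical Avro schema for the records in the JSON example.
# "long" is chosen because PosTime exceeds the 32-bit integer range;
# the record name and all type choices are assumptions.
flight_schema = {
    "type": "record",
    "name": "FlightPosition",
    "fields": [
        {"name": "Id", "type": "long"},
        {"name": "PosTime", "type": "long"},
        {"name": "Latitude", "type": "double"},
        {"name": "Longitude", "type": "double"},
        {"name": "Operator", "type": "string"},
    ],
}

# Serialize the schema to the JSON text that would be entered as the schema.
print(json.dumps(flight_schema, indent=2))
```

The field names in this sketch happen to satisfy the same [A-Za-z_][A-Za-z0-9_]* pattern that the CSV schema rules describe.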