Creating the dataset in Talend Data Preparation - 7.1

Data Preparation

author
Talend Documentation Team
EnrichVersion
Cloud
7.1
EnrichProdName
Talend Big Data
Talend Big Data Platform
Talend Data Fabric
Talend Data Integration
Talend Data Management Platform
Talend Data Services Platform
Talend ESB
Talend MDM Platform
Talend Real-Time Big Data Platform
task
Data Governance > Third-party systems > Data Preparation components
Data Quality and Preparation > Third-party systems > Data Preparation components
Design and Development > Third-party systems > Data Preparation components
EnrichPlatform
Talend Data Preparation
Talend Studio

Procedure

  1. In the design workspace, select tDatasetOutput and click the Component tab to define its basic settings.
  2. Click the Sync columns button to retrieve the schema from the previous component, or configure the schema manually by selecting Built-in from the Schema list and clicking the [...] button next to Edit schema.
  3. In the URL field, type the URL of the Talend Data Preparation or Talend Cloud Data Preparation web application, between double quotes. Port 9999 is the default port for Talend Data Preparation.
  4. In the Email field, type the email address that you use to log in to the Talend Data Preparation or Talend Cloud Data Preparation web application, between double quotes.
  5. In the Password field, type your password for the Talend Data Preparation or Talend Cloud Data Preparation web application, between double quotes.
    If you are working with Talend Cloud Data Preparation and if:
    • MFA (Multi-Factor Authentication) is enabled, enter an access token in the field.
    • MFA is not enabled but SSO (Single Sign-On) is configured, enter either an access token or your password in the field.

      Using a token is recommended, because passwords will soon become obsolete and will no longer be accepted.

    • MFA is not enabled and SSO is not configured, enter either an access token or your password in the field.

    The user to whom these credentials belong becomes the owner of the newly created dataset. Only that user can then share the dataset with other users.

  6. Select Create from the Mode drop-down list.

    Setting the mode to Update allows you to use the input to update the dataset defined in the Dataset Name field.

  7. In the Dataset Name field, enter a name for your dataset between double quotes, "tDatasetOutput_test" in this example.
  8. In the Limit field, enter a number that is at least equal to the number of rows of your input file. In this example, the limit is 500 because the input table has 500 rows.
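
The values entered in the steps above can be summarized in a small, standalone Java sketch. It does not call any Talend API: the class and method names are hypothetical, and the host in the URL, the email address and the token are placeholders assumed for illustration. Only port 9999, the Create mode, the dataset name tDatasetOutput_test and the limit of 500 come from this procedure. The sketch shows how text values look once wrapped in double quotes, and checks that the limit covers the number of input rows.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: a plain-Java summary of the tDatasetOutput settings,
// not part of Talend Studio or of any Talend library.
public class DatasetOutputSettingsSketch {

    // Wraps a value in double quotes, as text fields in the procedure require.
    public static String asFieldValue(String value) {
        return "\"" + value + "\"";
    }

    // Step 8: the limit must be at least equal to the number of input rows.
    public static boolean limitCoversInput(int limit, int inputRows) {
        return limit >= inputRows;
    }

    // Collects the example values in the order of the procedure.
    // The host, email address and token below are assumed placeholders.
    public static Map<String, String> exampleSettings() {
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("URL", asFieldValue("http://localhost:9999"));
        settings.put("Email", asFieldValue("user@example.com"));
        settings.put("Password", asFieldValue("<password-or-access-token>"));
        settings.put("Mode", "Create");
        settings.put("Dataset Name", asFieldValue("tDatasetOutput_test"));
        settings.put("Limit", "500");
        return settings;
    }

    public static void main(String[] args) {
        exampleSettings().forEach(
            (field, value) -> System.out.println(field + ": " + value));
        System.out.println("Limit covers 500 input rows: "
            + limitCoversInput(500, 500));
    }
}
```

With an input of 500 rows, a limit of 499 would fail the check, which is why the procedure sets the limit to 500.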