Procedure
-
In the design workspace, select tDataprepRun
and click the Component tab to define its
basic settings.
-
In the URL field, type
the URL of the Talend Data Preparation or
Talend Cloud Data Preparation web application,
between double quotes. Port 9999 is the default port for Talend Data Preparation.
-
In the Username and
Password fields, enter your Talend Data Preparation or Talend Cloud Data Preparation connection information,
between double quotes.
If you are working with Talend Cloud Data Preparation:
- If SSO is enabled, enter an access token in the Password field.
- If SSO is not enabled, enter either an access token or your password in the Password field.
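As an illustration of the values entered in the two steps above, the following minimal sketch writes them as Java string literals, assuming (as is usual for Talend Studio component fields) that each field is evaluated as a Java string expression, which is why the double quotes are required. The host name, user name, password placeholder, class name, and the portOf helper are all hypothetical; only the default port 9999 comes from this procedure.

```java
import java.net.URI;

public class DataprepFieldValues {
    // Hypothetical host; 9999 is the default Talend Data Preparation port.
    static final String URL = "http://localhost:9999";

    // Placeholder credentials, not real accounts. With Talend Cloud Data
    // Preparation, the second value may be an access token instead.
    static final String USERNAME = "user@example.com";
    static final String PASSWORD = "my-password-or-access-token";

    // Hypothetical helper: returns the port encoded in a URL value,
    // or -1 when the URL does not specify one.
    static int portOf(String url) {
        return URI.create(url).getPort();
    }

    public static void main(String[] args) {
        // Quick sanity check that the URL value carries the expected port.
        System.out.println(portOf(URL));
    }
}
```

In the component itself, only the quoted literals are typed into the fields; the surrounding class exists solely to make the sketch compilable.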
-
Click Choose an existing
preparation to display a list of the preparations available in
Talend Data Preparation or Talend Cloud Data Preparation, and select datapreprun_spark.
This scenario assumes that a preparation with a compatible schema has been created beforehand.
A warning is displayed next to preparations that contain incompatible actions, that is, actions that affect only a single row or cell.
-
Click Fetch Schema to retrieve the schema of
the preparation.
The output schema of the tDataprepRun component now reflects the changes made with each preparation step. For example, the schema takes into account columns that were added or removed. By default, the output schema uses the String type for all columns, so that formatting operations performed on dates or numeric values during the preparation are not overwritten.
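Because every column leaves the component as a String, a downstream step that needs typed values has to parse them explicitly. The following is a minimal sketch of such parsing in plain Java, not part of the component itself. The class name, method names, and the dd/MM/yyyy date pattern are assumptions for illustration; the pattern must match whatever date formatting the preparation actually applies.

```java
import java.math.BigDecimal;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

public class StringColumnParsing {
    // Assumed pattern: adjust it to the date format produced by the preparation.
    static final DateTimeFormatter DATE_FORMAT =
            DateTimeFormatter.ofPattern("dd/MM/yyyy");

    // Parses a date column value; returns null for null or blank input.
    static LocalDate parseDate(String value) {
        if (value == null || value.trim().isEmpty()) {
            return null;
        }
        return LocalDate.parse(value.trim(), DATE_FORMAT);
    }

    // Parses a numeric column value; returns null for null or blank input.
    // Assumes a plain decimal notation with "." as the decimal separator.
    static BigDecimal parseAmount(String value) {
        if (value == null || value.trim().isEmpty()) {
            return null;
        }
        return new BigDecimal(value.trim());
    }

    public static void main(String[] args) {
        System.out.println(parseDate("25/12/2023"));
        System.out.println(parseAmount("1200.50"));
    }
}
```

Values that do not match the assumed formats raise an exception (DateTimeParseException or NumberFormatException), so a real Job would typically route such rows to a reject flow rather than let them fail the run.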