Applies a preparation made using Talend Data Preparation in a standard Data Integration Job.
tDataprepRun fetches a preparation made using Talend Data Preparation and applies it to a set of data.
For more technologies supported by Talend, see Talend components.
Depending on the Talend product you are using, this component can be used in one, some or all of the following Job frameworks:
- Standard: see tDataprepRun Standard properties.
  The component in this framework is available in Talend Data Management Platform, Talend Big Data Platform, Talend Real Time Big Data Platform, Talend Data Services Platform, and in Talend Data Fabric.
  Note: For reference, tDataprepRun can process datasets of up to 10 million rows and 100 columns (7 GB) at a speed of around 200 rows per second (150 kB/s) for a 60-step preparation. These figures are indicative and may vary. For better performance, or for datasets beyond 10 million rows, consider using Spark Jobs.
- Spark Batch: see tDataprepRun properties for Apache Spark Batch.
  The component in this framework is available in all subscription-based Talend products with Big Data and Talend Data Fabric.
- Spark Streaming: see tDataprepRun properties for Apache Spark Streaming.
  The component in this framework is available in Talend Real Time Big Data Platform and Talend Data Fabric.
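
As a rough worked example of the indicative figures quoted for the Standard framework: at about 200 rows per second, 10 million rows take roughly 50,000 seconds, or close to 14 hours. The Java sketch below performs the same estimate. The class and method names are hypothetical and are not part of any Talend API, and the rate is only the indicative figure from the note, so actual run times may differ.

```java
import java.time.Duration;

// Hypothetical helper for illustration only; not part of Talend.
public class DataprepRunEstimate {

    // Indicative throughput quoted for a 60-step preparation (an assumption, not a guarantee).
    public static final double INDICATIVE_ROWS_PER_SECOND = 200.0;

    // Estimates the run time for a number of rows at a given rate, rounded up to whole seconds.
    public static Duration estimate(long rows, double rowsPerSecond) {
        if (rows < 0 || rowsPerSecond <= 0) {
            throw new IllegalArgumentException("rows must be >= 0 and rowsPerSecond must be > 0");
        }
        long seconds = (long) Math.ceil(rows / rowsPerSecond);
        return Duration.ofSeconds(seconds);
    }

    public static void main(String[] args) {
        // 10 million rows is the upper bound mentioned for the Standard framework.
        Duration d = estimate(10_000_000L, INDICATIVE_ROWS_PER_SECOND);
        System.out.printf("Estimated run time: %d h %d min%n", d.toHours(), d.toMinutes() % 60);
    }
}
```

An estimate of this size helps explain why Spark Jobs are suggested for larger datasets or when better performance is needed.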