- Click the Export button in the application header.
- If the result of your preparation is larger than your current sample size (10,000 rows by default), select an export option:
- If you select Current sample, only the sample you have been working on will be exported.
- If you select All data, all the preparation steps you have performed on your sample will be applied to the rest of the dataset as well.
- Choose whether to export your data to a local file or to a Hadoop cluster:
- If you export your data as a csv or xlsx local file, the export operation will be processed on the Talend Data Preparation server.
- If you export your data to the Hadoop cluster, the export operation will be processed directly on the cluster. Choose csv, avro, or parquet as the type of your output file. Enter the path to the location on the cluster where you want to save your file and, if you choose to authenticate via Kerberos, enter your principal and the path to your keytab file.
- Click Confirm.
When you export to a local file and chose to export only the Current sample, the download starts automatically. If you selected All data to export the entire dataset, the export process is launched in the background instead. You can check the status of the export and download your output file on the Export history page. For more information, see The export history page.
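The rules above can be summarized as a small decision function. This is only an illustrative sketch of the behavior described in this procedure, not part of any Talend API: the names `plan_export`, `ExportPlan`, and the string values for destinations and scopes are all hypothetical.

```python
from dataclasses import dataclass

# Output formats listed in this procedure for each destination.
ALLOWED_FORMATS = {
    "local": {"csv", "xlsx"},
    "hadoop": {"csv", "avro", "parquet"},
}
SCOPES = {"current_sample", "all_data"}


@dataclass(frozen=True)
class ExportPlan:
    processed_on: str  # where the export operation runs
    delivery: str      # how the output file reaches you


def plan_export(destination: str, scope: str, file_format: str) -> ExportPlan:
    """Hypothetical helper that mirrors the documented export behavior."""
    if destination not in ALLOWED_FORMATS:
        raise ValueError(f"unknown destination: {destination!r}")
    if scope not in SCOPES:
        raise ValueError(f"unknown scope: {scope!r}")
    if file_format not in ALLOWED_FORMATS[destination]:
        raise ValueError(f"{file_format!r} is not offered for {destination!r}")

    if destination == "hadoop":
        # Processed directly on the cluster, saved at the path you entered.
        return ExportPlan("Hadoop cluster", "file saved at the chosen cluster path")

    # Local file: always processed on the Talend Data Preparation server.
    if scope == "current_sample":
        return ExportPlan("Talend Data Preparation server",
                          "download starts automatically")
    return ExportPlan("Talend Data Preparation server",
                      "background export, download from the Export history page")
```

For example, `plan_export("local", "all_data", "csv")` describes a background export whose output is retrieved from the Export history page, while `plan_export("local", "all_data", "parquet")` raises a `ValueError` because parquet is only listed for Hadoop exports.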
The export process triggers a refresh of the data fetched from the database, guaranteeing that the data in the output is always up to date.
However, because of this refresh, a dataset that was originally smaller than 10,000 rows may now exceed this limit. In this case:
- If you export to a local file, only the sample is kept.
- If you export to a Hadoop cluster, the whole dataset is exported.
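As a rough illustration of the refresh rule, the sketch below computes how many rows end up in the output when a dataset that used to fit in the sample has grown. It is not Talend code: `rows_exported_after_refresh` is a hypothetical name, and it assumes that "only the sample is kept" means exactly `sample_size` rows are exported.

```python
DEFAULT_SAMPLE_SIZE = 10_000  # default sample size stated in this procedure


def rows_exported_after_refresh(refreshed_rows: int,
                                destination: str,
                                sample_size: int = DEFAULT_SAMPLE_SIZE) -> int:
    """Hypothetical model of the row count exported after the refresh.

    Applies to a dataset that was originally smaller than the sample size,
    so no Current sample / All data choice was offered.
    """
    if destination not in ("local", "hadoop"):
        raise ValueError(f"unknown destination: {destination!r}")
    if refreshed_rows < 0:
        raise ValueError("row count cannot be negative")

    if refreshed_rows <= sample_size:
        # Still within the limit: everything is exported.
        return refreshed_rows
    if destination == "local":
        # Assumption: the sample holds exactly sample_size rows.
        return sample_size
    # Hadoop cluster: the whole refreshed dataset is exported.
    return refreshed_rows
```

Under these assumptions, a dataset that has grown to 12,500 rows yields 10,000 rows for a local file export and 12,500 rows for a Hadoop export.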