Writing data to cloud file storage (S3)

Talend Cloud Pipeline Designer Getting Started Guide


Before you begin

  • Make sure your user or user group has the correct permissions to access the Amazon S3 resources.

    If you do not have these permissions, try one of the following options:
    1. (recommended) Ask the administrator who manages your Amazon account to grant your user the correct S3 permissions.
    2. If you are allowed to do so, implement the access policy yourself by following the Amazon documentation.
    3. (not recommended) Attach the AmazonS3FullAccess policy to your user or group through the IAM console. This allows you to read and write S3 resources without restricting access to a specific bucket, which is why Talend does not recommend this quick fix.
    Note: The default error displayed when trying to access S3 resources without sufficient permissions is Bad Gateway.
  • Retrieve the financial_transactions.avro file from the Downloads tab in the left panel of this page.

  • Create a Remote Engine Gen2 and its run profile from Talend Cloud Management Console.

    The Cloud Engine for Design and its corresponding run profile are embedded by default in Talend Cloud Management Console to help users get started quickly with the app, but installing the secure Remote Engine Gen2 is recommended for advanced data processing.
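If you choose to implement the access policy yourself, the shape of a bucket-scoped S3 policy can be sketched as below. This is a minimal example, not an official Talend or AWS template; the bucket name `my-pipeline-bucket` is a placeholder you must replace with your own, and you should cross-check the action list against the Amazon documentation.

```python
import json

# Hypothetical bucket name -- replace with your own.
BUCKET = "my-pipeline-bucket"

# Minimal sketch of a policy granting read/write access to a single bucket:
# one statement for bucket-level actions, one for the objects inside it.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ListBucket",
            "Effect": "Allow",
            "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
            "Resource": f"arn:aws:s3:::{BUCKET}",
        },
        {
            "Sid": "ReadWriteObjects",
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
            "Resource": f"arn:aws:s3:::{BUCKET}/*",
        },
    ],
}

print(json.dumps(policy, indent=2))
```

The resulting JSON can be pasted into the IAM console when creating a customer-managed policy, scoping access to one bucket instead of the unrestricted AmazonS3FullAccess policy.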


  1. Upload the financial_transactions.avro file to your Amazon S3 bucket as described in the Amazon S3 documentation.
  2. On the Home page of Talend Cloud Pipeline Designer, click Connections > Add connection.
  3. In the panel that opens, give a name to your connection, S3 connection for example.
  4. Select your Remote Engine Gen2 in the Engine list.
    Important: If the Remote Engine Gen2 does not have the AVAILABLE status, which indicates that it is up and running, you will not be able to select a connection type in the list or to save the new connection. The list of available connection types depends on the engine you have selected.
  5. Select S3 connection in the Connection type list.
  6. Enter your credentials and check your connection.
  7. Click Add dataset to point to the file that you have previously uploaded in your S3 bucket.
  8. In the Add a new dataset panel, fill in the connection information to your S3 bucket:
    1. Give a display name to your dataset, financial data on S3 for example.
    2. In the AWS bucket name field, select or type the name of your S3 bucket.
    3. In the Object name field, type in the path to the financial_transactions.avro file you have previously uploaded to your S3 bucket.
    4. In the Format list, click Auto detect to automatically detect the format or select Avro in the list.
  9. Click View sample to check that your data is valid and can be previewed.
  10. Click Validate to save your dataset.
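Step 4 of the dataset panel relies on the file being a valid Avro object container, which is what lets the format be auto-detected. As an optional local sanity check before the upload in step 1, you can verify the file's magic number: Avro container files begin with the four bytes `Obj` followed by `0x01`. The helper below is an illustrative sketch, not part of Talend's tooling.

```python
def looks_like_avro(path: str) -> bool:
    """Return True if the file starts with the Avro object-container magic."""
    AVRO_MAGIC = b"Obj\x01"  # first four bytes of every Avro container file
    with open(path, "rb") as f:
        return f.read(4) == AVRO_MAGIC

# Example usage (the path is illustrative):
# looks_like_avro("financial_transactions.avro")
```

If the check fails, the download was likely corrupted or the file is not actually in Avro format, in which case selecting Avro manually in the Format list would also fail.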


On the Datasets page, the new dataset is added to the list and can be used to reproduce the use case you have created previously.
Before executing this pipeline, go to the configuration tab of the destination dataset and select whether you want to overwrite the existing data on S3 or merge the new data with it.

Once your pipeline is executed, the updated data will be visible in the file located on Amazon S3.