tS3Copy - 6.3

Talend Components Reference Guide

Function

tS3Copy copies an Amazon S3 object from a source bucket to a destination bucket.

Purpose

This component is used to copy an Amazon S3 object.

tS3Copy properties

Component family

Cloud/Amazon/S3

Basic settings

Use an existing connection

Select this check box and, in the Component List, select the connection component whose connection details you want to reuse.

 

Access Key

Specify the Access Key ID that uniquely identifies an AWS Account. To learn how to obtain your Access Key and Secret Key, see Getting Your AWS Access Keys.

 

Secret Key

Specify the Secret Access Key, which, in combination with the Access Key, constitutes your security credentials.

To enter the secret key, click the [...] button next to the Secret Key field, enter the secret key between double quotes in the pop-up dialog box, and click OK to save the settings.

 

Inherit credentials from AWS role

Select this check box to obtain AWS security credentials from Amazon EC2 instance metadata. To use this option, the Amazon EC2 instance must be started and your Job must be running on Amazon EC2. For more information, see Using an IAM Role to Grant Permissions to Applications Running on Amazon EC2 Instances.

 

Assume role

Select this check box and specify the values for the following parameters used to create a new assumed role session.

  • Role ARN: the Amazon Resource Name (ARN) of the role to assume.

  • Role session name: an identifier for the assumed role session.

  • Session duration (minutes): the length of time, in minutes, for which the assumed role session remains active.

For more information about assuming roles, see AssumeRole.
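The AWS STS AssumeRole API takes the session length in seconds (its DurationSeconds parameter) and documents a minimum of 900 seconds, whereas this field is expressed in minutes. The following is a minimal sketch of that conversion, not the component's actual code: the class and method names are hypothetical, and the upper limit is deliberately not checked because it depends on the maximum session duration configured for the role.

```java
import java.time.Duration;

// Hypothetical helper, for illustration only; not part of Talend or the AWS SDK.
final class AssumeRoleSessionDuration {
    // Minimum session length documented for the STS AssumeRole API (900 seconds).
    static final Duration MIN_SESSION = Duration.ofMinutes(15);

    // Converts a "Session duration (minutes)" value to the seconds used by DurationSeconds.
    static int toDurationSeconds(long minutes) {
        Duration session = Duration.ofMinutes(minutes);
        if (session.compareTo(MIN_SESSION) < 0) {
            throw new IllegalArgumentException(
                    "Session duration must be at least 15 minutes, got " + minutes);
        }
        return Math.toIntExact(session.getSeconds());
    }

    public static void main(String[] args) {
        System.out.println(toDurationSeconds(15)); // prints 900
        System.out.println(toDurationSeconds(60)); // prints 3600
    }
}
```

A value below 15 minutes is rejected early in this sketch, so the problem surfaces before any call to AWS would be made.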

 

Region

Specify the AWS region by selecting a region name from the list or by entering a region name between double quotation marks (for example, "us-east-1") in the list field. For more information about AWS regions, see Regions and Endpoints.

Source Configuration

Bucket

Specify the name of the source bucket that contains the object to be copied.

 

Key

Specify the key of the object to be copied.

Destination Configuration

Bucket

Specify the name of the destination bucket to which the object will be copied.

 

Key

Specify the new key for the object after being copied to the destination bucket.

 

Server-Side Encryption

Select this check box to enable server-side encryption with Amazon S3-Managed Encryption Keys (SSE-S3), which protects the data you send to Amazon S3.

For more information about server-side encryption with SSE-S3, see Protecting Data Using Server-Side Encryption with Amazon S3-Managed Encryption Keys (SSE-S3).

 

Die on error

Select this check box to stop the execution of the Job when an error occurs.

Clear the check box to skip any rows on error and complete the process for error-free rows.

Advanced settings

Config client

Select this check box and specify the client parameters: click the [+] button to add as many rows as needed, one row per client parameter, and then set the following fields for each parameter:

  • Client Parameter: click the cell and from the drop-down list displayed select the client parameter.

  • Value: enter the value for the selected parameter.

This check box is available only when the Use an existing connection check box is cleared.

STS Endpoint

Select this check box and, in the field displayed, specify the AWS Security Token Service endpoint from which session credentials are retrieved.

This check box is available only when the Assume role check box is selected.

 

tStatCatcher Statistics

Select this check box to gather the Job processing metadata at the Job level as well as at each component level.

Global Variables

ERROR_MESSAGE: the error message generated by the component when an error occurs. This is an After variable and it returns a string. This variable is set only if the Die on error check box is cleared, provided the component has this check box.

A Flow variable functions during the execution of a component while an After variable functions after the execution of the component.

To fill in a field or expression with a variable, press Ctrl + Space to open the variable list and select the variable to use.

For further information about variables, see Talend Studio User Guide.
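In a Job, an After variable such as ERROR_MESSAGE is read from globalMap under a key made of the component's unique name followed by the variable name, for example ((String)globalMap.get("tS3Copy_1_ERROR_MESSAGE")). The sketch below illustrates that lookup outside the Studio. It assumes the component is named tS3Copy_1, uses a plain HashMap as a stand-in for the Job's globalMap, and the helper class, its methods, and the sample message are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, for illustration only; in a real Job, globalMap is provided by the Studio.
final class ErrorMessageLookup {
    // Builds the globalMap key, for example "tS3Copy_1_ERROR_MESSAGE".
    static String errorMessageKey(String componentName) {
        return componentName + "_ERROR_MESSAGE";
    }

    // Returns the stored error message, or the fallback when no error was recorded.
    static String errorMessageOrDefault(Map<String, Object> globalMap,
                                        String componentName,
                                        String fallback) {
        Object value = globalMap.get(errorMessageKey(componentName));
        return value == null ? fallback : (String) value;
    }

    public static void main(String[] args) {
        Map<String, Object> globalMap = new HashMap<>(); // stand-in for the Job's globalMap
        globalMap.put("tS3Copy_1_ERROR_MESSAGE", "sample error text");
        System.out.println(errorMessageOrDefault(globalMap, "tS3Copy_1", "no error"));
        System.out.println(errorMessageOrDefault(globalMap, "tS3Copy_2", "no error"));
    }
}
```

Because the variable is only populated when an error occurs and Die on error is cleared, a null check like the one above avoids a NullPointerException in downstream expressions.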

Usage

This component can be used as a standalone component.

Log4j

If you are using a subscription-based version of the Studio, the activity of this component can be logged using the log4j feature. For more information on this feature, see Talend Studio User Guide.

For more information on the log4j logging levels, see the Apache documentation at http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/Level.html.

Scenario: Copying an S3 object from one bucket to another

This scenario describes a Job that uploads a new object to an existing empty S3 bucket, bucket-src, copies the object from bucket-src to another existing empty S3 bucket, bucket-dst, and finally lists the objects in bucket-dst to check whether the object was copied successfully.

Setting up the Job

  1. Create a new Job and add a tS3Connection component, a tS3Put component, a tS3Copy component, a tS3List component, a tIterateToFlow component, and a tLogRow component by typing their names on the design workspace or dropping them from the Palette.

  2. Link the tS3List component to the tIterateToFlow component using a Row > Iterate connection.

  3. Link the tIterateToFlow component to the tLogRow component using a Row > Main connection.

  4. Link the tS3Connection component to the tS3Put component using a Trigger > On Subjob Ok connection.

  5. Do the same to link the tS3Put component to the tS3Copy component and the tS3Copy component to the tS3List component.
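The On Subjob Ok triggers set up above run the subjobs one after another, and each subjob starts only if the previous one finished without error. The sketch below models that ordering in plain Java. It is a simplified illustration rather than the code the Studio generates: the class name is hypothetical, the component names assume the default _1 suffix, and each subjob is reduced to a Runnable.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of an On Subjob Ok chain, for illustration only.
final class SubjobChain {
    // Runs the subjobs in insertion order and stops at the first one that throws.
    // Returns the names of the subjobs that completed successfully.
    static List<String> run(Map<String, Runnable> subjobs) {
        List<String> completed = new ArrayList<>();
        for (Map.Entry<String, Runnable> subjob : subjobs.entrySet()) {
            try {
                subjob.getValue().run();
            } catch (RuntimeException e) {
                break; // the next On Subjob Ok trigger does not fire after a failure
            }
            completed.add(subjob.getKey());
        }
        return completed;
    }

    public static void main(String[] args) {
        Map<String, Runnable> job = new LinkedHashMap<>();
        job.put("tS3Connection_1", () -> {});
        job.put("tS3Put_1", () -> {});
        job.put("tS3Copy_1", () -> {});
        job.put("tS3List_1", () -> {});
        // prints [tS3Connection_1, tS3Put_1, tS3Copy_1, tS3List_1]
        System.out.println(run(job));
    }
}
```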

Configuring the components

Creating a connection to Amazon S3

  1. Double-click the tS3Connection component to open its Basic settings view on the Component tab.

  2. In the Access Key and Secret Key fields, enter the authentication credentials required to access Amazon S3.

  3. From the Region drop-down list, select the AWS region in which the object will be uploaded and copied. In this example, we keep the default setting.

Uploading an object to an Amazon S3 bucket

  1. Double-click the tS3Put component to open its Basic settings view on the Component tab.

  2. Select the Use an existing connection check box to reuse the Amazon S3 connection information you have defined in the tS3Connection component.

  3. In the Bucket field, enter the name of the S3 bucket where the object will be uploaded. In this example, it is bucket-src, which already exists in Amazon S3.

  4. In the Key field, enter the key for the object to be uploaded. In this example, it is tS3Copy_icon32_src.png.

  5. In the File field, browse to or enter the path to the file to be uploaded. In this example, it is D:/tS3Copy_icon32.png.

Copying the uploaded object to another Amazon S3 bucket

  1. Double-click the tS3Copy component to open its Basic settings view on the Component tab.

  2. Select the Use an existing connection check box to reuse the Amazon S3 connection information you have defined in the tS3Connection component.

  3. In the Bucket field in the Source Configuration area, enter the name of the bucket which contains the object to be copied. In this example, it is bucket-src.

  4. In the Key field in the Source Configuration area, enter the key of the object to be copied. In this example, it is tS3Copy_icon32_src.png.

  5. In the Bucket field in the Destination Configuration area, enter the name of the bucket to which the object will be copied. In this example, it is bucket-dst, an empty bucket that already exists in Amazon S3.

  6. In the Key field in the Destination Configuration area, enter the new key for the object after being copied to the destination bucket. In this example, it is tS3Copy_icon32_dst.png.
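To make the effect of these four fields concrete, the sketch below models the buckets of this scenario in memory and performs the same upload and copy. It does not call Amazon S3 or the AWS SDK: the InMemoryS3 class and all of its methods are hypothetical stand-ins, and the three-byte payload is a placeholder for the PNG file.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for S3, for illustration only.
final class InMemoryS3 {
    // bucket name -> (object key -> object content)
    private final Map<String, Map<String, byte[]>> buckets = new TreeMap<>();

    void createBucket(String name) {
        buckets.putIfAbsent(name, new TreeMap<>());
    }

    void put(String bucket, String key, byte[] content) {
        objectsOf(bucket).put(key, content.clone());
    }

    // Copies an object to a destination bucket under a new key; the source object is kept.
    void copy(String srcBucket, String srcKey, String dstBucket, String dstKey) {
        byte[] content = objectsOf(srcBucket).get(srcKey);
        if (content == null) {
            throw new IllegalArgumentException("No such key: " + srcKey);
        }
        objectsOf(dstBucket).put(dstKey, content.clone());
    }

    // Lists the keys in a bucket that start with the given prefix ("" matches all keys).
    List<String> list(String bucket, String keyPrefix) {
        List<String> keys = new ArrayList<>();
        for (String key : objectsOf(bucket).keySet()) {
            if (key.startsWith(keyPrefix)) {
                keys.add(key);
            }
        }
        return keys;
    }

    private Map<String, byte[]> objectsOf(String bucket) {
        Map<String, byte[]> objects = buckets.get(bucket);
        if (objects == null) {
            throw new IllegalArgumentException("No such bucket: " + bucket);
        }
        return objects;
    }

    public static void main(String[] args) {
        InMemoryS3 s3 = new InMemoryS3();
        s3.createBucket("bucket-src");
        s3.createBucket("bucket-dst");
        s3.put("bucket-src", "tS3Copy_icon32_src.png", new byte[] {1, 2, 3});
        s3.copy("bucket-src", "tS3Copy_icon32_src.png", "bucket-dst", "tS3Copy_icon32_dst.png");
        System.out.println(s3.list("bucket-dst", "")); // prints [tS3Copy_icon32_dst.png]
        System.out.println(s3.list("bucket-src", "")); // prints [tS3Copy_icon32_src.png]
    }
}
```

In this model, as in the scenario, the copy leaves the source object in bucket-src untouched and the destination key replaces the original key rather than being appended to it.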

Listing the object in the destination bucket

  1. Double-click the tS3List component to open its Basic settings view on the Component tab.

  2. Select the Use an existing connection check box to reuse the Amazon S3 connection information you have defined in the tS3Connection component.

  3. Clear the List all buckets objects check box, and then click the [+] button to add one row to the Bucket table displayed and set the value of each column. In this example, enter bucket-dst in the Bucket name column and an empty value in the Key prefix column, so that only the objects in the bucket-dst bucket are listed.

  4. Double-click the tIterateToFlow component to open its Basic settings view on the Component tab.

  5. Click the [...] button next to Edit schema and in the pop-up schema dialog box define the schema by adding one column ObjectList of String type.

  6. Click OK to save the changes and in the pop-up dialog box click Yes to accept the propagation.

  7. Double-click the tLogRow component to open its Basic settings view on the Component tab.