Configuring the connection to S3 to be used to store the business data - 7.3

Amazon EMR distribution

Procedure

  1. Double-click tS3Configuration to open its Component view.

    Spark uses this component to connect to the S3 system in which the business data is stored. In this scenario, the sample data about the street incidents is written to S3 and encrypted there.

  2. Select the Inherit credentials from AWS role check box and the Use SSE-KMS encryption check box.
  3. Enter the access credentials of the AWS account to be used.
    • If the security policy of your organization allows it, enter the credentials in the Access Key and the Secret Key fields.

      If you do not know which credentials to use, contact the administrator of your AWS system or see Getting Your AWS Access Keys in the AWS documentation.

    • If the security policy of your organization does not allow you to expose the credentials in a client application, select Inherit credentials from AWS role to obtain the role-based temporary AWS security credentials from your EMR instance metadata. An IAM role must already be associated with this EMR instance.

      For further information about using an IAM role to grant permissions, see Using IAM roles in the AWS documentation.

  4. Select the Use SSE-KMS encryption check box to enable the Job to verify and use the SSE-KMS encryption service of your cluster.
  5. In the Bucket name field, enter the name of the bucket to be used to store the sample data. This bucket must already exist when you launch your Job. For example, enter my_bucket/my_folder.
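
The choices made in this procedure can be pictured as a small set of Hadoop connector properties. The following Python sketch is an illustration only and is not what tS3Configuration generates. It assumes the Hadoop S3A connector, whose property names (fs.s3a.access.key, fs.s3a.secret.key, fs.s3a.aws.credentials.provider, fs.s3a.server-side-encryption-algorithm, fs.s3a.server-side-encryption.key) come from the Hadoop documentation; your EMR cluster may use EMRFS or newer property names instead. The function s3a_settings and its parameters are hypothetical names introduced for this example.

```python
def s3a_settings(bucket_field, access_key=None, secret_key=None,
                 inherit_from_role=False, use_sse_kms=True, kms_key_id=None):
    """Return (properties, base_uri) mirroring the component fields.

    Hypothetical helper: it only builds a dictionary and a URI string.
    It does not contact AWS and does not check that the bucket exists.
    """
    if not inherit_from_role and not (access_key and secret_key):
        raise ValueError("Provide both keys or set inherit_from_role=True")

    # Split a value such as "my_bucket/my_folder" into bucket and folder.
    bucket, _, folder = bucket_field.strip("/").partition("/")
    if not bucket:
        raise ValueError("The bucket name must not be empty")

    settings = {}
    if inherit_from_role:
        # Temporary credentials are read from the instance metadata,
        # so no key is stored in the client application.
        settings["fs.s3a.aws.credentials.provider"] = (
            "com.amazonaws.auth.InstanceProfileCredentialsProvider")
    else:
        settings["fs.s3a.aws.credentials.provider"] = (
            "org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider")
        settings["fs.s3a.access.key"] = access_key
        settings["fs.s3a.secret.key"] = secret_key

    if use_sse_kms:
        settings["fs.s3a.server-side-encryption-algorithm"] = "SSE-KMS"
        if kms_key_id:
            # Optional: a specific KMS key (assumption, not a field
            # described in this procedure).
            settings["fs.s3a.server-side-encryption.key"] = kms_key_id

    base_uri = "s3a://" + bucket + ("/" + folder if folder else "")
    return settings, base_uri


# Role-based credentials with SSE-KMS, as in the second option of step 3.
settings, base_uri = s3a_settings("my_bucket/my_folder", inherit_from_role=True)
for name in sorted(settings):
    print(name, "=", settings[name])
print("base URI:", base_uri)
```

With inherit_from_role=True the dictionary contains no access key or secret key, which is the point of the role-based option: nothing sensitive has to be stored in the Job.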