Storage

AWS Cloud provides the following Storage services:

Amazon S3

Amazon Simple Storage Service (S3) is an object storage service. AWS also uses it as the storage infrastructure behind several of its other services. Amazon S3 offers a range of storage classes designed for different use cases, including:

  • S3 Standard for general-purpose storage of frequently accessed data
  • S3 Standard - Infrequent Access for long-lived data that is less frequently accessed
  • Amazon Glacier for long-term archive

Configurable lifecycle policies on the data control which storage class it is kept in.
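As a rough illustration of such a policy, the sketch below builds a lifecycle configuration that moves objects under a prefix to S3 Standard - Infrequent Access and later to Amazon Glacier. The function name, the prefix, the rule ID, and the day counts are assumptions chosen for the example, not values from Talend or AWS recommendations. The dictionary follows the shape that the S3 lifecycle API accepts (for instance through boto3's `put_bucket_lifecycle_configuration`), but the code only builds and prints it and does not call AWS.

```python
import json


def build_lifecycle_configuration(prefix, days_to_ia=30, days_to_glacier=365,
                                  rule_id="archive-rule"):
    """Return an S3 lifecycle configuration as a plain dict.

    All parameter names and default values are hypothetical and only
    serve the example.
    """
    if days_to_glacier <= days_to_ia:
        raise ValueError("the Glacier transition must come after the IA transition")
    return {
        "Rules": [
            {
                "ID": rule_id,
                # Apply the rule only to objects whose key starts with the prefix.
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": days_to_ia, "StorageClass": "STANDARD_IA"},
                    {"Days": days_to_glacier, "StorageClass": "GLACIER"},
                ],
            }
        ]
    }


# Hypothetical prefix for data written by a Talend Job.
config = build_lifecycle_configuration("talend/archive/")
print(json.dumps(config, indent=2))
```

With boto3 installed and credentials configured, a dictionary like this could be passed as the `LifecycleConfiguration` argument when setting the policy on a bucket.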

Talend Usage:

  • Talend provides components for connectivity in Jobs and Services
Amazon Glacier

As mentioned above, Amazon Glacier is built on top of S3 and leverages the same infrastructure. It is a secure, durable, and extremely low-cost storage service for long-term backup and data archiving. Because of the nature of the service, a retrieval request can take several hours to be processed. Glacier is effectively a replacement for tape, so customers should not use it for frequently accessed data. The typical use case is retrieving data once every few years, for example during disaster recovery.

Talend Usage:

  • Talend provides S3 components to load data into S3. Because S3 policies can control which files and folders are stored in Amazon Glacier, the same components can be used to write data that ends up in Glacier.
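Because archived data is not immediately readable, getting an object back from Glacier starts with a restore request rather than a plain download. The sketch below builds such a request in the shape accepted by the S3 restore API (for instance boto3's `restore_object`, via its `RestoreRequest` argument). The function name and the example values are assumptions; the code does not contact AWS.

```python
# Retrieval tiers offered by the S3 restore API; they trade cost for speed.
VALID_TIERS = ("Expedited", "Standard", "Bulk")


def build_restore_request(days_available, tier="Standard"):
    """Return a restore request for an object archived in Glacier.

    days_available: how long the restored copy should stay readable.
    tier: retrieval tier, one of VALID_TIERS.
    """
    if days_available < 1:
        raise ValueError("days_available must be at least 1")
    if tier not in VALID_TIERS:
        raise ValueError(f"tier must be one of {VALID_TIERS}")
    return {
        "Days": days_available,
        "GlacierJobParameters": {"Tier": tier},
    }


# Hypothetical example: keep the restored copy readable for one week.
request = build_restore_request(7)
print(request)
```

The request only schedules the retrieval. The object becomes readable once AWS has processed it, which is why Glacier is a poor fit for data that is needed on short notice.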
Amazon EBS

Amazon Elastic Block Store (EBS) provides persistent block-level storage volumes for use with Amazon EC2 instances in the cloud. Amazon EBS provides raw block I/O access and is suited to being attached to a single server instance. If storage must be attached to multiple instances, use Amazon Elastic File System (EFS) instead, which exposes the NFSv4 protocol.

Talend Usage:

  • Use as a disk drive once the EBS volume is attached to the EC2 instance.
Amazon EFS

Amazon Elastic File System (EFS) is a simple, scalable file storage system that exposes the NFSv4 protocol and can be used by multiple EC2 instances at the same time. With Amazon EFS, storage capacity is elastic: it grows and shrinks automatically as files are added to or removed from the file system. We use Amazon EFS for the Job Archives folder when multiple Talend Administration Center (TAC) instances are clustered at the scheduler level.

Talend Usage:

  • Use as a disk drive once the EFS file system is mounted on multiple EC2 instances. It is especially useful when clustering multiple Talend Administration Centers that need to share the content of the Job Archives folder.
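Since EFS is exposed over NFSv4, each EC2 instance attaches it with an ordinary NFS mount. The sketch below assembles the mount command using the NFS options that AWS documents for EFS. The file system ID, region, and mount point are hypothetical placeholders, and the function only builds the command without running it.

```python
import shlex

# NFS client options that AWS documents for mounting EFS.
NFS_OPTIONS = "nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,noresvport"


def efs_mount_command(file_system_id, region, mount_point):
    """Return the argument list for mounting an EFS file system over NFSv4."""
    dns_name = f"{file_system_id}.efs.{region}.amazonaws.com"
    return ["sudo", "mount", "-t", "nfs4", "-o", NFS_OPTIONS,
            f"{dns_name}:/", mount_point]


# Hypothetical values: replace with the real file system ID, region, and
# the folder each Talend Administration Center uses for Job Archives.
command = efs_mount_command("fs-12345678", "us-east-1", "/mnt/job-archives")
print(" ".join(shlex.quote(part) for part in command))
```

Running the same command on every clustered instance, with the same mount point, gives each Talend Administration Center an identical view of the shared folder.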
