tGSList - 6.3

Talend Open Studio for Big Data Components Reference Guide

Function

tGSList iterates over the objects in Google Cloud Storage that match the specified criteria.

Purpose

tGSList allows you to retrieve a list of objects from Google Cloud Storage one by one.

tGSList properties

Component Family

Big Data / Google Cloud Storage

Basic settings

Use an existing connection

Select this check box and, in the Component List, click the relevant connection component to reuse the connection details you have already defined.

Access Key and Secret Key

Type in the authentication information obtained from Google for making requests to Google Cloud Storage.

You can find these keys on the Interoperable Access tab, under the Google Cloud Storage tab of the project in the Google APIs Console.

To enter the secret key, click the [...] button next to the Secret Key field, enter the secret key between double quotes in the pop-up dialog box, and then click OK to save the settings.

For more information about the access key and secret key, go to https://developers.google.com/storage/docs/reference/v1/getting-startedv1?hl=en/ and see the description of developer keys.

Warning

The Access Key and Secret Key fields are available only if the Use an existing connection check box is cleared.

Key prefix

Specify the key prefix so that only the objects whose keys begin with the specified string are listed.

Delimiter

Specify the delimiter so that only the objects with key names up to the delimiter are listed.

Specify project ID

Select this check box and, in the Project ID field, enter the ID of the project from which you want to retrieve a list of objects.

List objects in bucket list

Select this check box and complete the Bucket table to retrieve objects in the specified buckets.

  • Bucket name: type in the name of the bucket from which you want to retrieve objects.

  • Key prefix: type in the prefix to list only objects whose keys begin with the specified string in the specified bucket.

  • Delimiter: type in the delimiter to list only those objects with key names up to the delimiter.

Warning

If you select the List objects in bucket list check box, the Key prefix and Delimiter fields as well as the Specify project ID check box will not be available.
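As a rough illustration of how the Key prefix and Delimiter settings narrow down a listing, the following Java sketch applies both settings to an in-memory list of keys. It does not call Google Cloud Storage and does not reproduce the component's own code. The class name KeyFilterSketch, the method filterKeys, the sample keys, and the reading of "up to the delimiter" as "the part of the key after the prefix does not contain the delimiter" are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: models prefix/delimiter filtering on an in-memory list
// of keys. It does not call Google Cloud Storage or any Talend API.
class KeyFilterSketch {

    // Keeps the keys that start with `prefix` and whose remainder (the part
    // after the prefix) does not contain `delimiter`. An empty delimiter
    // disables the delimiter check. This interpretation is an assumption.
    static List<String> filterKeys(List<String> keys, String prefix, String delimiter) {
        List<String> matched = new ArrayList<>();
        for (String key : keys) {
            if (!key.startsWith(prefix)) {
                continue; // the key does not begin with the prefix
            }
            String rest = key.substring(prefix.length());
            if (!delimiter.isEmpty() && rest.contains(delimiter)) {
                continue; // the key continues past a further delimiter
            }
            matched.add(key);
        }
        return matched;
    }

    public static void main(String[] args) {
        // Hypothetical object keys in a single bucket.
        List<String> keys = Arrays.asList(
                "logs/2016/app.log", "logs/readme.txt", "data/input.csv");
        System.out.println(filterKeys(keys, "logs/", "/")); // prints [logs/readme.txt]
    }
}
```

With the prefix "logs/" and the delimiter "/", only logs/readme.txt is kept in this sketch, because logs/2016/app.log continues past a further delimiter and data/input.csv does not begin with the prefix.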

Advanced settings

tStatCatcher Statistics

Select this check box to gather the Job processing metadata at the Job level as well as at each component level.

Global Variables

CURRENT_BUCKET: the current bucket name. This is a Flow variable and it returns a string.

CURRENT_KEY: the current key. This is a Flow variable and it returns a string.

NB_LINE: the number of rows read by an input component or transferred to an output component. This is an After variable and it returns an integer.

A Flow variable is available during the execution of a component, while an After variable is available only after the component has finished executing.

To fill in a field or expression with a variable, press Ctrl + Space to access the variable list and choose the variable to use from it.

For further information about variables, see the Talend Studio User Guide.
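As an illustration, the following Java sketch shows how these variables might be read from custom code, for example in a tJava component placed after an Iterate link. It assumes that the component is labelled tGSList_1 and that the Studio stores each variable in globalMap under a key of the form component label, underscore, variable name. A plain HashMap stands in for globalMap so that the sketch runs outside the Studio, and the class name, bucket name, and keys are made up for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: in a real Job the Studio provides globalMap and the
// component fills it. Here a HashMap is filled by hand to simulate that.
class GlobalMapSketch {

    // Reads the two Flow variables of a component assumed to be labelled
    // tGSList_1 (hypothetical label) and joins them into one string.
    static String describeCurrent(Map<String, Object> globalMap) {
        String bucket = (String) globalMap.get("tGSList_1_CURRENT_BUCKET");
        String key = (String) globalMap.get("tGSList_1_CURRENT_KEY");
        return bucket + "/" + key;
    }

    public static void main(String[] args) {
        Map<String, Object> globalMap = new HashMap<>();
        // Made-up listing result: {bucket name, object key}.
        String[][] listed = {
            {"my-bucket", "logs/readme.txt"},
            {"my-bucket", "data/input.csv"}
        };
        int count = 0;
        for (String[] object : listed) {
            // Simulate the Flow variables being set on each iteration.
            globalMap.put("tGSList_1_CURRENT_BUCKET", object[0]);
            globalMap.put("tGSList_1_CURRENT_KEY", object[1]);
            System.out.println(describeCurrent(globalMap));
            count++;
        }
        // Simulate the After variable, set once the iteration is complete.
        globalMap.put("tGSList_1_NB_LINE", count);
        Integer total = (Integer) globalMap.get("tGSList_1_NB_LINE");
        System.out.println(total); // prints 2
    }
}
```

In this sketch the Flow variables change on every iteration, whereas NB_LINE is read only once the loop has ended, which mirrors the difference between Flow and After variables.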

Usage

The tGSList component can be used as a standalone component or as a start component of a process.

Log4j

If you are using a subscription-based version of the Studio, the activity of this component can be logged using the log4j feature. For more information on this feature, see the Talend Studio User Guide.

For more information on the log4j logging levels, see the Apache documentation at http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/Level.html.

Limitation

n/a

Related scenario

For a scenario in which tGSList is used, see Scenario: Managing files with Google Cloud Storage.