Scenario: Uploading files to Dropbox - 6.3

Talend Open Studio for Big Data Components Reference Guide

In this scenario, a six-component Job consisting of three Subjobs is created to write data to Dropbox using different upload modes.

Before replicating this scenario, you need to create a Dropbox App under the Dropbox account to be used. In this scenario, the Dropbox App is named talenddrop, so the root folder to which files are uploaded is also named talenddrop. In addition, the access token to this folder has been generated from the App console provided by Dropbox.

For further information about a Dropbox App, see https://www.dropbox.com/developers/apps/.

Linking the components

  1. In the Integration perspective of the Studio, create an empty Job from the Job Designs node in the Repository tree view.

    For further information about how to create a Job, see Talend Studio User Guide.

  2. In the workspace, enter the name of the component to be used and select this component from the list that appears. In this scenario, the components are tDropboxConnection, tFixedFlowInput, tFileOutputDelimited, tFileInputRaw and two tDropboxPut components.

    The tFixedFlowInput component generates some data to be uploaded to Dropbox in this scenario. In a real-world case, you can use other components such as tMysqlInput or tMap in place of tFixedFlowInput to design a more sophisticated process that prepares your data.

  3. Connect tFixedFlowInput to tFileOutputDelimited using the Row > Main link.

  4. Do the same to connect tFileOutputDelimited to one of the two tDropboxPut components and connect tFileInputRaw to the other tDropboxPut component.

  5. Connect tDropboxConnection to tFixedFlowInput using the Trigger > On Subjob Ok link. Then connect tFixedFlowInput to tFileInputRaw using the same type of link.

Connecting to Dropbox

  1. Double-click tDropboxConnection to open its Component view.

  2. In the Access token field, paste the token that you have generated via the App console of Dropbox for accessing the Dropbox App folder to be used.

Generating the output stream

Defining the input data

  1. Double-click tFixedFlowInput to open its Component view.

    In this scenario, only three rows of sample data are created to indicate three countries and their calling codes.

    33;France
    86;China
    81;Japan

  2. Click the [...] button next to Edit schema to open the schema editor.

  3. Click the [+] button twice to add two rows and in the Column column, rename them to code and country, respectively.

  4. Click OK to validate these changes and accept the propagation prompted by the pop-up dialog box.

  5. In the Mode area, select the Use Inline Table radio button. The code and country columns have been automatically created in this table.

  6. Enter the sample data mentioned above in this table.
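
    As an illustration only, the following Java sketch shows how the three sample rows map onto the two schema columns and how they look once written as semicolon-delimited text. The class and method names (CallingCodeSample, sampleRows, toDelimited) are hypothetical and not part of the Studio; the semicolon separator is assumed from the sample data shown above.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the inline table of tFixedFlowInput.
class CallingCodeSample {

    // Each row holds the two schema columns: code, country.
    static List<String[]> sampleRows() {
        return Arrays.asList(
            new String[] {"33", "France"},
            new String[] {"86", "China"},
            new String[] {"81", "Japan"});
    }

    // Joins the columns of each row with the field separator,
    // one row per line (assumed layout of the delimited output).
    static String toDelimited(List<String[]> rows, String fieldSeparator) {
        StringBuilder sb = new StringBuilder();
        for (String[] row : rows) {
            sb.append(String.join(fieldSeparator, row)).append("\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(toDelimited(sampleRows(), ";"));
    }
}
```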

Defining the output stream

  1. Double-click tFileOutputDelimited to open its Component view.

  2. Select the Use output stream check box to write the output data to a given output stream.

  3. In the Output stream field, enter the code that defines the output stream to write data to. In this scenario, it is the output stream of the tDropboxPut_1 component linked with the current component. The code therefore reads as follows:

    ((java.io.OutputStream)globalMap.get("tDropboxPut_1_OUTPUTSTREAM"))

    Note that in this example code, the tDropboxPut component has the number 1 as its suffix, which is the component ID assigned automatically within this Job. If the tDropboxPut component you are using has a different ID, you need to adapt the code to that ID number.

  4. Click Edit schema to verify that the schema of this component is identical to that of the preceding tFixedFlowInput component. If it is not, click the Sync columns button to make both schemas identical.
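
    To make the lookup in step 3 more concrete, the following Java sketch imitates it outside the Studio. It is a simplified stand-in under stated assumptions: globalMap is modeled as a plain HashMap, a ByteArrayOutputStream plays the role of the stream exposed by tDropboxPut, and the class and method names (OutputStreamLookupSketch, streamKey, writeThroughMap) are hypothetical. Only the key pattern and the cast are taken from the scenario.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of looking up an exposed output stream by key.
class OutputStreamLookupSketch {

    // Builds the key for a given component ID number,
    // for example 1 gives "tDropboxPut_1_OUTPUTSTREAM".
    static String streamKey(int componentNumber) {
        return "tDropboxPut_" + componentNumber + "_OUTPUTSTREAM";
    }

    // Registers a stand-in stream under the key, retrieves it with the same
    // cast used in the Output stream field, writes the data, and returns
    // what the stand-in stream received.
    static String writeThroughMap(int componentNumber, String data) {
        Map<String, Object> globalMap = new HashMap<>(); // stand-in for the Job's globalMap
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        globalMap.put(streamKey(componentNumber), sink);

        OutputStream out = (java.io.OutputStream) globalMap.get(streamKey(componentNumber));
        try {
            out.write(data.getBytes(StandardCharsets.UTF_8));
            out.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(sink.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.print(writeThroughMap(1, "33;France\n"));
    }
}
```

    If the key does not match the ID of the component that exposes the stream, the lookup returns null and the write fails, which is why the suffix in the expression must be adapted to the actual component ID.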

Exposing the tDropboxPut output stream

  1. Double-click the tDropboxPut component linked with tFileOutputDelimited to open its Component view.

  2. Select the Use existing connection check box to reuse the connection created by tDropboxConnection.

  3. In the Path field, enter the path to the file you need to write data to, with a slash (/) at the beginning of the path. For example, enter /calling_code.csv.

  4. In the Upload mode area, select the Rename if Existing radio button.

  5. Select the Expose As OutputStream radio button to expose the output stream of this component so that another component, tFileOutputDelimited in this scenario, can write data to the stream.

Defining the media data to be uploaded

  1. Double-click tFileInputRaw to open its Component view.

    This component is used to read a picture named esb_architecture.png into the data flow. In practice, this file can be in many other formats, such as pdf, xls, ppt or mp3.

  2. In the Filename field, enter the path or browse to the file you need to upload.

  3. In the Mode area, select the Read the file as a bytes array radio button.
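
    For readers unfamiliar with the idea of reading a file as a bytes array, the following Java sketch shows the general technique with the standard java.nio.file API. It does not reproduce the component's internal code; the class and method names (ReadAsBytesSketch, readAsBytes, roundTripLength) are hypothetical, and a temporary file is used in place of esb_architecture.png so that the example is self-contained.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration of loading a whole file into a byte array.
class ReadAsBytesSketch {

    // Reads the complete file content into memory, whatever its format.
    static byte[] readAsBytes(Path file) {
        try {
            return Files.readAllBytes(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes the given content to a temporary file, reads it back as bytes,
    // and returns the number of bytes read.
    static int roundTripLength(byte[] content) {
        try {
            Path tmp = Files.createTempFile("sample", ".bin");
            try {
                Files.write(tmp, content);
                return readAsBytes(tmp).length;
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The first four bytes of a PNG signature, used here as sample binary data.
        byte[] sample = {(byte) 0x89, 0x50, 0x4E, 0x47};
        System.out.println(roundTripLength(sample) + " bytes read");
    }
}
```

    Because the content is handled as raw bytes, the same approach applies to any binary format, at the cost of holding the entire file in memory.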

Uploading the incoming contents

  1. Double-click the tDropboxPut component linked with tFileInputRaw to open its Component view.