Scenario: Inserting bulk data into your Salesforce.com - 6.1

Talend Open Studio for Big Data Components Reference Guide


This scenario describes a four-component Job that submits bulk data to Salesforce.com, executes your intended actions on the data, and displays the Job execution results for your reference.

Before starting this scenario, you need to prepare the input file that provides the data to be processed by the Job. In this use case, this file is sforcebulk.txt, which contains some customer information.
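For reference, a minimal sforcebulk.txt might look like the following. The values and the semicolon field separator are assumptions for illustration; adjust them to match your own file. The columns correspond to the schema used later in this scenario (Name, ParentId, Phone, Fax), with ParentId left empty for top-level accounts:

```
Griffith Paving;;+16105551000;+16105551001
Nelson Ltd;;+14155552000;+14155552001
```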

Then to create and execute this Job, operate as follows:

Setting up the Job

  1. Drop tFileInputDelimited, tSalesforceOutputBulkExec, and two tLogRow components from the Palette onto the workspace of your studio.

  2. Use Row > Main connection to connect tFileInputDelimited to tSalesforceOutputBulkExec.

  3. Use Row > Main and Row > Reject to connect tSalesforceOutputBulkExec respectively to the two tLogRow components.

Setting the input data

  1. Double-click tFileInputDelimited to display its Basic settings view and define the component properties.

  2. From the Property Type list, select Repository if you have already stored the connection to the delimited file under the Metadata node of the Repository tree view. The property fields that follow are automatically filled in. If you have not defined the file connection locally in the Repository, fill in the details manually after selecting Built-in from the Property Type list.

    For more information about how to create the delimited file metadata, see Talend Studio User Guide.

  3. Next to the File name/Stream field, click the [...] button to browse to the input file you prepared for the scenario, for example, sforcebulk.txt.

  4. From the Schema list, select Repository and then click the three-dot button to open a dialog box where you can select the repository schema you want to use for this component. If you have not defined your schema locally in the metadata, select Built-in from the Schema list and then click the three-dot button next to the Edit schema field to open the dialog box where you can set the schema manually. In this scenario, the schema is made of four columns: Name, ParentId, Phone and Fax.

  5. Set the other fields, such as Row Separator and Field Separator, according to the input file to be used by the Job.
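To make the schema and separator settings above concrete, the following sketch shows how the input file is parsed into rows of the four-column schema. This is plain Python, not Talend's implementation; the sample values and the ";" field separator are assumptions:

```python
import csv
import io

# Schema defined in this scenario: four columns.
SCHEMA = ["Name", "ParentId", "Phone", "Fax"]

# Sample content standing in for sforcebulk.txt (values are assumed).
sample = (
    "Griffith Paving;;+16105551000;+16105551001\n"
    "Nelson Ltd;;+14155552000;+14155552001\n"
)

def read_delimited(text, field_sep=";"):
    """Parse delimited rows into dicts keyed by the schema columns."""
    reader = csv.reader(io.StringIO(text), delimiter=field_sep)
    return [dict(zip(SCHEMA, row)) for row in reader]

rows = read_delimited(sample)
print(rows[0]["Name"])  # prints "Griffith Paving"
```

Changing the Field Separator setting in the component corresponds to changing the `field_sep` argument here.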

Setting up the connection to the Salesforce server

  1. Double-click tSalesforceOutputBulkExec to display its Basic settings view and define the component properties.

  2. In the Salesforce WebService URL field, use the default URL of the Salesforce Web service or enter the URL you want to access.

  3. In the Username and Password fields, enter your username and password for the Web service.

  4. In the Bulk file path field, browse to the directory where you store the bulk .csv data to be processed.

    Note

    The bulk file here to be processed must be in .csv format.

  5. From the Action list, select the action you want to carry out on the prepared bulk data. In this use case, insert.

  6. From the Module list, select the object you want to access, Account in this example.

  7. From the Schema list, select Repository and then click the three-dot button to open a dialog box where you can select the repository schema you want to use for this component. If you have not defined your schema locally in the metadata, select Built-in from the Schema list and then click the three-dot button next to the Edit schema field to open the dialog box where you can set the schema manually. In this example, edit it conforming to the schema defined previously.
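Behind these settings, the component drives a Salesforce bulk job built from the Action (insert), the Module (Account), and the CSV bulk file. The sketch below only constructs the kind of request payload such a job involves, without sending anything; the instance URL, API version, and endpoint path are assumptions for illustration, since the component handles the actual calls for you:

```python
# Hypothetical connection details -- replace with your own.
SALESFORCE_INSTANCE = "https://yourInstance.salesforce.com"
API_VERSION = "34.0"

def job_request_xml(operation, module):
    """Body describing a bulk job: the action, the target object, and
    the content type of the bulk file (CSV in this scenario)."""
    return (
        '<?xml version="1.0" encoding="UTF-8"?>'
        '<jobInfo xmlns="http://www.force.com/2009/06/asyncapi/dataload">'
        f"<operation>{operation}</operation>"
        f"<object>{module}</object>"
        "<contentType>CSV</contentType>"
        "</jobInfo>"
    )

def job_url():
    """Assumed endpoint shape for creating a bulk job."""
    return f"{SALESFORCE_INSTANCE}/services/async/{API_VERSION}/job"

xml = job_request_xml("insert", "Account")
print("<operation>insert</operation>" in xml)  # prints True
```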

Job execution

  1. Double-click tLogRow_1 to display its Basic settings view and define the component properties.

  2. Click Sync columns to retrieve the schema from the preceding component.

  3. Select Table mode to display the execution result.

  4. Do the same with tLogRow_2.

  5. Press CTRL+S to save your Job and press F6 to execute it.

    On the console of the Run view, you can check the execution result.

    In the tLogRow_1 table, you can read the data inserted into your Salesforce.com.

    In the tLogRow_2 table, you can read the data rejected due to incompatibility with the Account object you have accessed.

    If you want to transform the input data before submitting it, use tSalesforceOutputBulk and tSalesforceBulkExec together. For further information on the use of these two components, see Scenario: Inserting transformed bulk data into your Salesforce.com.
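The Main/Reject split shown in the two tLogRow tables can be sketched as a simple partition: rows compatible with the target object flow to tLogRow_1, incompatible rows to tLogRow_2. This is an illustration, not Talend's implementation; treating an empty Name as the rejection cause is an assumed rule (Account's Name field is required in Salesforce):

```python
def split_main_reject(rows, required=("Name",)):
    """Partition rows: a row is rejected when a required field is empty."""
    main, reject = [], []
    for row in rows:
        if all(row.get(field) for field in required):
            main.append(row)
        else:
            reject.append(row)
    return main, reject

# Sample rows (values are assumed).
rows = [
    {"Name": "Griffith Paving", "ParentId": "", "Phone": "+16105551000", "Fax": "+16105551001"},
    {"Name": "", "ParentId": "", "Phone": "+14155552000", "Fax": "+14155552001"},  # missing Name -> rejected
]
main, reject = split_main_reject(rows)
print(len(main), len(reject))  # prints "1 1"
```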