tSalesforceOutputBulkExec - 6.1

Talend Components Reference Guide


tSalesforceOutputBulkExec Properties

The tSalesforceOutputBulk and tSalesforceBulkExec components are used together: the first outputs the needed bulk file and the second executes the intended actions on that file in Salesforce.com. The tSalesforceOutputBulkExec component combines these two steps in a single component. Keeping the two components separate is useful because it allows transformations to be carried out before the data is loaded.

Component family

Business/Cloud

 

Function

tSalesforceOutputBulkExec executes the intended actions on the .csv bulk data for Salesforce.com.

Purpose

As a dedicated component, tSalesforceOutputBulkExec improves performance when carrying out the intended data operations in Salesforce.com.

Basic settings

Use an existing connection

Select this check box and in the Component List click the relevant connection component to reuse the connection details you already defined.

Note

When a Job contains a parent Job and a child Job, Component List presents only the connection components at the same Job level.

 

Login Type

Two options are available:

Basic: select this option to log in to Salesforce.com by entering your Username/Password on tSalesforceConnection.

OAuth2: select this option to access Salesforce.com by entering your Consumer key/Consumer Secret on tSalesforceConnection. This way, your Username/Password will not be exposed to tSalesforceConnection but extra work is required:

 

Salesforce Webservice URL

Enter the Webservice URL required to connect to the Salesforce database.

 

Salesforce Version

Enter the Salesforce version you are using.

 

Username and Password

Enter your Web service authentication details.

To enter the password, click the [...] button next to the password field, and then in the pop-up dialog box enter the password between double quotes and click OK to save the settings.

 

Consumer Key and Consumer Secret

Enter your OAuth authentication details. Such information is available in the OAuth Settings area of the Connected App that you have created at Salesforce.com.

To enter the consumer secret, click the [...] button next to the consumer secret field, and then in the pop-up dialog box enter the consumer secret between double quotes and click OK to save the settings.

For what a Connected App is, see Connected Apps. For how to create a Connected App, see Defining Remote Access Applications.

 

Callback Host and Callback Port

Enter your OAuth authentication callback URL. This URL (both host and port) is defined during the creation of a Connected App and is shown in the OAuth Settings area of the Connected App.

 

Token File

Enter the token file name. It stores the refresh token that is used to get the access token without authorization.

 

Bulk file path

Directory where the bulk data you need to process is stored.

 

Action

You can do any of the following operations on the data of the Salesforce object:

Insert: insert data.

Update: update data.

Upsert: update and insert data.

Delete: delete data.

 

Upsert Key Column

Specify the key column for the upsert operation.

Available when Upsert is selected from the Action list.
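
The effect of the Upsert action and its key column can be pictured with a minimal in-memory sketch. It only illustrates the semantics (update when the key matches, insert otherwise) and is not how Salesforce.com or the component implements the operation; the function name `upsert` and the field name `Ext_Id__c` are hypothetical.

```python
def upsert(records, rows, key_column):
    """Update records whose key matches an incoming row; insert the others.

    `records` stands in for the existing data of the module and is modified
    in place. Returns a (created, updated) pair of counters.
    """
    by_key = {record[key_column]: record for record in records}
    created = 0
    updated = 0
    for row in rows:
        existing = by_key.get(row[key_column])
        if existing is None:
            # No record carries this key yet: behave like Insert.
            new_record = dict(row)
            records.append(new_record)
            by_key[row[key_column]] = new_record
            created += 1
        else:
            # A record with this key exists: behave like Update.
            existing.update(row)
            updated += 1
    return created, updated
```

With one existing record keyed `A1` and two incoming rows keyed `A1` and `B2`, the first row updates the existing record and the second one is inserted.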

 

Module

Select the relevant module in the list.

Note

If you select the Use Custom module option, you display the Custom Module Name field where you can enter the name of the module you want to connect to.

 

Schema and Edit Schema

A schema is a row description. It defines the number of fields (columns) to be processed and passed on to the next component. The schema is either Built-In or stored remotely in the Repository.

Since version 5.6, both the Built-In mode and the Repository mode are available in any of the Talend solutions.

Click Edit schema to make changes to the schema. If the current schema is of the Repository type, three options are available:

  • View schema: choose this option to view the schema only.

  • Change to built-in property: choose this option to change the schema to Built-in for local changes.

  • Update repository connection: choose this option to change the schema stored in the repository and decide whether to propagate the changes to all the Jobs upon completion. If you just want to propagate the changes to the current Job, you can select No upon completion and choose this schema metadata again in the [Repository Content] window.

Click Sync columns to retrieve the schema from the previous component connected in the Job.

This component offers the advantage of the dynamic schema feature. This allows you to retrieve unknown columns from source files or to copy batches of columns from a source without mapping each column individually. For further information about dynamic schemas, see Talend Studio User Guide.

The dynamic schema feature is designed for retrieving unknown columns of a table and should be used for this purpose only; it is not recommended for creating tables.

Advanced settings

Rows to commit

Specify the number of lines per data batch to be processed.

 

Bytes to commit

Specify the number of bytes per data batch to be processed.
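
As a rough illustration of how a row limit and a byte limit can work together, the sketch below closes a batch as soon as adding the next line would exceed either limit. This is an assumption about the general technique, not the component's actual code; the function name `split_into_batches` is hypothetical, and header-row handling is left out for brevity.

```python
def split_into_batches(lines, max_rows, max_bytes):
    """Group CSV lines into batches bounded by a row count and a byte size.

    A single line larger than max_bytes still gets a batch of its own,
    so no data is dropped.
    """
    batches = []
    current = []
    current_bytes = 0
    for line in lines:
        size = len(line.encode("utf-8"))
        limit_reached = (
            len(current) >= max_rows or current_bytes + size > max_bytes
        )
        if current and limit_reached:
            # Close the current batch before it exceeds either limit.
            batches.append(current)
            current = []
            current_bytes = 0
        current.append(line)
        current_bytes += size
    if current:
        batches.append(current)
    return batches
```

For reference, the Salesforce Bulk API documentation caps a batch at 10,000 records and 10 MB, so values above those limits are unlikely to be accepted.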

 

Concurrency mode

The concurrency mode for the job.

Parallel: process batches in parallel mode.

Serial: process batches in serial mode.

 

Wait time for checking batch state(milliseconds)

Specify the interval, in milliseconds, at which the component checks whether the batches in a Job have been processed. Checking is repeated until all batches are processed.
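
The role of this wait time can be sketched as a simple polling loop. The sketch assumes a caller-supplied function `get_states` (hypothetical) that returns the current state of every batch; `Queued` and `InProgress` are batch states defined by the Salesforce Bulk API, and anything else is treated here as finished.

```python
import time

PENDING_STATES = {"Queued", "InProgress"}


def wait_for_batches(get_states, wait_ms, sleep=time.sleep):
    """Poll batch states until none is pending.

    Returns the final list of states and the number of polls performed.
    `sleep` is injectable so the loop can be exercised without real delays.
    """
    polls = 0
    while True:
        states = get_states()
        polls += 1
        if not any(state in PENDING_STATES for state in states):
            return states, polls
        # Wait before asking again, as configured in milliseconds.
        sleep(wait_ms / 1000.0)
```

A shorter wait time reports completion sooner but sends more status requests to Salesforce.com.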

 

Use Socks Proxy

Select this check box if you want to use a proxy server. In this case, you should fill in the proxy parameters in the Proxy host, Proxy port, Proxy username and Proxy password fields which appear beneath.

 

Ignore NULL fields values

Select this check box to ignore NULL values in Update or Upsert mode.
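
In the Salesforce Bulk API CSV format, an empty value leaves a field unchanged on update, while the literal `#N/A` sets the field to null. The sketch below shows how a NULL value could be serialized under each setting of this check box; that the component follows exactly this convention is an assumption, and the function name `to_csv_value` is hypothetical.

```python
def to_csv_value(value, ignore_null):
    """Serialize one field value for a Bulk API CSV record row."""
    if value is None:
        # Empty string: the field is left untouched by Update or Upsert.
        # "#N/A": the field is explicitly set to null.
        return "" if ignore_null else "#N/A"
    return str(value)
```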

 

Relationship mapping for upsert (for upsert action only)

Click the [+] button to add lines as needed and specify the external ID fields in the input flow, the lookup relationship fields in the upsert module, the lookup module as well as the external id fields in the lookup module.

Additionally, the Polymorphic check box must be selected when and only when polymorphic fields are used for relationship mapping. For details about the polymorphic fields, search polymorphic at http://www.salesforce.com/us/developer/docs/api_asynch/.

Column name of Talend schema: external ID field in the input flow.

Lookup field name: lookup relationship fields in the upsert module.

External id name: external ID field in the lookup module.

Polymorphic: select this check box when and only when polymorphic fields are used for relationship mapping.

Module name: name of the lookup module.

Note

  • Column name of Talend schema refers to the fields in the schema of the component preceding tSalesforceOutputBulkExec. Such columns are intended to match against the external ID fields specified in the External id name column, which are the fields of the lookup module specified in the Module name column.

  • Lookup field name refers to the lookup relationship fields of the module selected from the Module list in the Basic settings view. They are intended to establish relationship with the lookup module specified in the Module name column.

  • For how to define the lookup relationship fields and how to provide their correct names in the Lookup field name field, go to the Salesforce website and launch the Salesforce Data Loader application for proper actions and information.

  • Select the Polymorphic check box only for the polymorphic fields. You get an error if you omit this check box for a polymorphic field. You also get an error if you select it for a field that is not polymorphic.
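
The Salesforce Bulk API identifies a relationship column in the CSV header row as `RelationshipName.IndexedFieldName`, or `ObjectType:RelationshipName.IndexedFieldName` for a polymorphic field. The sketch below builds such a header from the four values of one mapping line. Whether the component assembles its headers exactly this way is an assumption, and the function name `relationship_header` and the field `Ext_Id__c` are hypothetical.

```python
def relationship_header(lookup_field, external_id, module_name, polymorphic):
    """Build the CSV header for one relationship mapping line.

    lookup_field : value of the Lookup field name column
    external_id  : value of the External id name column
    module_name  : value of the Module name column
    polymorphic  : state of the Polymorphic check box
    """
    # Only polymorphic fields carry the object type prefix.
    prefix = f"{module_name}:" if polymorphic else ""
    return f"{prefix}{lookup_field}.{external_id}"
```

For example, a non-polymorphic mapping yields a header such as `Parent.Ext_Id__c`, while a polymorphic one yields a header such as `Lead:Who.Email`.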

 

tStatCatcher Statistics

Select this check box to gather the Job processing metadata at a Job level as well as at each component level.

Global Variables

ERROR_MESSAGE: the error message generated by the component when an error occurs. This is an After variable and it returns a string. This variable functions only if the Die on error check box is cleared, if the component has this check box.

A Flow variable functions during the execution of a component while an After variable functions after the execution of the component.

To fill in a field or expression with a variable, press Ctrl + Space to access the variable list and choose the variable to use from it.

For further information about variables, see Talend Studio User Guide.

Usage

This component is mainly used when no particular transformation is required on the data to be loaded into Salesforce.com.

Limitation

The bulk data to be processed in Salesforce.com must be in .csv format.

Scenario: Inserting bulk data into your Salesforce.com

This scenario describes a four-component Job that submits bulk data to Salesforce.com, executes your intended actions on the data, and finally displays the Job execution results for your reference.

Before starting this scenario, you need to prepare the input file which offers the data to be processed by the Job. In this use case, this file is sforcebulk.txt, containing some customer information.

Then to create and execute this Job, operate as follows:

Setting up the Job

  1. Drop a tFileInputDelimited component, a tSalesforceOutputBulkExec component, and two tLogRow components from the Palette onto the workspace of your studio.

  2. Use a Row > Main connection to connect tFileInputDelimited to tSalesforceOutputBulkExec.

  3. Use a Row > Main connection and a Row > Reject connection to connect tSalesforceOutputBulkExec to the two tLogRow components respectively.

Setting the input data

  1. Double-click tFileInputDelimited to display its Basic settings view and define the component properties.

  2. From the Property Type list, select Repository if you have already stored the delimited file metadata in the Metadata node of the Repository tree view. The property fields that follow are automatically filled in. If you have not defined the file metadata in the Repository, select Built-in from the Property Type list and fill in the details manually.

    For more information about how to create the delimited file metadata, see Talend Studio User Guide.

  3. Next to the File name/Stream field, click the [...] button to browse to the input file you prepared for the scenario, for example, sforcebulk.txt.

  4. From the Schema list, select Repository and then click the three-dot button to open a dialog box where you can select the repository schema you want to use for this component. If you have not defined your schema locally in the metadata, select Built-in from the Schema list and then click the three-dot button next to the Edit schema field to open the dialog box where you can set the schema manually. In this scenario, the schema is made of four columns: Name, ParentId, Phone and Fax.

  5. Set the other fields, such as Row Separator and Field Separator, according to the input file used by the Job.
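
To make the role of the schema and the separators concrete, here is a minimal sketch that reads delimited text with the four columns of this scenario and rewrites it as comma-separated CSV with a header row. It assumes a semicolon field separator and an input without a header line; the function name `delimited_to_bulk_csv` and the sample values used to try it are hypothetical, and the sketch does not reproduce what tFileInputDelimited does internally.

```python
import csv
import io

# Columns of the scenario schema, in file order.
SCHEMA = ["Name", "ParentId", "Phone", "Fax"]


def delimited_to_bulk_csv(text, field_separator=";"):
    """Convert delimited text into CSV text headed by the schema columns."""
    reader = csv.reader(io.StringIO(text), delimiter=field_separator)
    output = io.StringIO()
    writer = csv.writer(output, lineterminator="\n")
    writer.writerow(SCHEMA)
    for fields in reader:
        if not fields:
            continue  # skip blank lines
        if len(fields) != len(SCHEMA):
            raise ValueError(f"expected {len(SCHEMA)} fields, got {len(fields)}")
        writer.writerow(fields)
    return output.getvalue()
```

Values that contain a comma are quoted automatically by the `csv` writer, which keeps the resulting file valid when the source separator differs from the comma.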

Setting up the connection to the Salesforce server

  1. Double-click tSalesforceOutputBulkExec to display its Basic settings view and define the component properties.

  2. In the Salesforce Webservice URL field, use the default URL of the Salesforce Web service or enter the URL you want to access.

  3. In the Username and Password fields, enter your username and password for the Web service.

  4. In the Bulk file path field, browse to the directory where you store the bulk .csv data to be processed.

    Note

    The bulk file here to be processed must be in .csv format.

  5. From the Action list, select the action you want to carry out on the prepared bulk data. In this use case, select Insert.

  6. From the Module list, select the object you want to access, Account in this example.

  7. From the Schema list, select Repository and then click the three-dot button to open a dialog box where you can select the repository schema you want to use for this component. If you have not defined your schema locally in the metadata, select Built-in from the Schema list and then click the three-dot button next to the Edit schema field to open the dialog box where you can set the schema manually. In this example, edit it conforming to the schema defined previously.

Job execution

  1. Double-click tLogRow_1 to display its Basic settings view and define the component properties.

  2. Click Sync columns to retrieve the schema from the preceding component.

  3. Select Table mode to display the execution result.

  4. Do the same with tLogRow_2.

  5. Press CTRL+S to save your Job and press F6 to execute it.

    On the console of the Run view, you can check the execution result.

    In the tLogRow_1 table, you can read the data inserted into your Salesforce.com.

    In the tLogRow_2 table, you can read the data that was rejected because it is incompatible with the Account object you accessed.

    If you want to transform the input data before submitting it, you need to use tSalesforceOutputBulk and tSalesforceBulkExec together. For further information on the use of the two components, see Scenario: Inserting transformed bulk data into your Salesforce.com.
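
The split between inserted and rejected rows seen in the two tLogRow tables can be sketched from the batch result format of the Salesforce Bulk API, which returns one CSV line per submitted row with the columns Id, Success, Created and Error, in the same order as the request. How the component maps these results onto its Main and Reject flows is an assumption here, and the function name `split_results` is hypothetical.

```python
import csv
import io


def split_results(input_rows, result_csv):
    """Pair each submitted row with its result line and split by outcome.

    Accepted rows gain the record Id; rejected rows gain the Error text.
    """
    results = csv.DictReader(io.StringIO(result_csv))
    accepted = []
    rejected = []
    for row, result in zip(input_rows, results):
        if result["Success"].lower() == "true":
            accepted.append({**row, "Id": result["Id"]})
        else:
            rejected.append({**row, "Error": result["Error"]})
    return accepted, rejected
```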