tSalesforceOutputBulk - 6.1

Talend Components Reference Guide

EnrichVersion
6.1
EnrichProdName
Talend Big Data
Talend Big Data Platform
Talend Data Fabric
Talend Data Integration
Talend Data Management Platform
Talend Data Services Platform
Talend ESB
Talend MDM Platform
Talend Open Studio for Big Data
Talend Open Studio for Data Integration
Talend Open Studio for Data Quality
Talend Open Studio for ESB
Talend Open Studio for MDM
Talend Real-Time Big Data Platform
task
Data Governance
Data Quality and Preparation
Design and Development
EnrichPlatform
Talend Studio

tSalesforceOutputBulk Properties

tSalesforceOutputBulk and tSalesforceBulkExec are used together: the first generates the needed file and the second executes the intended actions on that file in Salesforce.com. These two steps compose the tSalesforceOutputBulkExec component, detailed in a separate section. The advantage of having two separate components is that transformations can be carried out on the data before it is loaded.

Component family

Business/Cloud

 

Function

tSalesforceOutputBulk generates files in a format suitable for bulk processing.

Purpose

Prepares the file to be processed by tSalesforceBulkExec for execution in Salesforce.com.

Basic settings

File Name

Type in the path to the file to be generated.

 

Append

Select this check box to write new data at the end of the existing data. Otherwise, the existing data will be overwritten.

 

Schema and Edit Schema

A schema is a row description. It defines the number of fields (columns) to be processed and passed on to the next component. The schema is either Built-In or stored remotely in the Repository.

Since version 5.6, both the Built-In mode and the Repository mode are available in any of the Talend solutions.

Click Edit schema to make changes to the schema. If the current schema is of the Repository type, three options are available:

  • View schema: choose this option to view the schema only.

  • Change to built-in property: choose this option to change the schema to Built-in for local changes.

  • Update repository connection: choose this option to change the schema stored in the repository and decide whether to propagate the changes to all the Jobs upon completion. If you just want to propagate the changes to the current Job, you can select No upon completion and choose this schema metadata again in the [Repository Content] window.

Click Sync columns to retrieve the schema from the previous component connected in the Job.

This component offers the advantage of the dynamic schema feature. This allows you to retrieve unknown columns from source files or to copy batches of columns from a source without mapping each column individually. For further information about dynamic schemas, see Talend Studio User Guide.

This dynamic schema feature is designed for the purpose of retrieving unknown columns of a table and is recommended to be used for this purpose only; it is not recommended for the use of creating tables.

 

Ignore NULL fields values

Select this check box to ignore NULL values in Update or Upsert mode.

Advanced settings

Relationship mapping for upsert (for upsert action only)

Click the [+] button to add lines as needed and specify the external ID fields in the input flow, the lookup relationship fields in the upsert module, the lookup module, as well as the external ID fields in the lookup module.

Additionally, the Polymorphic check box must be selected when and only when polymorphic fields are used for relationship mapping. For details about polymorphic fields, search for polymorphic at http://www.salesforce.com/us/developer/docs/api_asynch/.

Column name of Talend schema: external ID field in the input flow.

Lookup field name: lookup relationship fields in the upsert module.

External id name: external ID field in the lookup module.

Polymorphic: select this check box when and only when polymorphic fields are used for relationship mapping.

Module name: name of the lookup module.

Note

  • Column name of Talend schema refers to the fields in the schema of the component preceding tSalesforceOutputBulk. Such columns are intended to match the external ID fields specified in the External id name column, which are the fields of the lookup module specified in the Module name column.

  • Lookup field name refers to the lookup relationship fields of the module selected from the Module list in the Basic settings view. They are intended to establish a relationship with the lookup module specified in the Module name column.

  • For how to define the lookup relationship fields and how to provide their correct names in the Lookup field name field, go to the Salesforce website and launch the Salesforce Data Loader application for the relevant actions and information.

  • Select the Polymorphic check box only for the polymorphic fields. You get an error if you omit this check box for a polymorphic field. You also get an error if you select it for a field that is not polymorphic.
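The header naming the mapping table above produces can be sketched as follows. This is a minimal illustration, assuming the CSV header conventions described in the Salesforce Bulk API documentation (Relationship.ExternalIdField for ordinary lookups, Relationship.ObjectType:ExternalIdField for polymorphic ones); the field and module names used here are illustrative, not taken from this guide:

```java
// Sketch: composing a bulk CSV header cell for a relationship-based upsert.
// Format assumed from the Salesforce Bulk API guide; names are hypothetical.
public class RelationshipHeader {
    // Non-polymorphic lookup: Lookup field name + "." + External id name
    static String header(String lookupField, String externalId) {
        return lookupField + "." + externalId;
    }

    // Polymorphic lookup: Lookup field name + "." + Module name + ":" + External id name
    static String polymorphicHeader(String lookupField, String module, String externalId) {
        return lookupField + "." + module + ":" + externalId;
    }

    public static void main(String[] args) {
        System.out.println(header("Parent", "External_Id__c"));             // Parent.External_Id__c
        System.out.println(polymorphicHeader("Owner", "User", "My_Id__c")); // Owner.User:My_Id__c
    }
}
```

This mirrors the note above: the Module name column only contributes to the header when the Polymorphic check box is selected.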

 

tStatCatcher Statistics

Select this check box to gather the Job processing metadata at a Job level as well as at each component level.

Global Variables

NB_LINE: the number of rows read by an input component or transferred to an output component. This is an After variable and it returns an integer.

ERROR_MESSAGE: the error message generated by the component when an error occurs. This is an After variable and it returns a string. This variable functions only if the Die on error check box is cleared, if the component has this check box.

A Flow variable functions during the execution of a component while an After variable functions after the execution of the component.

To fill in a field or expression with a variable, press Ctrl + Space to access the variable list and choose the variable to use from it.

For further information about variables, see Talend Studio User Guide.
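In the Java code a Talend Job generates, After variables such as NB_LINE are read from the Job's globalMap under a key of the form <component>_<id>_<VARIABLE>. The sketch below simulates that lookup with a plain HashMap standing in for the Job's real globalMap; the component id and value are illustrative:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: reading an After variable the way Talend-generated code does.
// The HashMap stands in for the Job's globalMap; the key follows the
// <component>_<id>_<VARIABLE> convention.
public class GlobalVarDemo {
    public static void main(String[] args) {
        Map<String, Object> globalMap = new HashMap<>();
        // Set by the component after it finishes running (value is illustrative).
        globalMap.put("tSalesforceOutputBulk_1_NB_LINE", 42);

        Integer nbLine = (Integer) globalMap.get("tSalesforceOutputBulk_1_NB_LINE");
        System.out.println("Rows written: " + nbLine);
    }
}
```

In the Studio, pressing Ctrl + Space inserts the equivalent expression for you, so you rarely need to type the key by hand.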

Usage

This component is intended to be used along with the tSalesforceBulkExec component. Used together, they improve performance when feeding or modifying data in Salesforce.com.

Log4j

If you are using a subscription-based version of the Studio, the activity of this component can be logged using the log4j feature. For more information on this feature, see Talend Studio User Guide.

For more information on the log4j logging levels, see the Apache documentation at http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/Level.html.

Limitation

Due to license incompatibility, one or more JARs required to use this component are not provided. You can install the missing JARs for this particular component by clicking the Install button on the Component tab view. You can also find out and add all missing JARs easily on the Modules tab in the Integration perspective of your studio. For details, see https://help.talend.com/display/KB/How+to+install+external+modules+in+the+Talend+products or the section describing how to configure the Studio in the Talend Installation Guide.

Scenario: Inserting transformed bulk data into your Salesforce.com

This scenario describes a six-component Job that transforms .csv data into a format suitable for bulk processing, loads it into Salesforce.com and then displays the Job execution results in the console.

This Job is composed of two steps: preparing data by transformation and processing the transformed data.

Before starting this scenario, you need to prepare the input file that provides the data to be processed by the Job. In this use case, this file is sforcebulk.txt, containing some customer information.

Then, to create and execute this Job, proceed as follows:

Setting up the Job

  1. Drop tFileInputDelimited, tMap, tSalesforceOutputBulk, tSalesforceBulkExec and two tLogRow components from the Palette onto the workspace of your studio.

  2. Use a Row > Main connection to connect tFileInputDelimited to tMap, and a Row > out1 connection to connect tMap to tSalesforceOutputBulk.

  3. Use a Row > Main connection and a Row > Reject connection to connect tSalesforceBulkExec respectively to the two tLogRow components.

  4. Use a Trigger > OnSubjobOk connection to connect tFileInputDelimited to tSalesforceBulkExec.

Configuring the input component

  1. Double-click tFileInputDelimited to display its Basic settings view and define the component properties.

  2. From the Property Type list, select Repository if you have already stored the delimited file metadata in the Metadata node of the Repository tree view. The property fields that follow are automatically filled in. If you have not defined the file metadata in the Repository, select Built-in from the Property Type list and fill in the details manually.

    For more information about how to create the delimited file metadata, see Talend Studio User Guide.

  3. Next to the File name/Stream field, click the [...] button to browse to the input file you prepared for the scenario, for example, sforcebulk.txt.

  4. From the Schema list, select Repository and then click the three-dot button to open a dialog box where you can select the repository schema you want to use for this component. If you have not defined your schema locally in the metadata, select Built-in from the Schema list and then click the three-dot button next to the Edit schema field to open the dialog box to set the schema manually. In this scenario, the schema is made of four columns: Name, ParentId, Phone and Fax.

  5. Set the other fields, such as Row Separator and Field Separator, according to the input file to be used by the Job.

Setting up the mapping

  1. Double-click the tMap component to open its editor and set the transformation.

  2. Drop all columns from the input table to the output table.

  3. Append .toUpperCase() to the expression of the Name column in the output table.

  4. Click OK to validate the transformation.
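The tMap expression edited in the steps above is plain Java. Appending .toUpperCase() to the Name column is equivalent to the per-row transformation sketched below (the row values are illustrative; tMap applies the same expression to each incoming row):

```java
// Sketch: the per-row effect of the tMap expression row1.Name.toUpperCase().
// Input values are illustrative sample data, not from the guide.
public class MapDemo {
    public static void main(String[] args) {
        String[] names = {"Talend", "salesforce inc"};
        for (String name : names) {
            // What tMap emits for the Name column of the out1 flow.
            String out1Name = name.toUpperCase();
            System.out.println(out1Name);
        }
    }
}
```

Any other Java String method (trim(), replace(), and so on) could be chained in the same expression field in the same way.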

Defining the output path

  1. Double-click tSalesforceOutputBulk to display its Basic settings view and define the component properties.

  2. In the File Name field, type in or browse to the path of the file that will hold the generated .csv data for bulk processing.

  3. Click Sync columns to import the schema from its preceding component.
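The File Name and Append settings configured above behave like the standard Java file-writing idiom sketched below: with append set to true, new rows land after the existing content; with it cleared, the file is overwritten. The file path and row values are illustrative:

```java
import java.io.FileWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Sketch: the effect of the Append check box on the generated bulk file.
// Paths and row values are illustrative sample data.
public class AppendDemo {
    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("sforcebulk", ".csv");
        try (FileWriter w = new FileWriter(file.toFile(), false)) { // overwrite
            w.write("Name,ParentId,Phone,Fax\n");
        }
        try (FileWriter w = new FileWriter(file.toFile(), true)) {  // append
            w.write("TALEND,001xx000003DGb2AAG,555-0100,555-0101\n");
        }
        // Both the header and the appended row are now in the file.
        System.out.println(Files.readAllLines(file));
    }
}
```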

Setting up the connection to the Salesforce server

  1. Double-click tSalesforceBulkExec to display its Basic settings view and define the component properties.

  2. Use the default URL of the Salesforce Web service or enter the URL you want to access.

  3. In the Username and Password fields, enter your username and password for the Web service.

  4. In the Bulk file path field, browse to the .csv file generated by tSalesforceOutputBulk.

  5. From the Action list, select the action you want to carry out on the prepared bulk data. In this use case, insert.

  6. From the Module list, select the object you want to access, Account in this example.

  7. From the Schema list, select Repository and then click the three-dot button to open a dialog box where you can select the repository schema you want to use for this component. If you have not defined your schema locally in the metadata, select Built-in from the Schema list and then click the three-dot button next to the Edit schema field to open the dialog box to set the schema manually. In this example, edit it to conform to the schema defined previously.

Configuring the output component

  1. Double-click tLogRow_1 to display its Basic settings view and define the component properties.

  2. Click Sync columns to retrieve the schema from the preceding component.

  3. Select Table mode to display the execution result.

  4. Do the same with tLogRow_2.

Job execution

  1. Press CTRL+S to save your Job.

  2. Press F6 to execute it.

    You can check the execution result on the Run console.

    In the tLogRow_1 table, you can read the data inserted into your Salesforce.com.

    In the tLogRow_2 table, you can read the data rejected due to incompatibility with the Account objects you have accessed.

    All the customer names are written in upper case.