Component code generation model

Version: 6.4, 6.3, 6.2, 6.1
Product: Talend Open Studio for ESB, Talend Data Fabric, Talend ESB, Talend Big Data Platform, Talend Open Studio for MDM, Talend Big Data, Talend Open Studio for Data Integration, Talend Real-Time Big Data Platform, Talend Data Integration, Talend MDM Platform, Talend Open Studio for Big Data, Talend Data Services Platform, Talend Data Management Platform
Task: Design and Development > Designing Components
Platform: Talend Studio

This page presents the different component code generation models available in Talend Studio.

Talend Studio is a Java code generator: each Job is translated automatically into a Java class. In the generated code, each component of the Job is divided into three parts: begin, main and end. The begin and end parts are executed only once in a Job, while the main part is executed once for each row processed by the component. The following sections describe how these parts are ordered for Job designs based on different connection types.

Row

The Row type connectors include Main, Lookup, Filter, Reject, Uniques and Duplicates and are the most commonly used links in Jobs. They are used to transfer data from one component to another component, acting like a bridge to let data flow through the Job.

Take a very common Job design: an input component (tFileInputDelimited, abbreviated tFID below), a tMap and an output component (tFileOutputXML, abbreviated tFOX below), connected by Row > Main links.

The following code generation sequence for this Job design shows how the three parts of the three components are organized and how they work together:

tFOX begin (initializes variables, such as filepath, opens an output stream)
tMap begin (initializes some constants)
tFID begin (opens an input stream, fetches all data from the data source and caches it in memory, then starts a loop to deliver the rows one by one)
--------------------
tFID main  (instantiates main row, copies the data from row1 to main)
tMap main  (runs the complex Java code generated for the mapper)
tFOX main  (writes an XML line, encapsulating each field)
--------------------
tFID end   (stops loop, closes the input stream)
tMap end   (does nothing)
tFOX end   (writes last XML line, closes the output stream)

For a Row connection, the rules of code generation order are:

  • Each component generates three parts of code: a begin part, a main part and an end part.
  • The begin parts are generated in the reverse order of the Job design, starting with the last component.
  • The main parts are generated in the same order as the Job design.
  • The end parts are generated in the same order as the Job design.
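The rules above can be sketched in plain Java. This is only an illustration of the ordering, not code produced by Talend Studio: the class RowFlowSketch and its run method are hypothetical names, and each part is reduced to a line in a trace so that the sequence can be inspected.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the Row ordering; not actual Talend-generated code.
final class RowFlowSketch {

    // Returns the sequence of parts executed for the given input rows.
    static List<String> run(List<String> inputRows) {
        List<String> trace = new ArrayList<>();

        // begin parts: reverse order of the Job design (output component first)
        trace.add("tFOX begin");
        trace.add("tMap begin");
        trace.add("tFID begin");

        // The loop opened by the begin part of the input component
        for (String row : inputRows) {
            // main parts: same order as the Job design, once per row
            trace.add("tFID main " + row);
            trace.add("tMap main " + row);
            trace.add("tFOX main " + row);
        }

        // end parts: same order as the Job design
        trace.add("tFID end");
        trace.add("tMap end");
        trace.add("tFOX end");
        return trace;
    }

    public static void main(String[] args) {
        run(List.of("a", "b")).forEach(System.out::println);
    }
}
```

Running the sketch with two rows produces three begin lines, six main lines (three per row) and three end lines, which mirrors the sequence listed above.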

Iterate

There are usually two types of Job design using Iterate connections.

The first type is a Job in which one of the following components is the source of an Iterate connection:

  • tFileList
  • tLoop
  • tForEach
  • tWaitForFile (or tWaitForSocket, tWaitForSqlData)
  • tMysqlTableList (or tOracleTableList, tMsSQLTableList, etc)

This example Job contains a tFileList (abbreviated tFL below) that uses an Iterate connection to iterate over each file in the selected directory.

The code generation order of the components for this kind of Job design is as follows:

tFL  begin (opens a loop to iterate each file)
tFL  main  (does nothing)
--------------------
tFOX begin
tMap begin
tFID begin
--------------------
tFID main
tMap main
tFOX main
--------------------
tFID end
tMap end
tFOX end
--------------------
tFL  end   (closes the loop)

The sequence above shows that the begin part of the component with an outgoing Iterate connection is generated first and executed to start a loop over the items. At the end of the process, the end part of that component is executed to close the loop.
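As a rough illustration of this nesting, the following Java sketch wraps a Row flow in the loop opened by the iterating component. It is not code produced by Talend Studio: IterateSketch, run and the rowsPerFile parameter are hypothetical, and placing the tFL main part inside the loop body is an assumption based on the listing above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the Iterate ordering; not actual Talend-generated code.
final class IterateSketch {

    // Returns the sequence of parts executed for the given files.
    static List<String> run(List<String> files, int rowsPerFile) {
        List<String> trace = new ArrayList<>();

        trace.add("tFL begin");               // opens the loop over the files
        for (String file : files) {
            trace.add("tFL main " + file);    // does nothing (assumed to sit in the loop)

            // The whole Row flow sits inside the loop body.
            trace.add("tFOX begin");
            trace.add("tMap begin");
            trace.add("tFID begin " + file);
            for (int row = 0; row < rowsPerFile; row++) {
                trace.add("tFID main");
                trace.add("tMap main");
                trace.add("tFOX main");
            }
            trace.add("tFID end");
            trace.add("tMap end");
            trace.add("tFOX end");
        }
        trace.add("tFL end");                 // closes the loop
        return trace;
    }

    public static void main(String[] args) {
        run(List.of("a.csv", "b.csv"), 1).forEach(System.out::println);
    }
}
```

In this sketch the begin, main and end parts of the Row flow are repeated for every file, while the begin and end parts of the iterating component each appear once.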

Another type of Job using an Iterate connection is a Job containing a tFlowToIterate component, like the example Job below:

The components' code generation order for this kind of Job design is as below:

Merge

Another code generation order applies when the Job has two or more input branches joined in a merge model with a tUnite component, as in the example Job below:

The components' code generation order for this kind of Job design is as below:

The above figure shows that there is always a main loop flow for each input branch and that the main part of the tUnite component is inside each loop of the input branches.
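The statement above can be sketched in Java as follows. This is a simplified illustration, not code produced by Talend Studio: MergeSketch and run are hypothetical names, the input branches are modeled as in-memory lists, and the placement of the begin and end parts is an assumption, since only the position of the tUnite main part is described here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the merge model; not actual Talend-generated code.
final class MergeSketch {

    // Merges the rows of all input branches, one branch after another.
    static List<String> run(List<List<String>> branches) {
        List<String> merged = new ArrayList<>();

        // tUnite begin (assumed placement)
        for (List<String> branch : branches) {
            // input begin: each branch opens its own main loop
            for (String row : branch) {
                // input main: reads the current row of this branch
                // tUnite main: runs inside the loop of every branch
                merged.add(row);
            }
            // input end: closes the loop of this branch
        }
        // tUnite end (assumed placement)
        return merged;
    }

    public static void main(String[] args) {
        System.out.println(run(List.of(List.of("a1", "a2"), List.of("b1"))));
    }
}
```

The point of the sketch is that the same merging step appears in the body of each branch loop, so every row of every branch passes through it exactly once.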