There are various types of connections which define either the data to be processed, the data output, or the Job logical sequence.
Right-click a component on the design workspace to display a contextual menu that lists all available connections for the selected component.
The sections below describe all available connection types.
A Row connection handles the actual data. The Row connections can be Main, Lookup, Reject, Output, Uniques/Duplicates, or Combine according to the nature of the flow processed.
This type of row connection is the most commonly used connection. It passes on data flows from one component to the other, iterating on each row and reading input data according to the component properties setting (schema).
Data transferred through main rows are characterized by a schema definition which describes the data structure in the input file.
You cannot connect two Input components together using a Row > Main connection. Only one incoming Row connection is possible per component. You cannot link the same target component twice using a main Row connection: the second Row connection will be called Lookup.
To connect two components using a Main connection, right-click the input component and select Row > Main on the connection list.
Alternatively, you can click the component to highlight it, then right-click it or click the O icon that appears on the side of it and drag the cursor towards the destination component. This automatically creates a Row > Main type of connection.
For information on using multiple Row connections, see Multiple Input/Output.
This row connection connects a sub-flow component to a main flow component (which should be allowed to receive more than one incoming flow). This connection is used only in the case of multiple input flows.
A Lookup row can be changed into a main row at any time (and vice versa, a main row can be changed into a lookup row). To do so, right-click the row to be changed, and on the pop-up menu, click Set this connection as Main.
Related topic: Multiple Input/Output.
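The relationship between a Main row and a Lookup row can be pictured as a keyed join: the lookup flow is loaded first, then the main flow is read row by row and matched against it. The following is a minimal Java sketch of that idea only. It is not the code generated by the Studio, and the class name, method name, and sample columns are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: not the code generated by Talend Studio.
class LookupJoinSketch {

    // Main rows are {customerId, amount}; lookup rows are {customerId, name}.
    // Main rows without a match keep a null name (left outer join behavior).
    static List<String[]> join(List<String[]> mainRows, List<String[]> lookupRows) {
        // Load the lookup (secondary) flow, keyed by its first column.
        Map<String, String> nameById = new HashMap<>();
        for (String[] row : lookupRows) {
            nameById.put(row[0], row[1]);
        }
        // Iterate on the main flow, one row at a time.
        List<String[]> out = new ArrayList<>();
        for (String[] row : mainRows) {
            out.add(new String[] { row[0], row[1], nameById.get(row[0]) });
        }
        return out;
    }

    public static void main(String[] args) {
        List<String[]> mainRows = Arrays.asList(
                new String[] { "1", "10.0" }, new String[] { "3", "7.5" });
        List<String[]> lookupRows = Arrays.asList(
                new String[] { "1", "Alice" }, new String[] { "2", "Bob" });
        for (String[] row : join(mainRows, lookupRows)) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

In this sketch the lookup flow is held entirely in memory, which is one reason a component must explicitly allow more than one incoming flow.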
This row connection specifically connects a tFilterRow component to an output component. It gathers the data matching the filtering criteria. This component also offers a Reject connection to fetch the non-matching data flow.
This row connection connects a processing component to an output component. It gathers the data that does NOT match the filter or is not valid for the expected output. This connection allows you to track the data that could not be processed for any reason (wrong type, undefined null value, etc.). On some components, this connection is enabled when the Die on error option is deactivated. For more information, refer to the relevant component properties available in Talend Components Reference Guide.
This row connection connects a tMap component to an output component. This connection is enabled when you clear the Die on error check box in the tMap editor and it gathers data that could not be processed (wrong type, undefined null value, unparseable dates, etc.).
Related topic: Handling errors.
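The effect of routing bad rows to a Reject connection instead of stopping the Job can be sketched as follows. This is a hedged illustration in plain Java, not the code generated for any component. The class name, the field names, and the choice of integer parsing as the "processing" step are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: splits an incoming flow into a main flow and a reject flow.
class RejectFlowSketch {
    final List<Integer> main = new ArrayList<>();   // rows processed successfully
    final List<String> rejects = new ArrayList<>(); // rows that could not be processed

    static RejectFlowSketch process(List<String> rawValues) {
        RejectFlowSketch result = new RejectFlowSketch();
        for (String raw : rawValues) {
            if (raw == null) {
                // Undefined null value: reject the row instead of failing.
                result.rejects.add("<null>");
                continue;
            }
            try {
                result.main.add(Integer.parseInt(raw.trim()));
            } catch (NumberFormatException e) {
                // Wrong type: keep the original value so it can be analyzed later.
                result.rejects.add(raw);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        RejectFlowSketch r = process(Arrays.asList("1", "x", null, " 42 "));
        System.out.println("main=" + r.main + " rejects=" + r.rejects);
    }
}
```

With a "die on error" behavior, the first bad value would instead end the run, and the remaining rows would never be read.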
This row connection connects a tMap component to one or several output components. As the Job output can be multiple, you get prompted to give a name for each output row created.
The system also remembers deleted output connection names (and properties if they were defined). This way, you do not have to fill in again property data in case you want to reuse them.
Related topic: Multiple Input/Output.
These row connections connect a tUniqRow component to output components.
The Uniques connection gathers the rows that are found first in the incoming flow. This flow of unique data is directed to the relevant output component or else to another processing subjob.
The Duplicates connection gathers the possible duplicates of the first encountered rows. This reject flow is directed to the relevant output component, for analysis for example.
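The first-occurrence rule described above can be sketched in a few lines of Java. This is an illustration of the principle only, assuming each row is reduced to a single string key. The class and field names are hypothetical and this is not the tUniqRow implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative only: first occurrence goes to "uniques", later ones to "duplicates".
class UniqRowSketch {
    final List<String> uniques = new ArrayList<>();
    final List<String> duplicates = new ArrayList<>();

    static UniqRowSketch split(List<String> keys) {
        UniqRowSketch result = new UniqRowSketch();
        Set<String> seen = new HashSet<>();
        for (String key : keys) {
            // Set.add returns false when the key was already encountered.
            if (seen.add(key)) {
                result.uniques.add(key);
            } else {
                result.duplicates.add(key);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        UniqRowSketch r = split(Arrays.asList("a", "b", "a", "c", "b", "a"));
        System.out.println("uniques=" + r.uniques + " duplicates=" + r.duplicates);
    }
}
```

Note that the outcome depends on the order of the incoming flow: whichever row arrives first is the one kept as unique.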
Some components help handle data through multiple inputs and/or multiple outputs. These are often processing-type components such as the tMap.
If this requires a join or some transformation in one flow, you want to use the tMap component, which is dedicated to this use.
For further information regarding data mapping, see Mapping data flows.
For properties regarding the tMap component as well as use case scenarios, see Talend Components Reference Guide.
The Iterate connection can be used to loop on files contained in a directory, on rows contained in a file or on DB entries.
A component can be the target of only one Iterate connection. The Iterate connection is mainly intended to be connected to the start component of a flow (in a subjob).
Some components such as the tFileList component are meant to be connected through an iterate connection with the next component. For how to set an Iterate connection, see Iterate connection settings.
The name of the Iterate connection is read-only unlike other types of connections.
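Conceptually, an Iterate connection runs the downstream flow once per item found by the source component. The sketch below shows this for files in a directory, using only standard Java. The class name, the method name, and the glob parameter are hypothetical, and this is not how tFileList is implemented.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Consumer;

// Illustrative only: runs a "subjob" once for every matching file.
class IterateSketch {

    // Returns the number of iterations performed.
    static int iterateOnFiles(Path directory, String glob, Consumer<Path> subjob) {
        List<Path> files = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(directory, glob)) {
            for (Path file : stream) {
                files.add(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Directory listing order is unspecified, so sort for repeatable runs.
        Collections.sort(files);
        for (Path file : files) {
            subjob.accept(file); // one execution of the downstream flow per file
        }
        return files.size();
    }

    public static void main(String[] args) {
        iterateOnFiles(Paths.get("."), "*", file -> System.out.println("Processing " + file));
    }
}
```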
Note that globalMap is not thread-safe. Be cautious when using globalMap.put("key", value) and globalMap.get("key") to create your own global variables and then retrieve their values in your Jobs, especially after an Iterate connection with the parallel execution option enabled.
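One way to reason about this risk: if several parallel iterations write to the same key of a shared map, an iteration may read a value written by another one. The sketch below uses a stand-in map, not the actual globalMap object, and shows one possible precaution, which is to give each iteration its own key and to use a concurrent map. All names are hypothetical, and whether this pattern suits a given Job is an assumption to verify.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Illustrative only: "shared" is a stand-in for a map shared between parallel iterations.
class SharedMapSketch {

    static Map<String, Object> runParallel(int iterations) {
        final Map<String, Object> shared = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (int i = 0; i < iterations; i++) {
            final int index = i;
            // Each iteration writes under its own key, so no iteration can
            // overwrite or pick up a value that belongs to another one.
            pool.execute(() -> {
                shared.put("currentFile_" + index, "file_" + index + ".csv");
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("parallel iterations did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return shared;
    }

    public static void main(String[] args) {
        System.out.println(runParallel(8).size() + " entries written");
    }
}
```

Had every iteration written to a single key such as "currentFile", the value read back by one iteration would depend on thread timing.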
Trigger connections define the processing sequence, so no data is handled through these connections.
The connection in use creates a dependency between Jobs or subjobs, which are then triggered one after the other according to the nature of the trigger.
Trigger connections fall into two categories:
subjob triggers: On Subjob Ok, On Subjob Error and Run if,
component triggers: On Component Ok, On Component Error and Run if.
OnSubjobOK (previously Then Run): This connection is used to trigger the next subjob on the condition that the main subjob completed without error. This connection is to be used only from the start component of the Job.
These connections are used to orchestrate the subjobs forming the Job or to easily troubleshoot and handle unexpected errors.
OnSubjobError: This connection is used to trigger the next subjob in case the first (main) subjob does not complete correctly. This "on error" subjob helps flag the bottleneck or handle the error if possible.
Related topic: How to define the start component.
OnComponentOK and OnComponentError are component triggers. They can be used with any source component on the subjob.
OnComponentOK will only trigger the target component once the execution of the source component is complete without error. Its main use could be to trigger a notification subjob for example.
OnComponentError will trigger the subjob or component as soon as an error is encountered in the primary Job.
Run if triggers a subjob or component in case the condition defined is met. For further information about Run if, see Run if connection settings.
For how to set a trigger condition, see Trigger connection settings.
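The sequencing that the subjob triggers express can be pictured with ordinary Java control flow. The sketch below is a simplified model, not generated Job code: the class, the interface, and the trace strings are hypothetical, and evaluating the Run if condition only after a successful main subjob is a simplification made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BooleanSupplier;

// Illustrative only: models trigger sequencing; no data flows through triggers.
class TriggerSketch {

    interface Subjob {
        void run() throws Exception;
    }

    // Returns the list of triggers that fired, in order.
    static List<String> runJob(Subjob mainSubjob, BooleanSupplier runIfCondition) {
        List<String> fired = new ArrayList<>();
        try {
            mainSubjob.run();
            // The main subjob completed without error.
            fired.add("OnSubjobOk");
            if (runIfCondition.getAsBoolean()) {
                fired.add("RunIf");
            }
        } catch (Exception e) {
            // The main subjob did not complete correctly.
            fired.add("OnSubjobError");
        }
        return fired;
    }

    public static void main(String[] args) {
        System.out.println(runJob(() -> { }, () -> true));
        System.out.println(runJob(() -> { throw new IllegalStateException("boom"); }, () -> true));
    }
}
```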
It is possible to add checkpoints to certain trigger connections in order to be able to recover the execution of a Job from the last checkpoint previous to the error. For more information, see How to set checkpoints on trigger connections.
The Link connection can only be used with ELT components. These connections transfer table schema information to the ELT mapper component in order to be used in specific DB query statements.
Related topics: ELT components in Talend Components Reference Guide.
The Link connection therefore does not handle actual data but only the metadata regarding the table to be operated on.
When right-clicking the ELT component to be connected, select Link > New Output.
Be aware that the name you provide to the connection must reflect the actual table name.
In fact, the connection name will be used in the SQL statement generated through the ELT Mapper, therefore the same name should never be used twice.
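To see why the connection name matters, consider a simplified statement builder that places each Link connection name directly in the FROM clause. This is a hypothetical sketch: the actual SQL produced by the ELT Mapper is not shown in this section, and the class name, method name, and statement shape are assumptions.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative only: connection names are used verbatim as table names.
class LinkNameSketch {

    static String buildSelect(List<String> linkNames) {
        Set<String> seen = new HashSet<>();
        for (String name : linkNames) {
            // The same name used twice would make the statement ambiguous.
            if (!seen.add(name)) {
                throw new IllegalArgumentException("Duplicate connection name: " + name);
            }
        }
        return "SELECT * FROM " + String.join(", ", linkNames);
    }

    public static void main(String[] args) {
        System.out.println(buildSelect(Arrays.asList("customers", "orders")));
    }
}
```

A connection left with a name that is not a real table would yield a statement that the database rejects, which is why the name must reflect the actual table name.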