A Job can use several types of connections, which define the data to be processed, the data output, or the logical sequence of the Job.
Right-click a component on the design workspace to display a contextual menu that lists all available links for the selected component.
The sections below describe all available connection types.
A Row connection handles the actual data. A Row connection can be Main, Lookup, Filter, Rejects, ErrorReject or Output, depending on the nature of the flow processed.
The Main row connection is the most commonly used connection. It passes data flows from one component to the next, iterating on each row and reading input data according to the schema set in the component properties.
Data transferred through main rows are characterized by a schema definition which describes the data structure in the input file.
You cannot connect two Input components together using a Row > Main connection. A component accepts only one incoming Main row connection, so you cannot link the same target component twice using a Main row. A second row linked to the same component is called a Lookup.
To connect two components using a Main connection, right-click the input component and select Row > Main from the connection list.
Alternatively, click the component to highlight it, then either right-click it or click the O icon that appears beside it, and drag the cursor towards the destination component. This automatically creates a Row > Main connection.
For information on using multiple Row connections, see Multiple Input/Output.
The Lookup row link connects a sub-flow component to a main flow component (which must be able to receive more than one incoming flow). This connection is used only when there are multiple input flows.
A Lookup row can be changed into a Main row at any time, and vice versa. To do so, right-click the row to be changed and, on the pop-up menu, click Set this connection as Main.
Related topic: Multiple Input/Output.
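To illustrate how a Main row and a Lookup row relate, the sketch below joins a main flow against a lookup flow in plain Java. It is not the code the Studio generates: the class name MainLookupSketch, the join method and the sample rows are assumptions made for this example. The main flow drives the iteration, while the lookup flow is loaded once and only consulted for each main row.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of a main flow joined with a lookup flow.
class MainLookupSketch {

    // Each main row is {id, countryCode}; the lookup maps countryCode -> countryName.
    static List<String> join(List<String[]> mainRows, Map<String, String> lookup) {
        List<String> out = new ArrayList<>();
        for (String[] row : mainRows) {              // the main flow drives the iteration
            String country = lookup.get(row[1]);     // the lookup flow is only consulted
            out.add(row[0] + ";" + (country == null ? "" : country));
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> lookup = new HashMap<>();
        lookup.put("FR", "France");
        lookup.put("DE", "Germany");

        List<String[]> mainRows = Arrays.asList(
                new String[] {"1", "FR"},
                new String[] {"2", "US"});

        System.out.println(join(mainRows, lookup));  // prints [1;France, 2;]
    }
}
```

In this sketch, a main row with no match in the lookup is kept with an empty value. That is a choice made for the example, not a statement about the default join behavior of any component.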
The Filter row link connects a tFilterRow component specifically to an output component. It gathers the data that matches the filtering criteria. The tFilterRow component also offers a Reject link to fetch the non-matching data flow.
The Rejects row link connects a processing component to an output component. It gathers the data that does not match the filter or is not valid for the expected output. This link allows you to track the data that could not be processed for any reason (wrong type, undefined null value, etc.). On some components, this link is enabled when the Die on error option is deactivated. For more information, refer to the relevant component properties in the Talend Open Studio for Big Data Components Reference Guide.
The ErrorReject row link connects a tMap component to an output component. This link is enabled when you clear the Die on error check box in the tMap editor, and it gathers the data that could not be processed (wrong type, undefined null value, unparseable dates, etc.).
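The split between matching and rejected rows can be pictured with the following Java sketch. It is an illustration only, not the behavior of any specific component: the class FilterRejectSketch, its two lists and the minimum-age rule are assumptions made for this example. Rows that satisfy the condition go to one output, and rows that fail it or cannot be parsed (wrong type, null value) go to the other.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of a filter output and a rejects output.
class FilterRejectSketch {
    final List<Integer> filter = new ArrayList<>();   // rows matching the criteria
    final List<String> rejects = new ArrayList<>();   // rows not matching or not valid

    void process(List<String> rawAges, int minAge) {
        for (String raw : rawAges) {
            if (raw == null) {                        // undefined null value
                rejects.add(raw);
                continue;
            }
            try {
                int age = Integer.parseInt(raw.trim());
                if (age >= minAge) {
                    filter.add(age);                  // matches the filtering criteria
                } else {
                    rejects.add(raw);                 // valid value, but filtered out
                }
            } catch (NumberFormatException e) {
                rejects.add(raw);                     // wrong type: not an integer
            }
        }
    }

    public static void main(String[] args) {
        FilterRejectSketch sketch = new FilterRejectSketch();
        sketch.process(Arrays.asList("42", "7", "abc", null), 18);
        System.out.println(sketch.filter);            // prints [42]
        System.out.println(sketch.rejects);           // prints [7, abc, null]
    }
}
```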
Related topic: Handling errors.
The Output row link connects a tMap component to one or several output components. Because a Job can have several outputs, you are prompted to name each output row you create.
The system also remembers the names of deleted output links (and their properties, if any were defined). This way, you do not have to enter the property data again if you want to reuse them.
Related topic: Multiple Input/Output.
Some components help handle data through multiple inputs and/or multiple outputs. These are often processing-type components such as the tMap.
If this requires a join or some transformation in one flow, use the tMap component, which is dedicated to this purpose.
For further information regarding data mapping, see Mapping data flows.
For properties regarding the tMap component as well as use case scenarios, see Talend Open Studio for Big Data Components Reference Guide.
The Iterate connection can be used to loop on files contained in a directory, on rows contained in a file or on DB entries.
A component can be the target of only one Iterate link. The Iterate link is mainly intended to connect to the start component of a flow (in a subjob).
Some components such as the tFileList component are meant to be connected through an iterate link with the next component. For how to set an Iterate connection, see Iterate connection settings.
Unlike other types of connections, the name of the Iterate link is read-only.
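The idea of looping on the files contained in a directory can be sketched in Java as follows. This is an analogy rather than the code the Studio generates: the class IterateSketch and the forEachFile method are assumptions made for this example, and the function passed in stands for the flow that is executed once per iteration.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical illustration of an iterate loop over the files of a directory.
class IterateSketch {

    // Runs the given subjob once per regular file in dir and collects the results.
    static List<String> forEachFile(File dir, Function<File, String> subjob) {
        List<String> results = new ArrayList<>();
        File[] entries = dir.listFiles(File::isFile); // null if dir cannot be listed
        if (entries == null) {
            return results;
        }
        Arrays.sort(entries);                         // listFiles guarantees no order
        for (File current : entries) {                // one iteration per file
            results.add(subjob.apply(current));
        }
        return results;
    }

    public static void main(String[] args) {
        File dir = new File(args.length > 0 ? args[0] : ".");
        System.out.println(forEachFile(dir, f -> f.getName() + " (" + f.length() + " bytes)"));
    }
}
```

Subdirectories are skipped in this sketch, and the entries are sorted only to make repeated runs comparable.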
Trigger connections define the processing sequence, so no data is handled through these connections.
A trigger connection creates a dependency between Jobs or subjobs, which are then triggered one after the other according to the nature of the trigger.
Trigger connections fall into two categories:
subjob triggers: On Subjob Ok, On Subjob Error and Run if,
component triggers: On Component Ok, On Component Error and Run if.
OnSubjobOK (previously Then Run): This link is used to trigger the next subjob on the condition that the main subjob completed without error. This connection is to be used only from the start component of the Job.
These connections are used to orchestrate the subjobs forming the Job or to easily troubleshoot and handle unexpected errors.
OnSubjobError: This link is used to trigger the next subjob in case the first (main) subjob does not complete correctly. This "on error" subjob helps flag the bottleneck or handle the error if possible.
Related topic: How to define the Start component.
OnComponentOK and OnComponentError are component triggers. They can be used with any source component in the subjob.
OnComponentOK triggers the target component only once the source component has finished executing without error. It can be used, for example, to trigger a notification subjob.
OnComponentError triggers the subjob or component as soon as an error is encountered in the primary Job.
Run if triggers a subjob or component if the defined condition is met. For how to set a trigger condition, see Run if connection settings.
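The way triggers sequence processing without carrying any data can be pictured with the Java sketch below. It is an analogy only: the class TriggerSketch and the run method are assumptions made for this example, and the list it returns simply records which triggers would fire. The sketch checks the Run if condition after the main subjob has ended, whatever the outcome, which is a simplification chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BooleanSupplier;

// Hypothetical illustration of trigger connections as control flow.
class TriggerSketch {

    // Runs the main subjob and reports which triggers would fire afterwards.
    static List<String> run(Runnable mainSubjob, BooleanSupplier runIfCondition) {
        List<String> fired = new ArrayList<>();
        try {
            mainSubjob.run();
            fired.add("OnSubjobOk");       // main subjob completed without error
        } catch (RuntimeException e) {
            fired.add("OnSubjobError");    // main subjob did not complete correctly
        }
        if (runIfCondition.getAsBoolean()) {
            fired.add("RunIf");            // the defined condition is met
        }
        return fired;
    }

    public static void main(String[] args) {
        // prints [OnSubjobOk, RunIf]
        System.out.println(run(() -> { }, () -> true));
        // prints [OnSubjobError]
        System.out.println(run(() -> { throw new IllegalStateException("boom"); }, () -> false));
    }
}
```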
The Link connection can only be used with ELT components. These links transfer table schema information to the ELT mapper component in order to be used in specific DB query statements.
The Link connection therefore does not handle actual data but only the metadata regarding the table to be operated on.
Related topic: ELT components in Talend Open Studio for Big Data Components Reference Guide.
To create the link, right-click the ELT component to be connected and select Link > New Output.
Be aware that the name you give to the link must reflect the actual table name.
The link name is used in the SQL statement generated through the ELT Mapper, so the same name should never be used twice.
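The reason link names must match table names and stay unique can be illustrated with a small Java sketch that assembles a query from link names. This is not the SQL the ELT Mapper actually generates: the class EltLinkSketch, the buildSelect method and the customers and orders tables are assumptions made for this example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration of link names ending up as table names in SQL.
class EltLinkSketch {

    // Builds a SELECT statement in which each link name is used as a table name.
    static String buildSelect(List<String> columns, List<String> linkNames) {
        Set<String> seen = new HashSet<>();
        for (String name : linkNames) {
            if (!seen.add(name)) {         // add returns false if the name is already present
                throw new IllegalArgumentException("Link name used twice: " + name);
            }
        }
        return "SELECT " + String.join(", ", columns)
                + " FROM " + String.join(", ", linkNames);
    }

    public static void main(String[] args) {
        String sql = buildSelect(
                Arrays.asList("customers.id", "orders.total"),
                Arrays.asList("customers", "orders"));
        // prints SELECT customers.id, orders.total FROM customers, orders
        System.out.println(sql);
    }
}
```

Because the names are copied into the statement as they are, a link name that does not match a real table would produce a query against a table that does not exist.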