Creating a Job with multiple paths from a single source to the same target - 7.3

Version
7.3
Language
English (United States)
Product
Talend Big Data
Talend Big Data Platform
Talend Data Fabric
Talend Data Integration
Talend Data Management Platform
Talend Data Services Platform
Talend ESB
Talend MDM Platform
Talend Open Studio for Big Data
Talend Open Studio for Data Integration
Talend Open Studio for ESB
Talend Real-Time Big Data Platform
Module
Talend Studio

Creating a Job with multiple paths from a single source to the same target

Talend Studio does not allow a subjob in which multiple paths lead from a single source to a single target, as shown in the capture below. This document presents two workaround examples that produce the same result.

Storing the result of the input flow in a temporary location

In this Job, the results of the input flow are stored in a temporary location, either in a file or in memory (cache). This reduces the processing time when the data set is large or the input flow is complex.

This Job uses the following components:

  • A tFileInputDelimited component, a tReplicate component, and two tMap components to create two input flows.
  • Two tHashOutput components and two tHashInput components to store the results in a temporary location and read them back.
  • A third tMap component and a tLogRow component to print the results in the console.
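The data flow that these components implement can be sketched outside Talend Studio. The following is a minimal illustration in plain Python, not code generated by Talend: every function name, the sample rows, and the choice to join the two branches on an id column in the third tMap are assumptions made for the example. It shows the source being read once, replicated to two mappings, cached under two names, and then read back in a second step.

```python
# Illustrative sketch only: plain Python standing in for the Talend components.
# All function names, column names, and sample rows are hypothetical.

def read_source():
    """Stand-in for tFileInputDelimited: returns the rows of the single source."""
    return [
        {"id": 1, "name": "alice", "amount": 10},
        {"id": 2, "name": "bob", "amount": 20},
    ]

def map_1(row):
    """Stand-in for tMap_1: the first path out of the source."""
    return {"id": row["id"], "name_upper": row["name"].upper()}

def map_2(row):
    """Stand-in for tMap_2: the second path out of the source."""
    return {"id": row["id"], "amount_doubled": row["amount"] * 2}

def first_subjob(cache):
    """Read the source once, replicate it, and cache each mapped branch."""
    rows = read_source()
    # Stand-in for tReplicate: both branches receive the same rows.
    cache["tHashOutput_1"] = [map_1(r) for r in rows]
    cache["tHashOutput_2"] = [map_2(r) for r in rows]

def second_subjob(cache):
    """Read both cached branches and merge them into a single target flow."""
    main = cache["tHashOutput_1"]  # read by the stand-in for tHashInput_1
    lookup = {r["id"]: r for r in cache["tHashOutput_2"]}  # tHashInput_2
    # Stand-in for the third tMap: an inner join on "id" (an assumption).
    return [{**m, **lookup[m["id"]]} for m in main if m["id"] in lookup]

cache = {}
first_subjob(cache)
result = second_subjob(cache)
for row in result:
    print(row)  # stand-in for tLogRow
```

Because the two branches meet only in the second step, after both caches are filled, no single step contains two paths from the same source to the same target.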

Procedure

  1. Create two input flows as shown above by adding the tFileInputDelimited, tReplicate, tMap, and tHashOutput components to the workspace and connecting them with Row > Main links.
    Note: tHashInput and tHashOutput are components from the Technical family and are hidden by default.

    For more information about how to use these components, see the Where can I find the tHashInput/tHashOutput components? article on Talend Help Center (https://help.talend.com).

  2. To store the results from tMap_1 and tMap_2, use either two tFileOutputDelimited components or two tHashOutput components.
  3. In the next subjob, read the data either from the temporary file using a tFileInputDelimited component or from memory using a tHashInput component. The example Job above caches the results in memory.
  4. Configure both tHashInput components in their respective Properties views to link them with the two tHashOutput components.
    Tip: tHashOutput_1 caches the output of tMap_1 in memory, and tHashOutput_2 caches the output of tMap_2. For the data to be retrieved from memory, tHashInput_1 must be linked with tHashOutput_1, and tHashInput_2 with tHashOutput_2.
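The file-based alternative mentioned in the procedure, writing each branch to a temporary delimited file and reading it back in the next subjob, can be sketched in the same way. This is an illustration in plain Python under stated assumptions, not Talend-generated code: the helper names, the file names, the sample rows, and the semicolon field separator are all chosen for the example.

```python
import csv
import os
import tempfile

# Illustrative sketch only: helper names, file names, and rows are hypothetical.

def write_delimited(path, rows, fieldnames):
    """Stand-in for tFileOutputDelimited: write rows to a delimited file."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames, delimiter=";")
        writer.writeheader()
        writer.writerows(rows)

def read_delimited(path):
    """Stand-in for tFileInputDelimited: read rows back from the file."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f, delimiter=";"))

with tempfile.TemporaryDirectory() as tmp_dir:
    path_1 = os.path.join(tmp_dir, "branch_1.csv")
    path_2 = os.path.join(tmp_dir, "branch_2.csv")

    # First subjob: store the output of each mapping in its own file.
    write_delimited(path_1, [{"id": "1", "name_upper": "ALICE"}],
                    ["id", "name_upper"])
    write_delimited(path_2, [{"id": "1", "amount_doubled": "20"}],
                    ["id", "amount_doubled"])

    # Next subjob: each reader must point at the file its branch wrote,
    # just as each tHashInput must be linked to the matching tHashOutput.
    branch_1 = read_delimited(path_1)
    branch_2 = read_delimited(path_2)

print(branch_1)
print(branch_2)
```

In this sketch every value read back from a file is a string, so a file-based variant may need type conversion that an in-memory cache does not.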