In this Job, the results of the input flow are stored in a temporary location,
either in a file or in memory (cache), to reduce processing time when you handle
large data sets or when your input flow is complex.
This Job will use the following components:
-
A tFileInputDelimited, a tReplicate,
and two tMap components to create two input flows.
-
Two tHashOutput and two tHashInput
components to store the results in a temporary location and read them back.
-
A third tMap component and a tLogRow
to print the results in the console.
Procedure
-
Create two input flows as shown above by adding the
tFileInputDelimited, the tReplicate,
the tMap and the tHashOutput components
to the workspace and creating Row > Main links
between them.
-
Use either two tFileOutputDelimited components or two
tHashOutput components to store the results from
tMap_1 and tMap_2 in a temporary location.
-
Then read the data in the next subJob, from the temporary file using a
tFileInputDelimited component or from the memory using a
tHashInput component. The Job example above caches the
result into memory.
-
In the Basic settings view of
tHashInput_1, select tHashOutput_1
from the Component list drop-down list.
This configuration links tHashInput_1 to tHashOutput_1.
Tip:
tHashOutput_1 caches the result from
tMap_1 in memory, and tHashOutput_2
caches the result from tMap_2. For the data to be
retrieved from memory, tHashInput_1 must be linked to
tHashOutput_1 and
tHashInput_2 to tHashOutput_2.
-
In the Basic settings view of
tHashInput_2, select tHashOutput_2
from the Component list drop-down list.
This configuration links tHashInput_2 to tHashOutput_2.
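The caching pattern used in this Job can be pictured with a short Java sketch. It is only an illustration of the idea and not the code that the Studio generates: the class HashCacheSketch, its write and read methods, and the sample rows are all hypothetical. The sketch assumes that a cache is identified by the name of the tHashOutput component that filled it, which mirrors the choice made in the Component list drop-down list.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of an in-memory cache shared between subJobs.
class HashCacheSketch {
    // Cache name (here, the name of the writing component) -> cached rows.
    private static final Map<String, List<String[]>> CACHE = new HashMap<>();

    // Plays the role of a tHashOutput component: store a flow under a name.
    static void write(String cacheName, List<String[]> rows) {
        CACHE.put(cacheName, new ArrayList<>(rows));
    }

    // Plays the role of a tHashInput component: read the rows back by name.
    static List<String[]> read(String cacheName) {
        List<String[]> rows = CACHE.get(cacheName);
        if (rows == null) {
            throw new IllegalStateException("No cache named " + cacheName);
        }
        return rows;
    }

    public static void main(String[] args) {
        // First subJob: each mapped flow is cached separately (sample rows are made up).
        List<String[]> flow1 = new ArrayList<>();
        flow1.add(new String[] {"1", "Alice"});
        List<String[]> flow2 = new ArrayList<>();
        flow2.add(new String[] {"1", "Paris"});
        write("tHashOutput_1", flow1);
        write("tHashOutput_2", flow2);

        // Next subJob: each reader is linked to exactly one writer by name.
        for (String[] row : read("tHashOutput_1")) {
            System.out.println(String.join("|", row));
        }
        for (String[] row : read("tHashOutput_2")) {
            System.out.println(String.join("|", row));
        }
    }
}
```

Reading from a name that was never written fails immediately in this sketch, which corresponds to a tHashInput component that has not been linked to the right tHashOutput component.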