This basic scenario describes a four-component Job that reads a list of English words from a one-column delimited file, extracts the stems of the words, and displays both the list of words and the corresponding stems on the Run console.
Drop the following components from the Palette onto the design workspace: tFileInputDelimited, tMap, tStem, and tLogRow.
Link the tFileInputDelimited component to the tMap component using a Row > Main connection.
Link the tMap component to the tStem component using a Row > Main connection, and name the output row connection (out in this example).
Link the tStem component to the tLogRow component using a Row > Main connection.
Double-click the tFileInputDelimited component to open its Basic settings view.
Browse to the input file, and set the basic properties based on the structure of the input file. In this example, the input file provides a list of English words in different variant forms, and does not have a header. The following is an extract of the file content.
computerize
computerized
computerizing
program
programming
cooking
cooked
cooks
evaporable
Click the [...] button next to Edit schema to open the [Schema] dialog box, and set the input schema, which should contain one column named Word in this example.
When done, click OK to close the dialog box.
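As a rough illustration of what this input configuration amounts to, the following Python sketch reads a headerless, one-column delimited source into rows keyed by the schema column Word. It is not the code that the Job generates; the read_words helper, the in-memory sample, and the ";" delimiter are assumptions made for the example.

```python
import csv
import io

# Stand-in for the input file: one word per row, no header row.
SAMPLE = "computerize\ncomputerized\ncomputerizing\nprogram\n"


def read_words(handle, delimiter=";"):
    """Yield one dict per row, keyed by the single schema column Word.

    The delimiter is an assumption; with one column it never appears in the data.
    """
    reader = csv.DictReader(handle, fieldnames=["Word"], delimiter=delimiter)
    for row in reader:
        yield {"Word": row["Word"]}


rows = list(read_words(io.StringIO(SAMPLE)))
print(rows[0])
```

Passing fieldnames explicitly is what tells the reader that the first line is data rather than a header, which mirrors the header-less file in this scenario.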
Double-click the tMap component to open the map editor. We will use this component to map the single-column input flow to a two-column data flow to feed the tStem component.
Click the [+] button to add two columns to the output schema, and name them Fullform and Stem respectively. Then, drag the Word column from the input table onto the Fullform column in the output table, and drag it again onto the Stem column.
When done, click OK to close the map editor and propagate the changes to the next component.
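The mapping defined here can be pictured with a minimal Python sketch: every input row with a single Word value becomes an output row whose Fullform and Stem columns both hold that same value. The map_row function is a hypothetical name used only for illustration, not something tMap exposes.

```python
def map_row(row):
    """Copy the Word value into both output columns.

    At this point Stem is still the full word; it is only reduced to a stem
    by the next component in the flow.
    """
    return {"Fullform": row["Word"], "Stem": row["Word"]}


out = [map_row({"Word": w}) for w in ["cooking", "cooked", "cooks"]]
print(out[0])
```

Duplicating the column is what allows the final output to show each original word next to its stem, since the stemming step overwrites the column it works on.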
Double-click the tStem component to open its Basic settings view.
In the Select Algorithm table, click in the Algorithm field for the Stem column, which will carry the word stems extracted from the input data, and select English as the algorithm language.
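To make the effect of this setting concrete, the sketch below applies a stemming function to the Stem column only and leaves Fullform untouched. The toy_stem function is a deliberately naive suffix stripper written for this example; it is an assumption standing in for the real English algorithm, which it does not reproduce, and it should not be used to predict the component's output for arbitrary words.

```python
def toy_stem(word):
    """Naive suffix stripping; a placeholder, NOT the algorithm tStem applies."""
    for suffix in ("ing", "ed", "s"):
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def stem_column(rows, column="Stem"):
    """Replace the chosen column with its stem; other columns pass through."""
    return [{**row, column: toy_stem(row[column])} for row in rows]


rows = [{"Fullform": w, "Stem": w} for w in ("cooking", "cooked", "cooks")]
stemmed = stem_column(rows)
print(stemmed)
```

The point of the sketch is the column-level behavior: only the column selected in the table is transformed, so the full form of each word stays available for comparison.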
Double-click the tLogRow component to open its Basic settings view, and select the Table option for a more readable display of the Job execution result.
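For readers who want a feel for a tabular console display of two-column rows, here is a small Python sketch that renders Fullform and Stem side by side. The format_table helper and its border style are assumptions for illustration and do not reproduce the exact layout printed by tLogRow; the sample rows are illustrative as well.

```python
def format_table(rows, columns=("Fullform", "Stem")):
    """Render rows as a simple bordered text table."""
    # Each column is as wide as its header or its longest value.
    widths = [max([len(c)] + [len(r[c]) for r in rows]) for c in columns]
    border = "+" + "+".join("-" * (w + 2) for w in widths) + "+"

    def line(values):
        cells = (v.ljust(w) for v, w in zip(values, widths))
        return "| " + " | ".join(cells) + " |"

    body = [line([r[c] for c in columns]) for r in rows]
    return "\n".join([border, line(columns), border, *body, border])


# Illustrative rows: each word next to its stem.
sample = [
    {"Fullform": "cooking", "Stem": "cook"},
    {"Fullform": "cooked", "Stem": "cook"},
]
table = format_table(sample)
print(table)
```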