This scenario describes a five-component Job that matches the family information entries in the main input file against those in a reference input file, and displays the exact matches and the rejected data in different tables on the console. The dynamic feature is leveraged to save the time of configuring individual columns in the schema of each component.
Drop two tFileInputDelimited components, a tJoin component, and two tLogRow components from the Palette onto the design workspace, and label them to better identify their roles in the Job, as shown above.
Connect the tFileInputDelimited component labelled Main_Input to the tJoin component, which is labelled Check, using a Row > Main connection.
Repeat the step above to connect the tFileInputDelimited component labelled Ref_Input to the tJoin component. This Row connection automatically appears as a lookup link.
Connect the tJoin component to the tLogRow component labelled Matches using a Row > Main connection. This link will gather the data of the exact matches.
Connect the tJoin component to the tLogRow component labelled Rejects using a Row > Inner join reject connection. This link will gather the rejected data.
Double-click the tFileInputDelimited component labelled Main_Input to display its Basic settings view.
The dynamic schema feature is only supported in Built-In mode and requires the input file to have a header row.
Click the [...] button next to the File Name/Stream field to browse to your main input file, and enter 1 in the Header field to define the first row as the header row.
In this use case, the main input file contains the following information:
FirstName;LastName;HouseNo;Street;City
Gerald;Roosevelt;48;Fairview Avenue;Oklahoma City
Benjamin;Harrison;27;Katella Avenue;Little Rock
Bob;Clinton;11;Bowles Avenue;Raleigh
James;Quincy;45;Cerrillos Road;Saint Paul
Gerald;Harrison;27;Katella Avenue;Little Rock
Harry;Madison;85;Santa Monica Road;Raleigh
Helen;Roosevelt;48;Fairview Avenue;Oklahoma City
Mary;Clinton;11;Bowles Avenue;Raleigh
Cathey;Quincy;45;Cerrillos Road;Saint Paul
John;Smith;64;Market Street;Helena
Click Edit schema to define the schema for this component.
In this use case, the main input file has five columns: FirstName, LastName, HouseNo, Street, and City. However, thanks to the dynamic schema feature, we only need to define two columns: one String column for the first names of people, and one Dynamic column for the family information. To do so:
Click the [+] button to add two columns, and name them FirstName and FamilyInfo respectively.
Select String from the Type list for the FirstName column to retrieve the first name of each person on the name list.
Select Dynamic from the Type list for the FamilyInfo column to retrieve the remaining information about each person on the name list: the last name, house number, street, and city, which together identify a family.
Click OK to propagate the schema and close the [Schema] dialog box.
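To illustrate what this two-column schema means for the data, here is a minimal Python sketch. It is not Talend code: the helper name read_with_dynamic and the inlined sample (the first rows of the main input file) are assumptions made for illustration. It keeps FirstName as a fixed column and packs every remaining column, named from the header row, into a single FamilyInfo value.

```python
import csv
import io

# First rows of the main input file from this scenario, inlined for the sketch.
MAIN_INPUT = """FirstName;LastName;HouseNo;Street;City
Gerald;Roosevelt;48;Fairview Avenue;Oklahoma City
Benjamin;Harrison;27;Katella Avenue;Little Rock
Bob;Clinton;11;Bowles Avenue;Raleigh
"""

def read_with_dynamic(text, fixed_columns):
    """Hypothetical helper: keep the fixed columns as-is and pack all
    remaining columns (named from the header row) into 'FamilyInfo'."""
    reader = csv.reader(io.StringIO(text), delimiter=";")
    header = next(reader)  # the header row supplies the dynamic column names
    rows = []
    for values in reader:
        record = dict(zip(header, values))
        row = {name: record.pop(name) for name in fixed_columns}
        row["FamilyInfo"] = record  # whatever is left is the dynamic part
        rows.append(row)
    return rows

main_rows = read_with_dynamic(MAIN_INPUT, ["FirstName"])
print(main_rows[0])
```

Because the names of the packed columns come from the header row, the sketch also suggests why a dynamic column needs the input file to have a header.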
Following steps similar to the above, define the properties for the tFileInputDelimited component labelled Ref_Input: the path to the reference input file, the header row, and the schema. This time, just define one dynamic column, FamilyInfo, to retrieve the four columns of the reference input file, which contains the following information:
LastName;HouseNo;Street;City
Clinton;11;Bowles Avenue;Raleigh
Quincy;45;Cerrillos Road;Saint Paul
Smith;64;Market Street;Helena
Double-click the tJoin component to open its Basic settings view.
Click Edit schema to open the [Schema] dialog box to check the data structures of the input files and define the data you want to pass to the output components.
In this scenario, we want to pass both columns of the main input file, FirstName and FamilyInfo, to the output components, so simply copy the schema columns of the main input file by clicking the [->>] button. Then, click OK to validate the schema and close the dialog box.
In the Key definition area, click the [+] button to add one column to the list. Then select the input column you want to match from the Input key attribute list, and the reference column against which you want to match it from the Lookup key attribute list: FamilyInfo and row2.FamilyInfo respectively in this example.
Make sure that the Inner join (with reject output) check box is selected to define one of the outputs as the inner join reject table.
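The join configured above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the code Talend generates: it assumes that matching on the FamilyInfo key means all four family fields must be equal, and the helper names read_rows and family_key are hypothetical. Rows of the main input whose family appears in the reference input go to the matches list, and all other rows go to the rejects list.

```python
import csv
import io

# Sample data of this scenario, inlined so that the sketch is self-contained.
MAIN_INPUT = """FirstName;LastName;HouseNo;Street;City
Gerald;Roosevelt;48;Fairview Avenue;Oklahoma City
Benjamin;Harrison;27;Katella Avenue;Little Rock
Bob;Clinton;11;Bowles Avenue;Raleigh
James;Quincy;45;Cerrillos Road;Saint Paul
Gerald;Harrison;27;Katella Avenue;Little Rock
Harry;Madison;85;Santa Monica Road;Raleigh
Helen;Roosevelt;48;Fairview Avenue;Oklahoma City
Mary;Clinton;11;Bowles Avenue;Raleigh
Cathey;Quincy;45;Cerrillos Road;Saint Paul
John;Smith;64;Market Street;Helena
"""

REF_INPUT = """LastName;HouseNo;Street;City
Clinton;11;Bowles Avenue;Raleigh
Quincy;45;Cerrillos Road;Saint Paul
Smith;64;Market Street;Helena
"""

def read_rows(text):
    """Hypothetical helper: parse semicolon-delimited text with a header row."""
    return list(csv.DictReader(io.StringIO(text), delimiter=";"))

def family_key(row):
    """Assumption: a family is identified by all four fields together."""
    return (row["LastName"], row["HouseNo"], row["Street"], row["City"])

# Build the lookup side once, then route each main row to one of two outputs.
reference_keys = {family_key(row) for row in read_rows(REF_INPUT)}

matches, rejects = [], []
for row in read_rows(MAIN_INPUT):
    if family_key(row) in reference_keys:
        matches.append(row)   # corresponds to the Main output
    else:
        rejects.append(row)   # corresponds to the Inner join reject output

print([row["FirstName"] for row in matches])
# ['Bob', 'James', 'Mary', 'Cathey', 'John']
print([row["FirstName"] for row in rejects])
# ['Gerald', 'Benjamin', 'Gerald', 'Harry', 'Helen']
```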
In the Basic settings view of each tLogRow component, select the Table option to display the output information in table cells.
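As a rough idea of what a table display involves, the following Python sketch renders a few rows as a bordered text table. The helper format_table and the layout are assumptions made for illustration; this is not the exact format that tLogRow prints on the console.

```python
def format_table(rows, columns):
    """Hypothetical helper: render rows as a simple bordered text table."""
    # Each column is as wide as its longest value or its header.
    widths = [max([len(c)] + [len(str(r[c])) for r in rows]) for c in columns]
    border = "+" + "+".join("-" * (w + 2) for w in widths) + "+"

    def line(values):
        cells = (" " + str(v).ljust(w) + " " for v, w in zip(values, widths))
        return "|" + "|".join(cells) + "|"

    out = [border, line(columns), border]
    out += [line([r[c] for c in columns]) for r in rows]
    out.append(border)
    return "\n".join(out)

# Two rows taken from the sample data of this scenario.
sample = [
    {"FirstName": "Bob", "LastName": "Clinton", "City": "Raleigh"},
    {"FirstName": "John", "LastName": "Smith", "City": "Helena"},
]
table = format_table(sample, ["FirstName", "LastName", "City"])
print(table)
```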