This scenario describes a three-component Job that retrieves a file from an HTTP website, reads data from the fetched file, and displays the data on the console.
Drop a tFileFetch, a tFileInputDelimited and a tLogRow onto your design workspace.
Link tFileFetch to tFileInputDelimited using a Trigger > On Subjob Ok or On Component Ok connection.
Link tFileInputDelimited to tLogRow using a Row > Main connection.
Double-click tFileFetch to open its Basic settings view.
Select the protocol you want to use from the list. Here, http is selected.
In the URI field, type in the URI from which the file is to be fetched. You can paste the URI directly into your browser to view the data in the file.
In the Destination directory field, browse to the folder where the fetched file is to be stored. In this example, it is D:/Output.
In the Destination filename field, type in a new name for the file if you want to rename it. In this example, it is new.txt.
If needed, select the Add header check box and define one or more HTTP request headers as fetch conditions. For example, to fetch the file only if it has been modified since 19:43:31 GMT, October 29, 1994, fill in the Name and Value fields with "If-Modified-Since" and "Sat, 29 Oct 1994 19:43:31 GMT" respectively in the Headers table. For details about HTTP request header definitions, see Header Field Definitions.
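For readers who want to see what these settings amount to outside the Studio, the following is a minimal Python sketch of a conditional HTTP fetch. It is not the code that tFileFetch generates; the function names build_request and fetch_file are hypothetical, and the sketch assumes that a server which honors If-Modified-Since answers 304 Not Modified when the file is unchanged.

```python
import shutil
import urllib.error
import urllib.request
from pathlib import Path


def build_request(uri, headers=None):
    """Build a request carrying the optional headers (the Headers table rows)."""
    request = urllib.request.Request(uri)
    for name, value in (headers or {}).items():
        request.add_header(name, value)
    return request


def fetch_file(uri, dest_dir, dest_name, headers=None):
    """Download uri into dest_dir/dest_name.

    Returns the path of the stored file, or None when the server replies
    304 Not Modified (assumed behavior for an If-Modified-Since condition).
    """
    target_dir = Path(dest_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / dest_name
    try:
        with urllib.request.urlopen(build_request(uri, headers)) as response:
            with open(target, "wb") as out:
                shutil.copyfileobj(response, out)
    except urllib.error.HTTPError as error:
        if error.code == 304:
            # Condition not met: nothing newer to fetch, so nothing is written.
            return None
        raise
    return target
```

With the values from this example, the call would look like fetch_file(uri, "D:/Output", "new.txt", {"If-Modified-Since": "Sat, 29 Oct 1994 19:43:31 GMT"}), where uri stands for whatever address you entered in the URI field.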
Double-click tFileInputDelimited to open its Basic settings view.
In the File name field, type in the full path to the fetched file that has been stored locally. In this example, it is D:/Output/new.txt.
Click the [...] button next to Edit schema to open the [Schema] dialog box. In this example, add one column named output to store the data from the fetched file.
Leave other settings as they are.
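The read-and-display half of the Job can be approximated with the short Python sketch below. It is an illustration only, not what tFileInputDelimited and tLogRow actually run: the function names read_delimited and log_rows are hypothetical, the ";" field separator is assumed rather than taken from this scenario, and dropping fields beyond the schema's columns is a simplification chosen for the sketch.

```python
import csv


def read_delimited(path, columns=("output",), field_sep=";", encoding="utf-8"):
    """Read a delimited file into a list of dicts keyed by the schema columns.

    Assumptions of this sketch: empty lines are skipped, fields beyond the
    schema are dropped, and missing fields are filled with None.
    """
    rows = []
    with open(path, newline="", encoding=encoding) as handle:
        for fields in csv.reader(handle, delimiter=field_sep):
            if not fields:
                continue
            kept = list(fields[: len(columns)])
            kept += [None] * (len(columns) - len(kept))
            rows.append(dict(zip(columns, kept)))
    return rows


def log_rows(rows, columns=("output",)):
    """Print a header line and one line per row; return the printed text."""
    lines = ["|".join(columns)]
    for row in rows:
        lines.append("|".join("" if row[c] is None else row[c] for c in columns))
    text = "\n".join(lines)
    print(text)
    return text
```

Calling log_rows(read_delimited("D:/Output/new.txt")) would then print the single output column for each line of the fetched file.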