Double-click tPigLoad to open its Basic settings view.
Click the [...] button next to Edit schema to open the [Schema] dialog box.
- Click the [+] button to add three columns according to the data structure of the input file: Name (string), Country (string) and Age (integer), and then click OK to save the setting and close the dialog box.
- Click Local in the Mode area.
- Fill in the Input file URI field with the full path to the input file.
- Select PigStorage from the Load function list, and leave the rest of the settings as they are.
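As a rough illustration of what this load configuration amounts to, the following Python sketch reads delimited text and casts each field according to the three-column schema defined above. It is not the code that the Job generates. The `load_rows` helper, the `;` separator and the sample records are assumptions made for this example only.

```python
import csv
import io

# Schema from the steps above: column name and the Python type it maps to.
SCHEMA = [("Name", str), ("Country", str), ("Age", int)]


def load_rows(stream, separator=";"):
    """Parse delimited text into typed tuples.

    The separator is an assumption; use the one your input file actually has.
    """
    rows = []
    for fields in csv.reader(stream, delimiter=separator):
        if not fields:  # skip blank lines
            continue
        # Cast each raw string to the type declared in SCHEMA.
        rows.append(tuple(cast(value) for (_, cast), value in zip(SCHEMA, fields)))
    return rows


# Hypothetical sample input, standing in for the file behind Input file URI.
SAMPLE = "Ann;France;30\nBob;Spain;41\nAnn;France;30\n"
rows = load_rows(io.StringIO(SAMPLE))
```

Each resulting row is a tuple of (Name, Country, Age), with Age held as an integer rather than a string.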
Double-click tPigDistinct to open its Basic settings view, and click Sync columns to make sure that the input schema structure is correctly propagated from the preceding component.
This component will remove any duplicates from the data flow.
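The effect of removing duplicates can be sketched in Python as follows. This only illustrates the idea and is not how the component is implemented. The `distinct` helper and the sample rows are hypothetical, and keeping the first occurrence in its original order is a choice of this sketch, not a documented guarantee of the component.

```python
def distinct(rows):
    """Return rows with exact duplicates removed, keeping first occurrences."""
    # dict keys are unique and preserve insertion order, so this drops repeats.
    return list(dict.fromkeys(rows))


# Hypothetical rows with the (Name, Country, Age) structure used in this scenario.
sample_rows = [
    ("Ann", "France", 30),
    ("Bob", "Spain", 41),
    ("Ann", "France", 30),  # exact duplicate of the first row
]
unique_rows = distinct(sample_rows)
```

Two rows count as duplicates here only when every column matches, so rows that share a name but differ in age are both kept.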