Procedure
- Double-click tDataShuffling to display the Basic settings view and define the component properties.
- Click Sync columns to retrieve the schema defined in the input component.
- In the Shuffling columns table, click the [+] button to add four rows, and then:
  - In Column, select the columns whose data will be shuffled.
  - In Group ID, select the group identifier for each column. Columns that have the same group identifier are shuffled together.
  In the above example, there are two groups of columns to be shuffled:
    - Group ID 1: credit_card
    - Group ID 2: lname, fname and mi
- Click the Advanced settings tab.
- In the Partitioning columns table, click the [+] button to add one row. The Job shuffles the original data rows that share the same value for the partitioning columns. In the above example, the component is configured to apply the shuffling process to the rows sharing the same value for the country column.
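
To make the effect of these settings concrete, here is a minimal Python sketch of group-wise shuffling within partitions. It is not how tDataShuffling is implemented; it only mimics the behavior described above. The function name `shuffle_columns`, the `seed` parameter, and the sample rows are all hypothetical and exist only for this illustration.

```python
import random
from collections import defaultdict


def shuffle_columns(rows, groups, partition_cols, seed=None):
    """Return a copy of rows with each column group shuffled inside each partition.

    Hypothetical helper for illustration; not part of any Talend API.
    """
    rng = random.Random(seed)
    result = [dict(row) for row in rows]

    # Bucket row indices by the values of the partitioning columns.
    partitions = defaultdict(list)
    for index, row in enumerate(rows):
        partitions[tuple(row[col] for col in partition_cols)].append(index)

    for indices in partitions.values():
        for cols in groups.values():
            # One permutation per group, so columns in a group move together.
            source = indices[:]
            rng.shuffle(source)
            for dst, src in zip(indices, source):
                for col in cols:
                    result[dst][col] = rows[src][col]
    return result


# Made-up sample rows, mirroring the column names used in the example.
rows = [
    {"country": "US", "credit_card": "1111", "lname": "Smith", "fname": "Ann", "mi": "A"},
    {"country": "US", "credit_card": "2222", "lname": "Jones", "fname": "Bob", "mi": "B"},
    {"country": "US", "credit_card": "3333", "lname": "Brown", "fname": "Cat", "mi": "C"},
    {"country": "FR", "credit_card": "4444", "lname": "Martin", "fname": "Dan", "mi": "D"},
    {"country": "FR", "credit_card": "5555", "lname": "Petit", "fname": "Eve", "mi": "E"},
]

# Group ID 1: credit_card; Group ID 2: lname, fname and mi.
groups = {1: ["credit_card"], 2: ["lname", "fname", "mi"]}

shuffled = shuffle_columns(rows, groups, partition_cols=["country"], seed=42)
for row in shuffled:
    print(row)
```

In this sketch, the lname, fname and mi values always travel together because they share one permutation, credit_card is permuted independently, and no value ever crosses from one country to another.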