Configuring the tDataShuffling component

Procedure

Double-click tDataShuffling to display the Basic settings view and define the component properties.
Click Sync columns to retrieve the schema defined in the input component.
In the Shuffling columns table, click the [+] button to add four rows, and then:
- under Column, select the columns whose data will be shuffled,
- under Group ID, select the group identifier for each column. Columns that have the same group identifier are shuffled together.
In the above example, there are two groups of columns to be shuffled:
- Group ID 1: credit_card
- Group ID 2: lname, fname and mi
The Job will replace each credit card number in the credit_card column with a value from a different row. It will also keep the last name, first name and middle initial values from the lname, fname and mi columns together, and replace them as a unit with the values from a different row.
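The effect of the two groups can be sketched outside Talend in a few lines of Python. This is only an illustration of the grouping behavior described above, not the algorithm the component uses: the function name shuffle_groups, the id column and the sample values are all hypothetical. Unlike the behavior described above, a plain random permutation can occasionally leave a value in its original row.

```python
import random


def shuffle_groups(rows, groups, seed=None):
    """Return a copy of rows in which each group of columns is permuted
    across rows. Columns in the same group move together, and each
    group gets its own independent permutation."""
    rng = random.Random(seed)
    result = [dict(row) for row in rows]  # leave the input untouched
    for columns in groups:
        order = list(range(len(rows)))
        rng.shuffle(order)  # one permutation per group
        for target, source in enumerate(order):
            for column in columns:
                result[target][column] = rows[source][column]
    return result


# Hypothetical sample data; the id column is not shuffled.
rows = [
    {"id": 1, "credit_card": "4111-0000-0000-0001", "lname": "Smith", "fname": "Ann", "mi": "A"},
    {"id": 2, "credit_card": "4111-0000-0000-0002", "lname": "Jones", "fname": "Bob", "mi": "B"},
    {"id": 3, "credit_card": "4111-0000-0000-0003", "lname": "Brown", "fname": "Cid", "mi": "C"},
    {"id": 4, "credit_card": "4111-0000-0000-0004", "lname": "Davis", "fname": "Dee", "mi": "D"},
]

# Group ID 1: credit_card; Group ID 2: lname, fname and mi
shuffled = shuffle_groups(rows, [["credit_card"], ["lname", "fname", "mi"]], seed=42)

for row in shuffled:
    print(row)
```

In this sketch, each (lname, fname, mi) triple always stays intact, while the credit card numbers are permuted independently of the names.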
Click the Advanced settings tab.

In the Partitioning columns table, click the [+] button to add one row, and then select the column to partition on.

The Job will shuffle values only among the original data rows that share the same values in the partitioning columns.

In the above example, the component is configured to apply the shuffling process to the rows sharing the same value for the country column.
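The partitioning behavior can be illustrated with a short Python sketch, assuming that values are exchanged only between rows with the same partition key. It is not the component's implementation: the function name shuffle_within_partitions and the sample data are hypothetical.

```python
import random
from collections import defaultdict


def shuffle_within_partitions(rows, columns, partition_columns, seed=None):
    """Permute the given columns across rows, but only among rows that
    share the same values in partition_columns."""
    rng = random.Random(seed)

    # Collect the row indices that belong to each partition key.
    partitions = defaultdict(list)
    for index, row in enumerate(rows):
        key = tuple(row[column] for column in partition_columns)
        partitions[key].append(index)

    result = [dict(row) for row in rows]  # leave the input untouched
    for indices in partitions.values():
        sources = indices[:]
        rng.shuffle(sources)  # permutation restricted to this partition
        for target, source in zip(indices, sources):
            for column in columns:
                result[target][column] = rows[source][column]
    return result


# Hypothetical sample data with two countries.
customers = [
    {"credit_card": "FR-001", "country": "France"},
    {"credit_card": "FR-002", "country": "France"},
    {"credit_card": "FR-003", "country": "France"},
    {"credit_card": "US-001", "country": "USA"},
    {"credit_card": "US-002", "country": "USA"},
]

partitioned = shuffle_within_partitions(
    customers, ["credit_card"], ["country"], seed=7
)

for row in partitioned:
    print(row)
```

Under this assumption, a row whose country is France can only receive a credit card number that originally belonged to another French row, so per-country statistics are preserved.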
