Configuring the masking operations - 7.3

Data privacy

Version
7.3
Language
English
Product
Talend Big Data Platform
Talend Data Fabric
Talend Data Management Platform
Talend Data Services Platform
Talend MDM Platform
Talend Real-Time Big Data Platform
Module
Talend Studio
Last publication date
2024-04-03

The alpha_values.csv file contains the allowed alphabetic values: all letters in the A to Z range (minus S, L, O, I, B, Z). The alphanum_values.csv file contains the allowed alphanumeric values: the values from alpha_values.csv and digits.
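As a rough illustration of what the two files contain, the allowed value sets can be rebuilt from the description above. This is only a sketch: the names `EXCLUDED_LETTERS`, `alpha_values` and `alphanum_values` are chosen for the example, and the actual layout of the CSV files (for instance, one value per line) is not stated here, so the snippet builds the sets in memory without writing any file.

```python
import string

# Letters excluded from the allowed values, as listed in the text above.
EXCLUDED_LETTERS = set("SLOIBZ")

# Allowed alphabetic values: A to Z minus the excluded letters.
alpha_values = [c for c in string.ascii_uppercase if c not in EXCLUDED_LETTERS]

# Allowed alphanumeric values: the alphabetic values plus the digits 0 to 9.
alphanum_values = alpha_values + list(string.digits)

print(len(alpha_values), len(alphanum_values))  # prints: 20 30
```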

Before you begin

  • You retrieved the alpha_values.csv and alphanum_values.csv files from the Downloads tab in the left panel of this page.
  • You defined context variables for the alpha_values.csv and alphanum_values.csv files. For further information, see How to define context variables for a Job on Talend Help Center.

Procedure

  1. Double-click tPatternMasking to display its Basic settings view in the Component tab.
  2. If required, click Sync columns to retrieve the schema defined in the input component.
  3. Click the Edit schema button to open the schema dialog box.

    tPatternMasking adds a read-only column to the output schema.

    The ORIGINAL_MARK column labels output records:

    • Original records are labeled true.
    • Substitute records are labeled false.
  4. In the Modifications table, click the [+] button to add ten rows for configuring the data masking operations.
    The first nine rows define the masking operation for each of the first nine characters in the input values. The last row defines the masking operation for the last two characters in the input values.
    The dash is used as a separator in the input values. You do not need to configure masking operations for separators because the masked output has the same separators as the input values.
  5. Configure the masking operations for the first, fourth and seventh characters that appear in the input:
    1. Click the Column to mask field of the first row and select the column that contains the data to be masked.
      In this example, select MBI.
    2. From the Field type field, select Interval as the field type the data belongs to and enter the range of authorized numeric values in the Values field.
      In this example, the purpose is to mask the first character with a digit in the 1 to 9 range ("1,9"). The fourth and seventh characters will be masked with a digit in the 0 to 9 range ("0,9").
    3. Apply the same configuration to the fourth and seventh rows of the Modifications table, entering "0,9" as the range in the Values field.
  6. Configure the masking operations for the second, fifth, eighth and ninth characters that appear in the input:
    1. Click the Column to mask field of the second row and select the column that contains the data to be masked.
    2. From the Field type field, select Enumeration from file.
    3. Click the Values field and press Ctrl + Space to select the variable for the file that contains the authorized values.
      In this example, select the variable for the file that contains the authorized alphabetic values.
    4. Apply the same configuration to the fifth, eighth and ninth rows of the Modifications table.
  7. Configure the masking operations for the third and sixth characters that appear in the input:
    1. Click the Column to mask field of the third row and select the column that contains the data to be masked.
    2. From the Field type field, select Enumeration from file.
    3. Click the Values field and press Ctrl + Space to select the variable for the file that contains the authorized values.
      In this example, select the variable for the file that contains the authorized alphanumeric values.
    4. Apply the same configuration to the sixth row of the Modifications table.
  8. Configure the masking operations for the last two characters that appear in the input:
    1. Click the Column to mask field of the last row and select the column that contains the data to be masked.
    2. From the Field type field, select Interval as the field type the data belongs to and enter "0,99" for the range of authorized numeric values in the Values field.
      In this example, the purpose is to mask the last two characters with two digits, each in the 0 to 9 range.
      To mask each of the two characters separately, you can add a row to the Modifications table, define two masking operations and enter "0,9" for the range of authorized numeric values.