Masking data - Cloud

Talend Cloud Data Preparation User Guide

Author: Talend Documentation Team

When working with sensitive data, such as names, addresses, credit card numbers, or social security numbers, you might want to mask it.

To protect the original data while keeping a functional substitute, use the Mask data (obfuscation) function.

Procedure

  1. Select the column on which you want to apply the data masking.
  2. In the Functions panel, type Mask data (obfuscation) and click the result to open the options for the associated function.
  3. In the Masking function drop-down list, select a masking routine, for example semantic masking.
  4. From the Masking mode drop-down list, select Repeatable.
    Unlike the default Random mode, Repeatable mode lets you set a Seed. The seed determines the output of the masking function, so for a given seed, identical source values are always replaced with the same masked values.
  5. Leave the default Seed value.
  6. Click Submit.

Results

The data in the column has been replaced by random but usable substitutes. The effect of the Mask data (obfuscation) function varies with the semantic type of the column. For more information, see Data Masking effects.
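
To illustrate the difference between the Random and Repeatable modes described in the procedure, here is a minimal Python sketch. It is not how Talend Data Preparation implements masking. The function names (mask_random, mask_repeatable) and the HMAC-based approach are assumptions chosen only to show how a seed can make masked output reproducible while keeping the shape of the original value.

```python
import hashlib
import hmac
import secrets
import string


def _mask_chars(value, pick):
    """Replace each ASCII letter or digit with one from the same class.

    Other characters (spaces, hyphens, and so on) are kept as-is, so the
    masked value keeps the length and layout of the original.
    """
    out = []
    for i, ch in enumerate(value):
        if ch in string.digits:
            out.append(pick(i, string.digits))
        elif ch in string.ascii_lowercase:
            out.append(pick(i, string.ascii_lowercase))
        elif ch in string.ascii_uppercase:
            out.append(pick(i, string.ascii_uppercase))
        else:
            out.append(ch)
    return "".join(out)


def mask_random(value):
    """Hypothetical 'Random' mode: a fresh substitute on every call."""
    return _mask_chars(value, lambda i, alphabet: secrets.choice(alphabet))


def mask_repeatable(value, seed):
    """Hypothetical 'Repeatable' mode: same value and seed, same output."""
    # The seed keys an HMAC over the value; the digest bytes then drive
    # the choice of each replacement character.
    digest = hmac.new(seed.encode("utf-8"), value.encode("utf-8"),
                      hashlib.sha256).digest()
    return _mask_chars(
        value,
        lambda i, alphabet: alphabet[digest[i % len(digest)] % len(alphabet)],
    )


if __name__ == "__main__":
    card = "4111-1111-1111-1111"
    print(mask_random(card))                # differs between runs
    print(mask_repeatable(card, "seed-1"))  # stable for this seed
    print(mask_repeatable(card, "seed-1"))  # same as the line above
```

In this sketch, two rows containing the same source value receive the same substitute when the seed is unchanged, which is the property that Repeatable mode provides.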