Masking data - 7.3

Talend Data Preparation User Guide

Version: 7.3
Language: English
Product: Talend Big Data, Talend Big Data Platform, Talend Data Fabric, Talend Data Integration, Talend Data Management Platform, Talend Data Services Platform, Talend ESB, Talend MDM Platform, Talend Real-Time Big Data Platform
Module: Talend Data Preparation
Last publication date: 2023-11-28

When you work with sensitive data, such as names, addresses, credit card numbers, or social security numbers, you might want to mask it.

To protect the original data while keeping a functional substitute, use the Mask data (obfuscation) function.

Procedure

  1. Select the column on which you want to apply the data masking.
  2. In the Functions panel, type Mask data (obfuscation) and click the result to open the options for the associated function.
  3. In the Masking function drop-down list, select your masking routine, for example Semantic masking.
  4. From the Masking mode drop-down list, select Repeatable.
    Selecting Repeatable instead of the default Random mode lets you set a Seed. The seed determines the output of the masking function, so that identical source values are always replaced with the same masked values.
  5. Leave the default Seed value.
  6. Click Submit.
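
The difference between the Random and Repeatable modes can be illustrated with a minimal Python sketch. This is not how Talend Data Preparation implements masking; the function names mask_random and mask_repeatable, the seed string, and the character-substitution rule are all assumptions made for illustration. The sketch only shows the general idea that a seed ties each source value to a single masked value.

```python
import hashlib
import hmac
import random
import string


def _substitute(value: str, rng: random.Random) -> str:
    """Swap each ASCII letter or digit for a random one of the same kind."""
    out = []
    for ch in value:
        if ch in string.digits:
            out.append(rng.choice(string.digits))
        elif ch in string.ascii_uppercase:
            out.append(rng.choice(string.ascii_uppercase))
        elif ch in string.ascii_lowercase:
            out.append(rng.choice(string.ascii_lowercase))
        else:
            out.append(ch)  # keep separators such as "-" or " "
    return "".join(out)


def mask_random(value: str) -> str:
    """Random mode (hypothetical): a fresh, unseeded generator on every call."""
    return _substitute(value, random.Random())


def mask_repeatable(value: str, seed: str) -> str:
    """Repeatable mode (hypothetical): the generator is derived from seed and value."""
    digest = hmac.new(seed.encode("utf-8"), value.encode("utf-8"), hashlib.sha256).digest()
    return _substitute(value, random.Random(int.from_bytes(digest, "big")))


if __name__ == "__main__":
    card = "4111-1111-1111-1111"
    print(mask_repeatable(card, "my-seed"))
    print(mask_repeatable(card, "my-seed"))  # same as the line above
    print(mask_random(card))                 # typically differs on each run
```

In this sketch, calling mask_repeatable twice with the same value and seed returns the same string, which is the property that keeps masked values consistent across rows and runs. Changing the seed would generally change the masked output.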

Results

The data in the column has been replaced with usable substitutes that look random. Because the Repeatable mode was selected, identical source values are masked to identical values. The effect of the Mask data (obfuscation) function varies depending on the semantic type of the column to which you apply it. For more information, see Data Masking effects.