Replacing actual data with realistic values - 7.3

Data privacy

Product
Talend Big Data Platform
Talend Data Fabric
Talend Data Management Platform
Talend Data Services Platform
Talend MDM Platform
Talend Real-Time Big Data Platform
Module
Talend Studio

Procedure

  1. Double-click tDataMasking to display the Basic settings view and define the component properties.
  2. If required, click Sync columns to retrieve the schema defined in the input component.
  3. Click the Edit schema button to open the schema dialog box.
    tDataMasking proposes one predefined read-only column, as shown in the capture below.
    This column indicates, with true or false, whether the output record is an original or a substitute record, respectively.
  4. Move to the output schema any input columns you want to show in the results, click OK, and accept the prompt to propagate the changes.
  5. In the Modifications table, click the [+] button to add four rows, and perform the following actions:
    • In the Input Column, select the columns whose content you want to substitute.
    • In the Category column, select from the list the category of the function you want to use to mask data.
    • In the Function column, select from the list the function you want to use to mask data.
    • When available, in the Parameter column, select from the list the method to be used by the function to mask data.
    • When available, in the Parameter column, enter a value, a pattern or a path to be used by the function to mask data.
    In this example, the Job generates inauthentic credit card numbers, replaces the first three letters of first names, replaces last names with names from a local file, and replaces the local part of email addresses with X characters.
  6. Click the Advanced settings tab and select the Output the original row check box.
    The Job outputs the original data rows in addition to the substitute rows.
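
To make the effect of this configuration concrete, the following Python sketch imitates the four substitutions and the Output the original row option outside of Talend Studio. It is an illustration only, not the code tDataMasking runs: the function names, the field names, the is_original flag name, the 16-digit card format with a Luhn check digit, and passing surnames as a list rather than reading them from a local file are all assumptions made for this example.

```python
import random
import string


def luhn_check_digit(partial: str) -> str:
    """Return the Luhn check digit for a string of digits."""
    total = 0
    # Walk right to left; the digit next to the check digit is doubled first.
    for i, ch in enumerate(reversed(partial)):
        d = int(ch)
        if i % 2 == 0:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return str((10 - total % 10) % 10)


def fake_credit_card(rng: random.Random) -> str:
    """Generate an inauthentic 16-digit number (assumed format) that passes Luhn."""
    body = "4" + "".join(str(rng.randrange(10)) for _ in range(14))
    return body + luhn_check_digit(body)


def replace_first_letters(value: str, n: int, rng: random.Random) -> str:
    """Replace the first n letters with random letters, preserving case."""
    head = []
    for c in value[:n]:
        if c.isupper():
            head.append(rng.choice(string.ascii_uppercase))
        elif c.islower():
            head.append(rng.choice(string.ascii_lowercase))
        else:
            head.append(c)  # leave non-letters untouched
    return "".join(head) + value[n:]


def mask_email_local_part(email: str) -> str:
    """Replace every character before the @ with X."""
    local, sep, domain = email.partition("@")
    return "X" * len(local) + sep + domain


def mask_rows(rows, surnames, rng, output_original=True):
    """Return substitute rows, optionally preceded by each original row."""
    out = []
    for row in rows:
        if output_original:
            # true marks an original record, false a substitute record
            out.append({**row, "is_original": True})
        out.append({
            "card": fake_credit_card(rng),
            "first_name": replace_first_letters(row["first_name"], 3, rng),
            "last_name": rng.choice(surnames),
            "email": mask_email_local_part(row["email"]),
            "is_original": False,
        })
    return out
```

Calling mask_rows with output_original=True returns two records per input row, which mirrors what selecting the Output the original row check box does in the Job: each original row appears together with its substitute, and the flag tells them apart.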