Hides original data behind random characters or figures, protecting the actual data while providing a functional substitute for occasions when showing sensitive real data is not advisable.

tDataMasking reads a data set row by row and creates a structurally similar but inauthentic version of the data after having applied specific functions on data fields. It generates one row for each input row.
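
The row-by-row behavior described above can be illustrated with a minimal Python sketch. It is not the component's implementation. The names `mask_rows`, `replace_digits`, and `replace_letters`, along with the sample row, are hypothetical and chosen only for illustration. The sketch assumes each row is a dictionary and that a masking function is configured per field.

```python
import random
import string


def replace_digits(value, rng):
    """Replace every digit with a random digit; keep other characters."""
    return "".join(
        rng.choice(string.digits) if ch.isdigit() else ch for ch in value
    )


def replace_letters(value, rng):
    """Replace every letter with a random ASCII letter of the same case."""
    out = []
    for ch in value:
        if ch.isupper():
            out.append(rng.choice(string.ascii_uppercase))
        elif ch.islower():
            out.append(rng.choice(string.ascii_lowercase))
        else:
            out.append(ch)  # spaces, punctuation and digits are kept
    return "".join(out)


def mask_rows(rows, rules, seed=None):
    """Yield one masked copy per input row (hypothetical helper).

    rules maps a field name to the masking function applied to it.
    Fields without a rule are passed through unchanged.
    """
    rng = random.Random(seed)
    for row in rows:
        masked = dict(row)  # copy, so the input row is not modified
        for field, func in rules.items():
            if masked.get(field) is not None:
                masked[field] = func(masked[field], rng)
        yield masked


# Illustrative input: one row with two sensitive fields.
rows = [{"name": "Ada Lovelace", "ssn": "123-45-6789", "city": "London"}]
rules = {"name": replace_letters, "ssn": replace_digits}
masked = list(mask_rows(rows, rules, seed=42))
```

Each output row keeps the same fields and the same format as its input row (string lengths, separators, letter case), which is what makes the result usable as a structurally similar substitute.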

You will be able to use the functional substitute for purposes such as testing and training. When manipulating Personally Identifiable Information (PII) or Sensitive Personal Data (SPD), you might want to protect and mask this data.

The definition of sensitive data is broad and may differ from one country or organization to another. In general, sensitive data is personal or business information whose disclosure poses a risk to the person or company in question.

Credit and debit card data, for example, is considered sensitive worldwide. Personal sensitive data is any piece of information that can be used to identify or locate a person. A non-exhaustive list of personal sensitive data may include: first and last names, email addresses, addresses, Social Security Number (SSN), credit card numbers, bank account numbers, race, gender, date of birth, salary and geolocation combined with time.

For further information about personal sensitive data, see Personally Identifiable Information.

Also, business sensitive data may include trade secrets, acquisition plans, financial data and customer information, among other possibilities.

In local mode, Apache Spark 2.4.0 and later versions are supported.

This component is not shipped with your Talend Studio by default. You need to install it using the Feature Manager. For more information, see Installing features using the Feature Manager.

For more technologies supported by Talend, see Talend components.
