This function randomly replaces the input value with one of the user-defined values.
This function applies to String or numeric data types.
The Randomly method randomly selects the value
from the list (or file). As a result, two similar input values can be masked
with different output values.
The Consistently method ensures that two similar input values are masked with the same output value.
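The difference between the two methods can be sketched in Python. This is an illustration only, not the component's actual implementation: the function names `mask_randomly` and `mask_consistently` are hypothetical, and deriving the index from a SHA-256 hash of the input is an assumption chosen simply to make the mapping repeatable.

```python
import hashlib
import random


def mask_randomly(value, replacements, rng=random):
    """Pick any replacement; the input value does not influence the choice."""
    return rng.choice(replacements)


def mask_consistently(value, replacements):
    """Pick a replacement derived from the input, so equal inputs give equal outputs.

    Assumption: a SHA-256 digest of the input selects the index. The real
    component may use a different (for example, seeded) scheme.
    """
    digest = hashlib.sha256(str(value).encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(replacements)
    return replacements[index]


replacements = ["item1", "item2", "item3"]
print(mask_randomly("Paris", replacements))      # may differ between calls
print(mask_consistently("Paris", replacements))  # same result on every call
```

Because the list is usually much shorter than the set of possible inputs, a consistent mapping of this kind necessarily sends some distinct inputs to the same output value.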
When using the Consistently method, the probability of generating duplicates can be calculated using the following formulas:
Using this approach, it is possible to calculate the probability of finding a pair that shares the same value within a group.
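This kind of calculation can be checked numerically. The sketch below uses the standard birthday-problem computation, assuming the n values are drawn independently and uniformly from k possibilities (k = 365 in the birthday case, or the number of list entries for this function). Whether it matches the exact formulas this page refers to is an assumption, and the function name is hypothetical.

```python
def probability_of_shared_value(n, k=365):
    """Probability that at least two of n uniform, independent draws from k values coincide."""
    if n > k:
        return 1.0  # more draws than distinct values: a repeat is certain
    p_all_distinct = 1.0
    for i in range(n):
        # The (i + 1)-th draw must avoid the i values already taken.
        p_all_distinct *= (k - i) / k
    return 1.0 - p_all_distinct


print(round(probability_of_shared_value(23), 4))  # 0.5073
```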
For example, the probability that, in a group of
n people, two people have the same birthday is
This function requires an extra parameter.
The extra parameter can be:
The values must be stored in a String and
separated by commas, for example: "item1, item2,
item3, etc.". This function uses the
If you use the Apache Spark Batch or the Apache Spark Streaming version of the component, enter the prefix before the file path:
Paths to folders are not supported.
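For the comma-separated form of the extra parameter, the splitting step might look like the following sketch. The helper name `parse_value_list` is hypothetical, and trimming the whitespace around each item is an assumption about how a value such as "item1, item2, item3" is read; the file-based form is not covered here.

```python
def parse_value_list(extra_parameter):
    """Split a comma-separated extra parameter into trimmed, non-empty values."""
    if not extra_parameter:
        return []  # parameter not set: no replacement values available
    # Assumption: surrounding whitespace is not part of a value.
    return [item.strip() for item in extra_parameter.split(",") if item.strip()]


print(parse_value_list("item1, item2, item3"))  # ['item1', 'item2', 'item3']
```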
If the extra parameter is not set, the function returns an empty String or
In the following example, the masked value is one of the values set as extra parameters.
Examples of a masked value