Masking Medicare beneficiary identifiers - 7.3

Data privacy

Version: 7.3
Language: English
Product: Talend Big Data Platform, Talend Data Fabric, Talend Data Management Platform, Talend Data Services Platform, Talend MDM Platform, Talend Real-Time Big Data Platform
Module: Talend Studio
Last publication date: 2024-04-03

This scenario applies only to Talend Data Management Platform, Talend Big Data Platform, Talend Real-Time Big Data Platform, Talend Data Services Platform, Talend MDM Platform and Talend Data Fabric.

For more technologies supported by Talend, see Talend components.

Using the tPatternMasking component, you can replace personally identifiable information, such as Medicare Beneficiary Identifiers (MBI), with realistic values in a consistent manner.

An MBI uniquely identifies a beneficiary of Medicare, the US federal health insurance program. It consists of 11 characters, excluding dashes, which follow this pattern in order:
  • A digit in the 1 to 9 range
  • A letter in the A to Z range (minus S, L, O, I, B, Z)
  • A digit or a letter in the A to Z range (minus S, L, O, I, B, Z)
  • A digit in the 0 to 9 range
  • A letter in the A to Z range (minus S, L, O, I, B, Z)
  • A digit or a letter in the A to Z range (minus S, L, O, I, B, Z)
  • A digit in the 0 to 9 range
  • A letter in the A to Z range (minus S, L, O, I, B, Z)
  • A letter in the A to Z range (minus S, L, O, I, B, Z)
  • A digit in the 0 to 9 range
  • A digit in the 0 to 9 range

For example, 1EG4-TE5-MK73 is a valid MBI.
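
The position-by-position pattern above can be expressed as a regular expression. The following Python snippet is a minimal sketch for checking sample values before they are fed to the Job; it is not part of Talend Studio, and the names MBI_LETTERS, MBI_REGEX and is_valid_mbi are hypothetical. It assumes that dashes are purely cosmetic and removes them wherever they appear.

```python
import re

# Letters allowed in an MBI: A to Z minus S, L, O, I, B, Z (20 letters).
MBI_LETTERS = "ACDEFGHJKMNPQRTUVWXY"

_ALPHA = f"[{MBI_LETTERS}]"        # letter positions
_ALNUM = f"[0-9{MBI_LETTERS}]"     # digit-or-letter positions

# One character class per position, in the order listed in the pattern.
MBI_REGEX = re.compile(
    "[1-9]" + _ALPHA + _ALNUM + "[0-9]"
    + _ALPHA + _ALNUM + "[0-9]"
    + _ALPHA + _ALPHA + "[0-9][0-9]"
)


def is_valid_mbi(value: str) -> bool:
    """Return True if value matches the 11-character MBI pattern.

    Assumption: dashes are ignored, whatever their position.
    """
    return MBI_REGEX.fullmatch(value.replace("-", "")) is not None


# Usage: the example from the text passes, a leading 0 or an excluded letter fails.
print(is_valid_mbi("1EG4-TE5-MK73"))
print(is_valid_mbi("0EG4-TE5-MK73"))
print(is_valid_mbi("1SG4-TE5-MK73"))
```

This check only covers the format; it cannot tell whether a given MBI has actually been issued.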

This scenario describes a Job that uses the following components:

  • the tFixedFlowInput component generates MBIs;

  • the tPatternMasking component replaces each character of the original MBIs with a random digit or letter taken from a defined set of values, or with a random digit from a specified range;

  • the tLogRow component outputs the substitute data set.
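
To illustrate the idea behind the masking step outside of Talend Studio, the following Python snippet is a rough sketch of position-wise substitution. It does not reproduce the algorithm used by tPatternMasking: the names POSITION_SETS and mask_mbi, the secret parameter and the hash-based seeding are all assumptions made for this example. Each position receives a character drawn from the set allowed at that position, and seeding the generator from the input makes the result repeatable for the same MBI.

```python
import hashlib
import random

DIGITS = "0123456789"
LETTERS = "ACDEFGHJKMNPQRTUVWXY"  # A to Z minus S, L, O, I, B, Z

# Allowed characters for each of the 11 positions of an MBI.
POSITION_SETS = [
    "123456789", LETTERS, DIGITS + LETTERS, DIGITS,
    LETTERS, DIGITS + LETTERS, DIGITS,
    LETTERS, LETTERS, DIGITS, DIGITS,
]


def mask_mbi(mbi: str, secret: str = "demo-secret") -> str:
    """Return a substitute MBI that still follows the MBI pattern.

    Assumption: consistency is obtained by seeding a private generator
    from a hash of a (hypothetical) secret and the input value.
    """
    raw = mbi.replace("-", "")
    if len(raw) != len(POSITION_SETS):
        raise ValueError("expected 11 characters, excluding dashes")
    digest = hashlib.sha256((secret + raw).encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest, "big"))
    # Draw one replacement character per position from its allowed set.
    masked = "".join(rng.choice(chars) for chars in POSITION_SETS)
    return f"{masked[:4]}-{masked[4:7]}-{masked[7:]}"


# Usage: the same input always yields the same substitute value.
print(mask_mbi("1EG4-TE5-MK73") == mask_mbi("1EG4-TE5-MK73"))
```

In this sketch two different inputs can map to the same substitute, and Python's random module is not designed for security purposes, so it is only suitable for illustrating the principle.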