Reference data pattern

Availability note: Deprecated
This section describes best practices for modeling reference data.

When creating MDM entities, you will often see the same pattern emerging, especially when dealing with reference data. For example, you may want to hold a list of titles instead of having a free text box for a title:

<title>
  <titleId> Unique identifier for this title </titleId>
  <name> The actual master reference data value, e.g. ‘Mr’, ‘Mrs’ etc. </name>
  <description> Optional description of this value </description>
  <sourceXrefList>
    <sourceXrefItem>	0..many repeating list
      <sourceSystemFk> Optional FK to a source system entity </sourceSystemFk>
      <sourcePk> Optional: the PK of this data in a source system </sourcePk>
      <sourceLabel> This value as it is in source or a synonym. E.g. for ‘Mr’ in MDM, the value in a given source system might be ‘Mister’. Or it could just be a synonym for use in lookups / synonym indexes and not linked to a particular source system. </sourceLabel>
    </sourceXrefItem>
  </sourceXrefList>
</title>

Now consider that you also want to hold a list of genders:

<gender>
  <genderId> Unique identifier for this gender </genderId>
  <name> The actual master reference data value, e.g. ‘Male’, ‘Female’ etc. </name>
  <description> Optional description of this value </description>
  <sourceXrefList>
    <sourceXrefItem>	0..many repeating list
      <sourceSystemFk> Optional FK to a source system entity </sourceSystemFk>
      <sourcePk> Optional: the PK of this data in a source system </sourcePk>
      <sourceLabel> This value as it is in source or a synonym. E.g. for ‘Male’ in MDM, the value in a given source system might be ‘M’. Or it could just be a synonym for use in lookups / synonym indexes and not linked to a particular source system. </sourceLabel>
    </sourceXrefItem>
  </sourceXrefList>
</gender>

The two structures are identical apart from their element names. If you model every type of reference data as its own entity, you bloat your model and end up building a separate integration for each copy of the same pattern. Instead, it is recommended to use a set of reusable reference data entities.

refDataList: a list of lists

Each record in refDataList represents one list. Its description holds the list name, for example 'title' or 'gender'.
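
As a minimal sketch in the same notation as the title and gender structures above, refDataList could look like the following. Only the entity name and the use of the description come from this section; the element name refDataListId is an assumption made for illustration.

```xml
<refDataList>
  <!-- refDataListId is a hypothetical element name -->
  <refDataListId> Unique identifier for this list </refDataListId>
  <description> The list name, e.g. 'title', 'gender' etc. </description>
</refDataList>
```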

refDataItem: an entry in a list

This is the same structure as your title and gender structures above, but genericized and with the addition of a foreign key to a refDataList. This allows you to easily group all reference data of the same type together and provides some governance and control over the authorship process.
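
A possible shape for refDataItem, sketched in the same notation, is shown below. It reuses the elements of the title and gender structures and adds the foreign key to refDataList. The element name refDataListFk is an assumption made for illustration; refDataItemId is the identifier that the integration layer looks up.

```xml
<refDataItem>
  <refDataItemId> Unique identifier for this item </refDataItemId>
  <!-- refDataListFk is a hypothetical element name -->
  <refDataListFk> FK to the refDataList record this item belongs to </refDataListFk>
  <name> The actual master reference data value, e.g. 'Mr', 'Male' etc. </name>
  <description> Optional description of this value </description>
  <sourceXrefList>
    <sourceXrefItem> <!-- 0..many repeating list -->
      <sourceSystemFk> Optional FK to a source system entity </sourceSystemFk>
      <sourcePk> Optional: the PK of this data in a source system </sourcePk>
      <sourceLabel> This value as it is in source, or a synonym </sourceLabel>
    </sourceXrefItem>
  </sourceXrefList>
</refDataItem>
```

With this layout, titles and genders become records in a single entity, distinguished only by the list they point to.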

Integration layer considerations: the Talend Data Integration Jobs and services should look up the correct refDataItemId and ensure that the value is part of the correct list.
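
To illustrate the lookup, the record below is a hypothetical refDataItem for the title 'Mr'. All identifier values, and the element name refDataListFk, are invented for this example. A Job that receives 'Mister' from a source system would match on sourceLabel, check that refDataListFk points to the 'title' list, and then use the refDataItemId it finds.

```xml
<refDataItem>
  <refDataItemId>101</refDataItemId>   <!-- illustrative value returned by the lookup -->
  <refDataListFk>1</refDataListFk>     <!-- assumed to be the id of the 'title' list -->
  <name>Mr</name>
  <sourceXrefList>
    <sourceXrefItem>
      <sourceLabel>Mister</sourceLabel> <!-- the value as it appears in the source -->
    </sourceXrefItem>
  </sourceXrefList>
</refDataItem>
```

Restricting the match to one list matters because the same label can appear in several lists. 'M', for example, could be a source value for a gender and also a code in an unrelated list.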

Talend MDM Web UI considerations: if users can insert or update records that use the reference data through the Talend MDM Web UI, you generally want to present them only with the values from the correct list for a given relationship, not with all possible reference data values. To do this, use a foreign key filter in the model. This is covered later in this article.