How to read data effectively that pertains to an MDM entity with many foreign key fields

author
Talend Documentation Team
EnrichVersion
6.4
6.3
6.2
6.1
6.0
EnrichProdName
Talend MDM Platform
Talend Data Fabric
Talend Open Studio for MDM
task
Data Governance > Retrieving data
Data Governance > Third-party systems > MDM components
Data Quality and Preparation > Third-party systems > MDM components
Design and Development > Third-party systems > MDM components
EnrichPlatform
Talend Studio

How to read data effectively that pertains to an MDM entity with many foreign key fields

If an MDM entity has a large number of foreign key fields, it may take you a long time to query data records that pertain to the entity using the tMDMInput component.

This is because tMDMInput reads data in an MDM Hub directly, and the underlying query is to perform JOIN operations on all foreign key related tables. When there are a large number of tables and many data records exist within the tables, it will be time-consuming to execute such kind of queries.

Environment

This article applies to Talend MDM products, version 5.6.3 or higher.

Resolution

As a workaround, you can use tMDMViewSearch and tExtractXMLField. tMDMViewSearch enables you to limit the JOIN operations only to the tables you want to query, thus achieving a better performance.

  • tMDMViewSearch: This component reads data in an MDM Hub from a specific view, thus narrowing the range of data records to be queried. A View is one of the possible ways to query the MDM Hub.
  • tExtractXMLField: This component allows you to specify the fields of interest.

Since Release 6.2.0, a new component tMDMRestInput has been introduced to enable you to read data from the MDM Hub through the REST API with query language, and it can also be used to address this kind of issues.

For more information about the components, community users can refer to Talend Open Studio Components Reference Guide , and platform users can refer to Talend Components Reference Guide.

Example

The Job read_data_mdm_entity_fk_fields_job.zip available in the Downloads tab uses the tMDMViewSearch and tExtractXMLField components to query some data from the entity Person, and uses the tLogRow component to display the query result.

Prerequisites :

  • Make sure the data container read_data_mdm_entity_fk_fields_person.zip, available in the Downloads tab, and its corresponding data model Person already exist. In the Person data model, the Person entity has foreign key relationships with many other data entities, for example, Hobby, House, Car and Children.
  • Make sure the composite view of Person and Hobby already exists (read_data_mdm_entity_fk_fields_view in the Downloads tab).
  • Make sure you are an authorized user who already loaded some data into the relevant entities.

The figure below shows the data model Person and its entities, and the foreign key relationships between entities.

The figure below shows the composite view of Person and Hobby.

Do the following to complete the Job:

  1. In the tMDMViewSearch component, set the schema to receive the XML data. In this example, add one column Person of String type.
  2. Configure the component tMDMViewSearch to query data records from the specific view Browse_items_Person#AndHobby.
  3. Configure the component tExtractXMLField to extract the fields of interest. In this example, Id, Name of a person will be extracted.
  4. Save and execute the Job.

The extracted data records are displayed on the console.