Architecture - Master Data Management 5.6

Irshad Burtally


This article describes the high-level and physical architecture of the Talend Master Data Management product. It is provided only as guidance, to give architects some direction when designing the architecture of an MDM platform. The architecture must meet the functional and non-functional requirements of the MDM project, so the information below is only a starting point.

High Level Architecture

The figure below shows the logical architecture for the Talend Master Data Management platform. The backbone is the active data model, with embedded semantics, a search engine and hierarchies, role-based access, rules-based controls, and an audit trail.

The back end connects to systems and applications, in batch or in real time, and to third-party data services, for example for address validation, geocoding and B2B exchanges.

The front end is dedicated to the implementation and administration team and to the data stewards, who work through a web-based interface and workflows. Line-of-business (LOB) users may access the MDM indirectly through their business applications, or through custom web or mobile applications that consume the generated master data services.

Key Capabilities:

  • Multi-domain: Master any Party (customers, suppliers, vendors) or Product domain using a single technology
  • Flexible Integration: Connects to 800+ data sources, in real time and in batch, minimizing application coding
  • End-to-end Data Quality: Provides data profiling, standardization and enrichment to improve information accuracy
  • Persuasive Governance: Complete BPM engine and Data Stewardship console to enforce Governance processes

The Talend MDM product leverages the capabilities of the Talend Data Management Platform product.

For more information about the architecture of Talend Data Management 5.6, see Architecture - Talend Data Management 5.6.

Talend Software Components

The Talend Platform for MDM is made up of four Talend products:

  • Talend Data Integration
  • Talend Data Quality
  • Enterprise Service Bus
  • Master Data Management

The components for Master Data Management are:

  • Talend MDM Server
  • Talend MDM Server Storage
  • MDM Perspective in Studio

Talend MDM Server

The Talend MDM Server, also known as the MDM Engine or MDM Runtime, exposes a Web UI and a web service API for accessing and handling master data. The Talend MDM components in the Studio use the web service API to provide rich data integration functionality. The Talend MDM Server serves the following purposes:

  • Master Data Management Server
  • Web UI for MDM Governance tasks
  • MDM Engine
  • Master Data Indexing
  • Configuration of MDM Storage layer
  • Integrated workflow

Technical Details:

  • Server component
  • There can be many instances in a single Talend environment, but one per environment is more typical (depends on the runtime license)
  • Runs in a Talend-modified version of JBoss (which should only be used for Talend MDM)
  • JBoss and MDM are installed by the Talend Installer
  • BPM can run in the same runtime or in a different one (Talend Platform Universal only)
  • The MDM (JBoss) server is typically run as a service or daemon

Talend MDM Server Storage

The Talend MDM Server Storage component can be any one of the supported databases. It holds Master Data, Staging Data, the MDM Journal, MDM System Data, and so on.

Technical Details:

  • Server component
  • A number of relational databases are supported
  • One set of databases/schemas is required per MDM server
  • The database can be very large, depending on the volume of Master Data records
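
The requirement of one set of databases or schemas per MDM server can be checked mechanically once the planned servers are written down. The following is a minimal Python sketch, not part of any Talend tooling: the `MdmServer` class, the server names and the storage set labels are all hypothetical and exist only to illustrate the rule.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class MdmServer:
    """One planned Talend MDM Server instance (all values are hypothetical)."""
    name: str
    environment: str   # e.g. "development", "test", "production"
    storage_set: str   # label for the set of databases/schemas it uses


def find_shared_storage(servers):
    """Return {storage_set: [server names]} for sets used by more than one server."""
    users = defaultdict(list)
    for server in servers:
        users[server.storage_set].append(server.name)
    return {s: names for s, names in users.items() if len(names) > 1}


servers = [
    MdmServer("mdm-dev", "development", "mdm_dev"),
    MdmServer("mdm-test", "test", "mdm_test"),
    # Misconfigured on purpose: reuses the test storage set.
    MdmServer("mdm-prod", "production", "mdm_test"),
]

for storage_set, names in find_shared_storage(servers).items():
    print(f"storage set {storage_set!r} is shared by: {', '.join(names)}")
```

An empty result means every MDM server in the plan has its own storage set, which is what the rule above asks for.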

MDM Perspective in Talend Studio

The MDM perspective in the Talend Studio enables the following:

  • Develop MDM Model
  • Configure Data Container
  • Configure Events through Processes and Triggers (Processes can be DI Jobs)
  • Configure views, roles, match rules, etc.
  • Develop business process workflow
  • Deploy MDM artifacts to the Talend MDM Server
  • Synchronize MDM artifacts between the Talend Studio and the Talend MDM Server

Physical Architecture For MDM

Talend recommends that customers plan at least three environments: Development, Test and Production. The physical architecture of a typical setup for each of these environments is described below.

The architecture team must perform a sizing exercise based on the functional and non-functional requirements of the project(s) and design the correct architecture for development, test and production environments that matches the needs of the business.
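
As a starting point for such a sizing exercise, candidate hosts can be compared against a set of minimums. The sketch below is illustrative Python only. It assumes the Execution Server figures from the typical development sizing in this article (8 cores, 16 GB RAM minimum, 100 GB disk) as placeholder minimums, and the `HostSpec` class and candidate values are hypothetical. Real minimums come from the Installation Guide and from the project's own requirements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HostSpec:
    cpu_cores: int
    ram_gb: int
    disk_gb: int


# Placeholder minimums taken from the development Execution Server sizing
# in this article; replace them with the outcome of the sizing exercise.
EXECUTION_SERVER_MIN = HostSpec(cpu_cores=8, ram_gb=16, disk_gb=100)


def shortfalls(candidate, minimum):
    """List each resource where the candidate is below the minimum."""
    problems = []
    for field in ("cpu_cores", "ram_gb", "disk_gb"):
        have, need = getattr(candidate, field), getattr(minimum, field)
        if have < need:
            problems.append(f"{field}: have {have}, need at least {need}")
    return problems


# Hypothetical candidate hosts, one per recommended environment.
candidates = {
    "development": HostSpec(cpu_cores=8, ram_gb=32, disk_gb=100),
    "test": HostSpec(cpu_cores=4, ram_gb=16, disk_gb=100),
    "production": HostSpec(cpu_cores=16, ram_gb=64, disk_gb=50),
}

for environment, spec in candidates.items():
    issues = shortfalls(spec, EXECUTION_SERVER_MIN)
    print(environment, "OK" if not issues else "; ".join(issues))
```

A check like this only catches hosts that fall below a floor. It does not replace sizing against expected data volumes and workloads.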

Typical Development Environment Architecture

The diagram below shows a typical development environment for a small team of 5-10 developers. This architecture extends the architecture of Talend Data Management 5.6, so this section highlights only the extra requirements for MDM.

For more information about the architecture of Talend Data Management 5.6, see Architecture - Talend Data Management 5.6.

In the architecture above, additional Execution Servers may be needed if many tasks (that is, jobs), services and routes are running. The Execution Server runs the Talend Runtime components.

Refer to the Talend Data Fabric Installation Guide for details on supported operating systems, Java versions and database engines, and on the minimum processor, memory and disk requirements.

Note: A server may be hosting several components that make up the platform.

Workstation/Server Role and Typical Sizing

Execution Server

Role: This is where all Talend jobs, services and routes run. They are deployed through Talend Administration Center.

Typical sizing:

  • OS: Windows/Linux (see the Installation Guide)
  • CPU: 8 cores minimum
  • RAM: 32+ GB (16 GB minimum)
  • Disk size: 100 GB

MDM Server

Role: The MDM Server hosts the following:

  • MDM Server
  • Talend DQ Portal

Typical sizing:

  • OS: Windows/Linux (see the Installation Guide)
  • CPU: 8 cores minimum
  • Disk size: 100 GB