MDM query language and REST data access

EnrichVersion
6.4
EnrichProdName
Talend MDM Platform
Talend Data Fabric
task
Data Governance > Consolidating data
Data Governance > Retrieving data
EnrichPlatform
Talend MDM Server
Talend MDM Web UI

MDM query language and REST data access

Talend MDM proposes a REST API that enables users to perform CRUD (Create/Request/Update/Delete) operations on MDM records in a more convenient manner.

This article focuses mainly on the read and write operations of the MDM REST API, and then gives examples of basic and more complex queries that you can include in REST API calls.

This article applies to all Talend Platform products with MDM 6.0 and later.

REST API

Talend MDM REST API is a modern, developer-friendly API that simplifies the interfaces to MDM, allowing developers to interact with MDM data more easily and directly.

Talend MDM Web UI provides access to the MDM REST resources documentation, which allows you to refer to the query language syntax and related parameters and test the execution of operations.

You can also integrate data API and transactions API. For more information, see Transactions API integration.

Create a record

Create a new record in the specified data container.

Request

POST /data/{containerName}

Request URL

http://{serverurl}/talendmdm/services/rest/data/{containerName}?container={containerType}

Query parameters

  • containerName: This is a String value which specifies the name of the data container in which you want to create a new data record.
  • containerType: This is a String value which represents the type of the data container. It is either ‘MASTER’ (default) or ‘STAGING’.

Headers

  • Content-Type: application/xml or text/xml
  • Authorization: Basic Authentication scheme

Request Body

An XML representation of the data record you want to create.

Response Body

No content.

Response Code

200 - If the operation is executed successfully, the data record will be written into the specified data container.

Read records

Read data records from the specified data container.

You can also retrieve a snapshot of a specific MDM record and navigate through the record history. For more information, see the article MDM Time Machine.

Request

PUT /data/{containerName}/query

Request URL

http://{serverurl}/talendmdm/services/rest/data/{containerName}/query?container={containerType}

Query parameters

  • containerName: This is a String value which specifies the name of the data container from which you want to read data records.
  • containerType: This is a String value which represents the type of the data container. It is either ‘MASTER’ (default) or ‘STAGING’, which is not case sensitive.

Request Body

For more information about the query language, see SELECT clauses.

Headers

  • Content-Type: It must be application/json.
  • Authorization: Basic Authentication scheme
  • Accept: application/xml or application/json. The Accept header tells the server to return either JSON or XML content. By default, the result is returned in JSON format.

Response Body

The data record(s) you queried from the specified data container.

Update a record

Update a data record completely in the specified data container.

Request

PUT /data/{containerName}

Request URL

http://{serverurl}/talendmdm/services/rest/data/{containerName}?container={containerType}

Query parameters

  • containerName: This is a String value which specifies the name of the data container in which you want to update a data record.
  • containerType: This is a String value which represents the type of the data container. It is either ‘MASTER’ (default) or ‘STAGING’.

Request Body

An XML representation of the new data record you want to use to replace the existing one.

The id element(s) in the request body determine which record is to be updated.

Headers

  • Authorization: Basic Authentication scheme

Response Body

No content.

Response Code

200 - If the operation is executed successfully, the whole data record will be replaced by the provided input. If you want to preserve some existing values, use the partial update operation instead. For more information, see Update a record partially.

Update a record partially

Update a data record partially in the specified data container.

Request

PATCH /data/{containerName}

Although "PATCH" is a valid HTTP verb, not all REST tools support it. For more information, see http://stackoverflow.com/questions/4260084/testing-tools-for-rest-that-support-http-patch.

Request URL

http://{serverurl}/talendmdm/services/rest/data/{containerName}?container={containerType}

Query parameters

  • containerName: This is a String value which specifies the name of the data container in which you want to update a data record partially.
  • containerType: This is a String value which represents the type of the data container. It is either ‘MASTER’ (default) or ‘STAGING’.

Request Body

An XML representation of the new data record you want to use to replace the existing one.

The id element(s) in the request body determine which record is to be updated.

Headers

  • Authorization: Basic Authentication scheme

Response Body

No content.

Response Code

200 - If the operation is executed successfully, the data record which is identified by the id element(s) will be updated partially according to the provided input.

Delete a record (by id)

Delete a data record by the specified id.

Request

DELETE /data/{containerName}/{type}/{id}

Request URL

http://{serverurl}/talendmdm/services/rest/data/{containerName}/{type}/{id}

By default, an operation of deleting a record by id will generate an entry in the UpdateReport.

However, to avoid generating an entry in the UpdateReport, add the query parameter 'updateReport=false' in the request URL. For example, http://localhost:8180/talendmdm/services/rest/data/Customer/Address/1?updateReport=false.

Query parameters

  • containerName: This is a String value which specifies the name of the data container from which you want to delete a data record.
  • type: This is a String value which specifies the name of the entity from which you want to delete a data record.
  • id: Specifies the id of the data record to be deleted. For composite ids, you can use a dot-separated value. For example, to delete an Address instance with id=1 in the Customer data container, use http://localhost:8180/talendmdm/services/rest/data/Customer/Address/1,and to delete an Address instance with id "1" and "2", use http://localhost:8180/talendmdm/services/rest/data/Customer/Address/1.2.

Request Body

No content.

Headers

  • Authorization: Basic Authentication scheme

Response Body

No content.

Response Code

200 - If the operation is executed successfully, the data record which is specified by the id element(s) will be deleted.

Delete a record (by query)

Delete one or more records that match the provided query.

Request

PUT /data/{containerName}/delete

Request URL

http://{serverurl}/talendmdm/services/rest/data/{containerName}/delete

The operations of deleting records by query will not generate entries in the UpdateReport.

Query parameters

  • containerName: This is a String value which specifies the name of the data container in which delete the queried data record(s).

Request Body

A query in JSON. For more information, see SELECT clauses.

For example, to delete all instances of Type1, use this query:

{
  "select": {
    "from": ["Type1"]
  }
}

To delete Type1 instances where Type1/value1 is greater than 1, use this query:

{
  "select": {
    "from": ["Type1"],
    "where": {
      "gt": [
        {"field": "Type1/value1"},
        {"value": "1"}
      ]
    }
  }
}

Headers

  • Authorization: Basic Authentication scheme

Response Body

No content.

Response Code

200 - If the operation is executed successfully, all records that match the provided query will be deleted.

Get all documents unique ID

Get the unique ids of all documents.

Request

GET /data/{containerName}/documents/uniqueIds

Request URL

http://{serverurl}/talendmdm/services/rest/data/{containerName}/documents/uniqueIds?type={typeName}

Query parameters

  • containerName: This is a String value which specifies the name of the data container from which you want to get the unique ids of all documents.
  • containerType: This is a String value which represents the type of the data container. It is either ‘MASTER’ (default) or ‘STAGING’.
  • typeName: This is a String value which specifies the name of the entity.

Headers

  • Authorization: Basic Authentication scheme

Request Body

No content.

Response Body

An array of document unique ids of all entities or the specified entity.

For example, use http://localhost:8180//talendmdm/services/rest/data/Product/documents/uniqueIds?type=Product to return the document unique ids of only the Product entity:

[
  "Product.Product.1",
  "Product.Product.2"
]

Response Code

200 - If the operation is executed successfully, the documents unique ids of all entities or the specified entity are returned in an array.

Get an XML string of a document

Get an XML string of a document.

Request

GET /data/{containerName}/documents/{uniqueId}

Request URL

http://{serverurl}/talendmdm/services/rest/data/{containerName}/documents/{uniqueId}?encoding={encoding}

Query parameters

  • containerName: This is a String value which specifies the name of the data container from which you want to get an XML string of a document.
  • uniqueId: This is a String value which represents the unique id of the document.
  • encoding: It defaults to UTF-8 and represents the char set of the result.

Headers

  • Authorization: Basic Authentication scheme

Request Body

No content.

Response Body

A string which represents the record matching the provided ID.

For example, using http://localhost:8180/talendmdm/services/rest/data/Product/documents/Product.Product.S1?encoding=UTF-8 will return:

<ii><c>Product</c><dmn>Product</dmn><dmr/><sp/><t>1484275339047</t><taskId>null</taskId><i>S1</i><p><Product><Id>S1</Id><Name>Shirt</Name><Description>A Shirt</Description><Features><Sizes><Size>Small</Size></Sizes><Colors><Color>White</Color></Colors></Features><Price>11.99</Price><Family>[1]</Family><OnlineStore>Talend Shop@@http://www.cafepress.com/Talend.231035933</OnlineStore><Stores></Stores></Product></p></ii>
.

Response Code

200 - If the operation is executed successfully, the record that matches the provided ID will be returned as a string.

Query language

In the REST API calls, you can use the query language to narrow down the data records of interest, order the query results or implement other functions according to your needs.

SELECT clauses

You can use basic SELECT clauses in REST API calls.

Select all instances of an entity

You can query all instances of an entity. For example, to return all instances of "Type1" records, use the query:

{
  "select": {
    "from": ["Type1"]
  }
}

Note that the "from" element is an array type, allowing you to select multiple types. This does not mean the query can return all Type1 and Type2 instances. The array is used for joins.

Paging: start and limit

Paging may be needed in some use cases such as building a table. This can be done with "start" and "limit" elements.

For example, to return results which start with the first instance of "Type1" (index is 0 based), together with the next 10 instances, use the query:

{
  "select": {
    "from": ["Type1"],
    "start": 0,
    "limit": 10
  }
}
Pay attention to the following:
  • Standard execution of such queries implies a creation of a list of size limit on the server side. If you specify a large value, it may lead to performance issues or an OutOfMemoryError. Therefore, make sure to use paging with relatively small values of limit.

  • A limit of Integer.MAX_VALUE (2^31 -1) is considered by MDM code as "no limit". Make sure to use a streaming strategy instead of a list of results. However, this means a limit of (2^31 - 2) will try to create a list of that size (usually causing OutOfMemoryError).

Select field values

Instead of querying the whole record, you can query only one field in the record.

For example, to return values of "id" field of "Type1" type, use the query:

{
  "select": {
    "from": ["Type1"],
    "fields": [
      {"field": "Type1/id"}
    ]
  }
}

To select more than one field, repeat the "field" element. For example, to return values of "id", "value1" and "value2" for each record of "Type1", use this query:

{
  "select": {
    "from": ["Type1"],
    "fields": [
      {"field": "Type1/id"},
      {"field": "Type1/value1"},
      {"field": "Type1/value2"}
    ]
  }
}

Aliasing

For naming or client reasons, you may want to alias the returned element. Aliasing can help differentiate values in case of name collision. This can be achieved with the "alias" element.

For example, to return all "id" values in all Type1 instances, but instead of wrapping values inside an element "id", the wrapping element will be named "a", use the query:

{
  "select": {
    "from": ["Type1"],
    "fields": [
      {"field": "Type1/id"},
      {
        "alias": [
          {"name": "a"},
          {"field": "Type1/id"}
        ]
      }
    ]
  }
}

For example, use the query to return results with "Type1/containedField1/value1" wrapped in "v1" and "Type1/containedField2/value1" in "v2", thus avoiding the possible naming collision that both "Type1/containedField1/value1" and "Type1/containedField2/value1" would be returned wrapped in a "value1" element:

{
  "select": {
    "from": ["Type1"],
    "fields": [
      {
        "alias": [
          {"name": "v1"},
          {"field": "Type1/containedField1/value1"}
        ]
      },
      {
        "alias": [
          {"name": "v2"},
          {"field": "Type1/containedField2/value1"}
        ]
      }
    ]
  }
}

Distinct field values

You can use the "distinct" keyword to get all distinct values for an entity field.

For example:

{
  "select": {
    "from": ["Type1"],
    "fields": [
      {"distinct": {"field": "Type1/value1"}}
    ]
  }
}

This query will return all distinct values for "value1" in type "Type1".

You can also alias a result using the "alias" element. For example:

{
  "select": {
    "from": ["Type1"],
    "fields": [
      {
        "alias": [
          {"name": "a0"},
          {"distinct": {"field": "Type1/value1"}}
        ]
      }
    ]
  }
}

This returns all distinct values for "value1" but inside an element named "a0".

Count

You can count how many results are returned by a query.

For example, to count how many Type1 instances are available, use the query:

{
  "select": {
    "from": ["Type1"],
    "fields": [
      {"count": {}}
    ]
  }
}

You can also combine the count operation with conditions to count how many instances match the specified criteria.

For example, to return how many instances of Type1 have a value1 greater than 1, use the query:

{
  "select": {
    "from": ["Type1"],
    "fields": [
      {"count": {}}
    ],
    "where": {
      "gt": [
        {"field": "Type1/value1"},
        {"value": "1"}
      ]
    }
  }
}

Count may also be used in order_by clauses.

Maximum and minimum

You can select the maximum and/or minimum value for a field in an entity.

For example, to retrieve the smallest value of "id" in the entity Type1, use the query:

{
  "select": {
    "from": ["Type1"],
    "fields": [
      {"min": {"field": "Type1/id"}}
    ]
  }
}

For example, to retrieve the largest value of "id" in Type1, use the query:

{
  "select": {
    "from": ["Type1"],
    "fields": [
      {"max": {"field": "Type1/id"}}
    ]
  }
}

For example, to retrieve both the largest and the smallest values of "id" in Type1, use the query:

{
  "select": {
    "from": ["Type1"],
    "fields": [
      {"max": {"field": "Type1/id"}},
      {"min": {"field": "Type1/id"}}
    ]
  }
}

Order by field's value

You can use as many "order by" clauses as needed in a query either in descending or ascending order to set the order in the returned result.

For example, to select all instances of Type1 and order results by id (from the largest to the smallest) and value1 (from the smallest to the largest), use the query:

{
  "select": {
    "from": ["Type1"],
    "order_bys": [
      {
        "order_by": [
          {"field": "Type1/id"},
          {"direction": "DESC"}
        ]
      },
      {
        "order_by": [
          {"field": "Type1/value1"},
          {"direction": "ASC"}
        ]
      }
    ]
  }
}

Order by field value occurrence

You can also order by field value occurrence so that the most common value is returned first down to the least common value (in case of DESC direction).

Here is one query example to return the most common value for "value1" down to the least common:

{
  "select": {
    "from": ["Type1"],
    "fields": [
      {"field": "Type1/value1"},
    ],
    "order_bys": [
      {
        "order_by": [
          {"count": {"field": "Type1/value1"}},
          {"direction": "DESC"}
        ]
      }
    ]
  }
}

You can also include "limit" in the query to obtain the top N common values.

Here is one query example to obtain the top 10 values:

{
  "select": {
    "from": ["Type1"],
    "fields": [
      {"field": "Type1/value1"},
    ],
    "order_bys": [
      {
        "order_by": [
          {"count": {"field": "Type1/value1"}},
          {"direction": "DESC"}
        ]
      }
    ],
    "limit": 10
  }
}

Conditions

Value comparisons (=, <, <=, >, >=)

For example, use the query below to return all instances of Type1 where "id" equals 1:

{
  "select": {
    "from": ["Type1"],
    "where": {
      "eq": [
        {"field": "Type1/id"},
        {"value": "1"}
      ]
    }
  }
}

Bear in mind the following:

  • There is no typing for "value", and the actual type of the value is decided by the MDM server based on the field type. If "id" is a string, MDM interprets the value as a string; if "id" is an integer, MDM interprets the value as an integer. This is quite important since the value must be correct for element type, for example, Type1/id = "abc" is incorrect if id is of number type.
  • The structure is always "operand": { field, value }.

The following operands are available for use:

Query language operand

Meaning

eq

equals

gt / gte

greater than/greater than or equals

lt / lte

less than / less than or equals

startsWith

starts with

contains

contains

Index based condition

You may specify an index for your condition in case the field is a repeatable element. The index is 0-based.

For example, to return all instances of Type1 where the second value of "list" equals "1", use the query:

{
  "select": {
    "from": ["Type1"],
    "where": {
      "eq": [
        {"index": [
          {"field": "Type1/list"},
          1
        ]},
        {"value": "1"}
      ]
    }
  }
}

Conditions between fields

You may use conditions between different fields. Only "eq" (EQUALS) is currently supported for such comparisons.

For example, to return the instances of Type1 where "id" equals "value1", use the query:

{
  "select": {
    "from": ["Type1"],
    "where": {
      "eq": [
        {"field": "Type1/id"},
        {"value": {"field": "Type1/value1"}}
      ]
    }
  }
}

Logic operators

To allow more complex queries, boolean operators (and/or/not) may be included. The structure is similar to what is previously presented "operator": { left, right } where left and/or right can be logic operators too.

For example, to return all Type1 instances where "id = 1 OR value1 = 0", use the query:

{
  "select": {
    "from": ["Type1"],
    "where": {
      "or": [
        {
          "eq": [
            {"field": "Type1/id"},
            {"value": "1"}
          ]
        },
        {
          "eq": [
            {"field": "Type1/value1"},
            {"value": "0"}
          ]
        }
      ]
    }
  }
}

A binary logic operator (and/or) must have 2 and only 2 children. For example, you can write a condition with "a OR b OR c":

{
  "select": {
    "from": ["Type1"],
    "where": {
      "or": [
        {
          "eq": [
            {"field": "Type1/id"},
            {"value": "1"}
          ]
        },
        {
          "or": [
            {
              "eq": [
                {"field": "Type1/id"},
                {"value": "2"}
              ]
            },
            {
              "eq": [
                {"field": "Type1/value1"},
                {"value": "4"}
              ]
            }
          ]
        }
      ]
    }
  }
}

You can also use mixed logic operators. There is no ambiguity in the evaluation order of conditions.

For example, to return all Type1 instances where "(id = 1 OR (id = 2 and value1 = 4))", use the query which mix both and and or:

{
  "select": {
    "from": ["Type1"],
    "where": {
      "or": [
        {
          "eq": [
            {"field": "Type1/id"},
            {"value": "1"}
          ]
        },
        {
          "and": [
            {
              "eq": [
                {"field": "Type1/id"},
                {"value": "2"}
              ]
            },
            {
              "eq": [
                {"field": "Type1/value1"},
                {"value": "4"}
              ]
            }
          ]
        }
      ]
    }
  }
}  

Not operator

The Not operator is supported and follows the structure "not": { condition }.

For example, to return all Type1 instances where "id is not equal to 1", use the query:

{
  "select": {
    "from": ["Type1"],
    "where": {
      "not": {
        "eq": [
          {"field": "Type1/id"},
          {"value": "1"}
        ]
      }
    }
  }
}

You can also use the Not operator in a more complex condition. For example:

{
  "select": {
    "from": ["Type1"],
    "where": {
      "not": {
        "or": [
          {
            "eq": [
              {"field": "Type1/id"},
              {"value": "2"}
            ]
          },
          {
            "eq": [
              {"field": "Type1/id"},
              {"value": "4"}
            ]
          }
        ]
      }
    }
  }
}

Full text (field scope)

The query language includes support for full text search.

For example, to perform a full text search on the field "Type1/id" with the value "1", use the query:

{
  "select": {
    "from": ["Type1"],
    "where": {
      "full_text": [
        {"field": "Type1/id"},
        {"value": "1"}
      ]
    }
  }
}

The full text query also works on subelements. For example, if you have the following type structure:

Type1
  * id
  * element
    * value1
    * value2

Then the query:

{
  "select": {
    "from": ["Type1"],
    "where": {
      "full_text": [
        {"field": "Type1/element"},
        {"value": "1"}
      ]
    }
  }
}

Is equivalent to:

{
  "select": {
    "from": ["Type1"],
    "where": {
      "or": [
        {
          "full_text": [
            {"field": "Type1/element/value1"},
            {"value": "1"}
          ]
        },
        {
          "full_text": [
            {"field": "Type1/element/value2"},
            {"value": "1"}
          ]
        }
      ]
    }
  }
}

Full text (entity scope)

If you do not want to specify any field but want to look for a value in the entity, the following query gives an example of how to achieve this:

{
  "select": {
    "from": ["Type1"],
    "where": {
      "full_text": [
        {"value": "1"}
      ]
    }
  }
}

If you omit the field, MDM performs full text search on all Type1 fields.

Metadata fields

MDM adds "metadata" fields to the records it manages. These fields are not present in the user data model but can be used in the query.

Supported metadata fields are:

  • taskId (also known as groupId)
  • timestamp (last modification time)
  • groupSize

For the staging storage, there are additional metadata fields:

  • Staging error
  • Staging source
  • Staging status
  • Staging block key
  • Staging has task

    Note that the field "Staging has task" indicates whether a Talend Data Stewardship task is linked with the record.

If the query contains staging-only fields, you may expect an error that indicates the incompatible expressions. The incompatible expressions are checked in selected fields, conditions, order by and join expressions.

Get metadata field values

To get a metadata field, use "metadata" instead of "field" in the selected fields as shown in the following example:

{
  "select": {
    "from": ["Type1"],
    "fields": [
      {"metadata": "timestamp"},
      {"metadata": "task_id"},
      {"metadata": "group_size"},
      {"metadata": "staging_error"},
      {"metadata": "staging_source"},
      {"metadata": "staging_status"},
      {"metadata": "staging_blockkey"},
      {"metadata": "staging_hastask"}
    ]
  }
}

You can also mix fields from the entity with fields from the record:

{
  "select": {
    "from": ["Type1"],
    "fields": [
      {"metadata": "timestamp"},
      {"metadata": "task_id"},
      {"field": "Type1/id"}
    ]
  }
}

You can also use the "distinct" keyword for metadata fields. For example, to return all distinct values of "taskId" wrapped in an element named "distinctTaskId", use the query:

{
  "select": {
    "from": ["Type1"],
    "fields": [
      {
        "alias": [
          {"name": "distinctTaskId"},
          {"distinct": {"metadata": "task_id"}}
        ]
      }
    ]
  }
}

Conditions involving metadata fields

You can involve metadata fields in conditions. As for the selected fields, just use "metadata" instead of "field". For example:

{
  "select": {
    "from": ["Type1"],
    "where": {
      "eq": [
        {"metadata": "task_id"},
        {"value": "1"}
      ]
    }
  }
}

Cache

MDM can cache query results. In this case, it keeps the query in cache and serves the cached result every time this query is executed.

For example, to cache a query result, use the query:

{
  "select": {
    "from": ["Type1"],
    "where": {
      "eq": [
        {"field": "Type1/id"},
        {"value": "1"}
      ]
    },
    "cache": true
  }
}

By default, "cache" is false, but this query is also valid:

{
  "select": {
    "from": ["Type1"],
    "where": {
      "eq": [
        {"field": "Type1/id"},
        {"value": "1"}
      ]
    },
    "cache": false
  }
}

A cached result can have at most 50 records. When the query yields too many results, the following log will appear in the MDM log at DEBUG level: Query is yielding more results than cache is allowed to keep, bypassing cache for query.

Join to other types

Joins are a way to retrieve data from another type following the foreign keys.

For example, you can retrieve the Address's street element by following the FK between Person and Address types.

A simple example of join would be:

{
  "select": {
    "from": [
      "Type1",
      "Type2",
    ],
    "fields": [
      {"field": "Type1/id"},
      {"field": "Type2/value1"}
    ],
    "joins": [
      {
        "from": "Type1/fk2",
        "on": "Type2/id"
      }
    ]
  }
}

This query returns all Type1's "id" element as well as the "value1" element located in Type2. The join is performed using the "fk2" in Type1.

You may chain joins in the query too. For example:

{
  "select": {
    "from": [
      "Type1",
      "Type2",
      "Type3"
    ],
    "fields": [
      {"field": "Type1/id"},
      {"field": "Type2/value1"},
      {"field": "Type3/value1"}
    ],
    "joins": [
      {
        "from": "Type1/fk2",
        "on": "Type2/id"
      },
      {
        "from": "Type2/fk3",
        "on": "Type3/id"
      }
    ]
  }
}

This query displays "id" from Type1, "value1" from Type2 and "value1" from Type3. The value for Type3's "value1" is obtained via a join between Type1, Type2 and Type3.

History related operators - “as of" operator

The "as_of" operator allows you to retrieve a record as it was at any date.

For example:

{
  "select": {
    "from": ["Type1"],
    "as_of": {
      "date": "1000"
    }
  }
}

The "as of" operator will not fail in case no history is available. Rather, it will return the record at its current state.

History browsing relies on the journal. If you inserted data straight into the SQL database or disabled event creation for tMDMOutput/tMDMDelete components, you should expect only partial history.

MDM can only build a history of the records based on information it stored about updates: if you disabled journal for some updates, it cannot be a complete and accurate history.

For example, to return all Type1 instances with the values they had one second after January 1st 1970 (1000 ms after EPOCH), use the query:

{
  "select": {
    "from": ["Type1"],
    "as_of": {
      "date": "1970-01-01T00:00:01"
    }
  }
}

History navigation shortcuts

The element "date" for the "as_of" element supports some handy shortcuts:

  • yesterday: date 24 hours ago.
  • creation: date of record creation. When you use "creation", you are not allowed to use the swing parameter. Otherwise, an expected error will be thrown.
  • now: current date.

For example, to return all instances of Type1 as they were yesterday (24 hours ago), use the query:

{
  "select": {
    "from": ["Type1"],
    "as_of": {
      "date": "yesterday"
    }
  }
}

Filter history (get by id)

You may want to filter the history records instead of getting all instances.

For example, to return the instance of Type1 with "id = 1" as it was on January 1st 1970, use the query:

{
  "select": {
    "from": ["Type1"],
    "where": {
      "eq": [
        {"field": "Type1/id"},
        {"value": "1"}
      ]
    },
    "as_of": {
      "date": "1000"
    }
  }
}

An improvement on this query would be to use all conditions with "as_of", which allows you to query on previous values. Note that conditions do not need to be only on id elements.

For example, to return the Type1 instance that contained "text" on January 1st, 1970, use the query:

{
  "select": {
    "from": ["Type1"],
    "where": {
      "contains": [
        {"field": "Type1/value"},
        {"value": "text"}
      ]
    },
    "as_of": {
      "date": "1000"
    }
  }
}

However, this has scalability issues since it might force MDM to build a complete set of Type1 instances, thus leading to memory issues since states are computed in memory.

Using Joins together with as_of

You can use "joins" in a query together with the "as_of" operator.

For example, using the query below returns three fields ("id", "value1", "value2") coming from two types ("Type1", "Type2"):

{
  "select": {
    "from": [
      "Type1",
      "Type2"
    ],
    "fields": [
      {"field": "Type1/id"},
      {"field": "Type1/value1"},
      {"field": "Type2/value2"}
    ],
    "joins": [
      {
        "from": "Type1/fk2",
        "on": "Type2/id"
      }
    ],
    "as_of": {
      "date": "1000"
    }
  }
}

The returned values have the features below:

  • The "as_of" returns the values of "Type1" as they were on 01/01/1970.
  • The FK value used for Type2 is the FK value as of 01/01/1970.
  • The values for Type2 are those on 01/01/1970.

Security and user roles

Bear in mind that the query passed to the service is not always the query MDM will execute. Before executing the query, MDM will prune all elements the logged user is not allowed to access. This includes selected fields, conditions, and order by. All elements to hide from users are removed from the query without reporting any errors.

For example,

{
  "select": {
    "from": ["Type1"],
    "fields": [
      {"field": "Type1/id"},
	  {"field": "Type1/secret_field"}
    ]
  }
}

In this example, MDM will automatically remove "secret_field" from the query if the current user (that is, the user used to log on the REST API) is not allowed to access "secret_field". Therefore, the actual query executed will be:

{
  "select": {
    "from": ["Type1"],
    "fields": [
      {"field": "Type1/id"}
    ]
  }
}

Another example of such queries is:

{
  "select": {
    "from": ["Type1"],
    "where": {
      "gt": [
        {"field": "Type1/secret_field"},
        {"value": "1"}
      ]
    }
  }
}

This query will be read as "return all instances of Type1 where Type1's secret_field is greater than 1", and the returned result is expected to be Type1 instances where secret_field > 1.

However, if the current user's roles do not have access right to "secret_field", the actual query becomes:

{
  "select": {
    "from": ["Type1"]
  }
}

This query will be read as "return all instances of Type1", and the returned result now is expected to be all Type1 instances since the current user is not allowed to access "secret_field".

This prevents unprivileged users to indirectly guess what are the values in "secret_field" using value conditions on this field.

An example of how to execute an HTTP GET operation with MDM REST API from Talend MDM Web UI

From Talend MDM Web UI, authorized users can use the Tools menu to access the MDM REST API documentation page based on Swagger, and test REST API operations on this page if needed.

The following introduces an example of executing an HTTP GET operation with MDM REST API to list all primary keys of the master data records in a container on the MDM server.

Before you begin

Make sure that you have already imported the MDM Demo project and created several master data records for the Product entity of the Product container.

Procedure

  1. Open the REST API documentation page from Talend MDM Web UI.
  2. From the list of operations under the User data management category, click the GET link which is followed by /data/{containerName}/{type} to displays its details. You can also click Lists all primary keys for container and type to display the operation details.
  3. Select Master from the Container list.
  4. Enter Product in the containerName field.
  5. Enter Product in the type field.
  6. Click Execute to execute the GET operation.

    The HTTP request URL, response body, response code and response headers are displayed accordingly. In this example, all primary keys of the Product container are listed.

  7. To verify the execution result of this GET operation, go to the Master Data Browser page, then search the master data records of the Product container, and validate that the execution result matches the search result.

Transactions API integration

Transactions API can be integrated with data API.

For more information about the MDM transactions, see MDM Transactions.

Attaching a REST request to an existing transaction

Modifications made through the REST data API can be attached to a transaction created through the transactions API.

To attach a REST request to an existing transaction, you just need to set the custom HTTP header "transaction-id" to the id of the transaction. If the provided transaction id does not exist, it will be created.

For example:
PUT /talendmdm/services/rest/transactions, creates a new transaction and returns its id
POST /talendmdm/services/rest/data/Product (with transaction-id set to the id returned at step1), creates a new product but does not commit changes to the database
POST /talendmdm/services/rest/transactions/<id> commits the transactions and thus changes made at step 2

This feature also works with batch insert operation (/talendmdm/services/rest/data/Product/batch).