Using the JOIN clause - Cloud

Talend Data Shaping Language Reference Guide

Version
Cloud
Language
English (United States)
Product
Talend Cloud
Module
Data Shaping Language
Content
Design and Development

The JOIN clause is used to join two collections based on a condition.

The JOIN clause should be used after a FROM or an UNNEST clause. The JOIN keyword is followed by the identifier and the ON keyword that introduces the expression to use as the condition:
JOIN identifier ON expression 
The condition should be a simple or conditional expression that returns a Boolean. For example:
FROM customers AS c
JOIN orders AS o ON c.custid == o.custid

The JOIN clause can contain the AS and INDEX keywords described in Using the FROM clause.

By default, the JOIN keyword performs an inner join, but you can specify the type of join to use:
  • The INNER JOIN clause only joins items that match the condition in both collections.
  • The LEFT OUTER JOIN returns all items from the collection in the FROM or UNNEST clause and joins the items from the collection in the JOIN clause that match the condition.
For example, with the following input:
{
    "customers": [
        {
            "custid": "1234",
            "name": "Jack Smith"
        },
        {
            "custid": "5678",
            "name": "Jane Smith"
        },
        {
            "custid": "9874",
            "name": "John Doe"
        }
    ],
    "orders": [
        {
            "orderid": "abc-6511",
            "custid":"1234"
        },
        {
            "orderid": "def-6872",
            "custid":"5678"
        }
    ]
}
The same query with a different join type returns different results:
  • The INNER JOIN returns only the customers associated with an order:
    FROM customers AS c
    INNER JOIN orders AS o ON c.custid == o.custid
    SELECT {
      name,
      orderid
    }
    [
       {
          "name":"Jack Smith",
          "orderid":"abc-6511"
       },
       {
          "name":"Jane Smith",
          "orderid":"def-6872"
       }
    ]
  • The LEFT OUTER JOIN returns all customers and ignores the orderid element when no order is associated with the customer:
    FROM customers AS c
    LEFT OUTER JOIN orders AS o ON c.custid == o.custid
    SELECT {
      name,
      orderid
    }
    [
       {
          "name":"Jack Smith",
          "orderid":"abc-6511"
       },
       {
          "name":"Jane Smith",
          "orderid":"def-6872"
       },
       {
          "name":"John Doe"
       }
    ]
Note: In this example, the elements used in the condition have the same name in both collections. To avoid issues with relative paths, an alias needs to be defined for each collection.