Using the SELECT clause - Cloud - 8.0

Talend Data Shaping Language Reference Guide

Version
Cloud
8.0
Language
English
Product
Talend Cloud
Module
Data Shaping Language
Content
Design and Development
Last publication date
2023-11-23

The SELECT clause is used to define the elements to return as the result of a query.

The SELECT keyword can be followed by:
  • A simple or conditional expression to return a single value, a record or an array:
    SELECT expression          
  • A block to return key-value pairs. You can use a simple identifier, or an assignment expression to create a new key:
    SELECT {
    	identifier,
    	identifier = expression
    }
    The expression used as a value can be a simple or conditional expression.
  • A sub-query in parentheses to return a result nested in the result of the parent query:
    SELECT (
       FROM identifier1
       SELECT {
          identifier2,
          identifier3
       }
    )
  • Identifiers followed by sub-queries in parentheses, to return the result of several queries in one record:
    SELECT {
    	identifier1 = (query1),
    	identifier2 = (query2)
    }
Note: When using only an identifier, it references an element in the input. The name of the output field will be the same as the name of input field.
With a sub-query that produce an array, you can specify an index to return a specific item from the array. The index should be in brackets after the closing parenthesis of the sub-query:
SELECT ( FROM identifier1 SELECT { identifier2 } )[index]
For example, with the following input data:
{
	"customers": [
		{
			"firstName": "John",
			"lastName": "Smith",
			"address": {
				"street": "690 River St.",
				"city": "Hanover, MA",
				"zipcode": "02340"
			},
			"orders": [
				{
					"order_id": "abc-12345",
					"items": [
						{
							"item_id": "97516848-jiargn",
							"quantity": 2
						}
					]
				}
			]
		},
		{
			"firstName": "Jane",
			"lastName": "Doe",
			"address": {
				"street": "420 Green St.",
				"city": "Boston, MA",
				"zipcode": "02115"
			},
			"orders": [
				{
					"order_id": "def-12345",
					"items": [
						{
							"item_id": "97516848-kftesn",
							"quantity": 3
						},
						{
							"item_id": "96946848-metasb",
							"quantity": 1
						}
					]
				}
			]
		}
	]
}
You can use a SELECT clause with two nested queries to return a record containing an array of customer names and the first item in the array of item IDs:
SELECT {
 customers = (
  FROM customers AS c 
  SELECT { 
   name = concatWith(" ", c.firstName, c.lastName),
   c.address
  }
 ),
 items = (
  FROM customers AS c UNNEST c.orders AS o UNNEST o.items AS i 
  SELECT {i.item_id}
 )[0]
}
This query returns the following result:
{
	"customers": [
		{
			"name": "John Smith",
			"address": {
				"street": "690 River St.",
				"city": "Hanover, MA",
				"zipcode": "02340"
			}
		},
		{
			"name": "Jane Doe",
			"address": {
				"street": "420 Green St.",
				"city": "Boston, MA",
				"zipcode": "02115"
			}
		}
	],
	"items": {
		"item_id": "97516848-jiargn"
	}
}