Scenario: Retrieving data from a collection by advanced queries - 6.1

Talend Open Studio for Big Data Components Reference Guide

In this scenario, advanced MongoDB queries are used to retrieve the post by the author Anderson.

The blog collection of the MongoDB database talend already contains several posts, including one by Anderson.

To insert data into the database, see Scenario 1: Creating a collection and writing data to it.

Linking the components

  1. Drop tMongoDBConnection, tMongoDBClose, tMongoDBInput and tLogRow onto the workspace.

  2. Link tMongoDBConnection to tMongoDBInput using the OnSubjobOk trigger.

  3. Link tMongoDBInput to tMongoDBClose using the OnSubjobOk trigger.

  4. Link tMongoDBInput to tLogRow using a Row > Main connection.
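
The trigger links above determine the execution order: an OnSubjobOk trigger starts the next subjob only after the previous one finishes without error. The ordering can be sketched in plain Python as follows. The function names are hypothetical stand-ins for the components, not Talend APIs, and the sketch models only the ordering, not the components themselves.

```python
def run_chain(subjobs):
    """Run subjobs in order, stopping at the first failure (OnSubjobOk-style)."""
    executed = []
    for name, action in subjobs:
        try:
            action()
        except Exception:
            # A failed subjob does not fire its OnSubjobOk trigger,
            # so nothing downstream of it runs.
            break
        executed.append(name)
    return executed


# Hypothetical stand-ins for the three subjobs in this scenario.
def open_connection():
    pass  # stands in for tMongoDBConnection


def read_and_log():
    pass  # stands in for tMongoDBInput -> tLogRow (Row > Main)


def close_connection():
    pass  # stands in for tMongoDBClose


order = run_chain([
    ("tMongoDBConnection", open_connection),
    ("tMongoDBInput -> tLogRow", read_and_log),
    ("tMongoDBClose", close_connection),
])
print(order)
```

One consequence of this wiring is that tMongoDBClose runs only when the read subjob completes successfully.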

Configuring the components

  1. Double-click tMongoDBConnection to open its Basic settings view.

  2. From the DB Version list, select the MongoDB version you are using.

  3. In the Server and Port fields, enter the connection details.

  4. In the Database field, enter the name of the MongoDB database.

  5. Double-click tMongoDBInput to open its Basic settings view.

  6. Select the Use existing connection option.

  7. In the Collection field, enter the name of the collection, namely blog.

  8. Click the [...] button next to Edit schema to open the schema editor.

  9. Click the [+] button to add five columns, namely id, author, title, keywords and contents. Set the type of id to Integer and the type of the other four columns to String.

  10. Click OK to close the editor.

  11. The columns now appear in the left part of the Mapping area.

  12. For columns author, title, keywords and contents, enter their parent node post so that the data can be retrieved from the correct positions.

  13. In the Query box, enter the advanced query statement to retrieve the posts whose author is Anderson:

    "{post.author : 'Anderson'}"

    This statement matches only the documents in which author, a sub-node of post, has the value "Anderson".

  14. Double-click tLogRow to open its Basic settings view.

    Select Table (print values in cells of a table) for a better display of the results.
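
To make the parent node mapping and the query concrete, here is a minimal Python sketch that imitates both on in-memory data. It is an illustration only, not how tMongoDBInput is implemented: the helper names (get_path, matches, to_row) and the sample documents are hypothetical, and the matching covers simple equality on scalar values, not the full MongoDB query language.

```python
def get_path(doc, path):
    """Follow a dotted path such as 'post.author' through nested dicts."""
    current = doc
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def matches(doc, query):
    """True if every dotted path in the query equals the given value."""
    return all(get_path(doc, path) == value for path, value in query.items())


def to_row(doc, columns, parent_nodes):
    """Build a flat row; a column with a parent node is read from beneath it."""
    row = {}
    for column in columns:
        parent = parent_nodes.get(column)
        path = f"{parent}.{column}" if parent else column
        row[column] = get_path(doc, path)
    return row


# Hypothetical sample documents shaped like the schema in this scenario.
blog = [
    {"id": 1, "post": {"author": "Anderson", "title": "First title",
                       "keywords": "sample", "contents": "Sample text A"}},
    {"id": 2, "post": {"author": "Another author", "title": "Second title",
                       "keywords": "sample", "contents": "Sample text B"}},
]

columns = ["id", "author", "title", "keywords", "contents"]
# id sits at the top level; the other four columns live under the node post.
parent_nodes = {name: "post" for name in ["author", "title", "keywords", "contents"]}

query = {"post.author": "Anderson"}
rows = [to_row(doc, columns, parent_nodes) for doc in blog if matches(doc, query)]
print(rows)
```

Without the parent node, a column such as author would be looked up at the top level of each document, where it does not exist, which is why step 12 is needed.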

Executing the Job

  1. Press Ctrl+S to save the Job.

  2. Press F6 to run the Job.

    The post by Anderson is retrieved and displayed in the console.