Upserting data to MongoDB - 7.3

MongoDB

Version: 7.3
Language: English
Product: Talend Big Data, Talend Big Data Platform, Talend Data Fabric, Talend Real-Time Big Data Platform
Module: Talend Studio
Last publication date: 2024-02-21

Procedure

  1. Double-click tMongoDBConnection to open its Basic settings view.
  2. From the DB Version list, select the MongoDB version you are using.
  3. In the Server and Port fields, enter the connection details.
    In the Database field, enter the name of the MongoDB database.
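
For readers who want to check the same connection outside the Studio, the three values entered above map onto a standard MongoDB connection string. This is a minimal sketch only; the host, port and database name used here (localhost, 27017, talend) are hypothetical placeholders, not values taken from this scenario.

```python
def build_mongodb_uri(server, port, database):
    """Assemble a standard mongodb:// connection string from the three fields."""
    return f"mongodb://{server}:{port}/{database}"


# Hypothetical values standing in for the Server, Port and Database fields.
uri = build_mongodb_uri("localhost", 27017, "talend")
print(uri)
```
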
  4. Double-click tFixedFlowInput to open its Basic settings view.
    Select Use Inline Content (delimited file) in the Mode area.
    In the Content field, enter the data to upsert into the MongoDB database, for example:
    1;Andy;Open Source Outlook;Open Source,Talend;Talend, the leader of the open source world...
    2;Andy;Data Integration Overview;Data Integration,Talend;Talend, the leading player in the DI field...
    3;Anderson;ELT Overview;ELT,Talend;Talend, the big name in the ELT circle...
    4;Andy;Big Data Bang;Big Data,Talend;Talend, the driving force for Big Data applications... 
    In this data, the author of the third record has changed and the fourth record is new, so the upsert updates the former and inserts the latter.
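
To make the layout of the inline content concrete, the sketch below parses the same four rows into records. It assumes the field separator is the semicolon visible in the sample and borrows the column names (id, author, title, keywords, contents) from the schema defined later in this procedure; the function name is illustrative only.

```python
import csv
import io

# The inline content exactly as entered in the Content field.
INLINE_CONTENT = """\
1;Andy;Open Source Outlook;Open Source,Talend;Talend, the leader of the open source world...
2;Andy;Data Integration Overview;Data Integration,Talend;Talend, the leading player in the DI field...
3;Anderson;ELT Overview;ELT,Talend;Talend, the big name in the ELT circle...
4;Andy;Big Data Bang;Big Data,Talend;Talend, the driving force for Big Data applications...
"""

COLUMNS = ["id", "author", "title", "keywords", "contents"]


def parse_inline_content(text):
    """Split semicolon-delimited rows into dicts keyed by column name."""
    rows = []
    for fields in csv.reader(io.StringIO(text), delimiter=";"):
        if not fields:  # skip blank lines
            continue
        row = dict(zip(COLUMNS, fields))
        row["id"] = int(row["id"])  # id is the only Integer column
        rows.append(row)
    return rows


records = parse_inline_content(INLINE_CONTENT)
```

Note that the commas inside the keywords and contents fields are preserved, because only the semicolon acts as a separator.
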
  5. Double-click tMongoDBOutput to open its Basic settings view.
    Select the Use existing connection and Die on error check boxes.
    In the Collection field, enter the name of the collection, namely blog.
    Select Upsert from the Action on data list.
  6. Click the [...] button next to Edit schema to open the schema editor.
  7. Click the [+] button to add five columns in the right part, namely id, author, title, keywords and contents. Set the type of id to Integer and the type of the other four columns to String.
    Copy all the columns to the input table.
    Click OK to close the editor.
  8. In the Advanced settings view, select the Generate JSON Document check box.
    Select the Remove root node check box.
    In the Data node and Query node fields, enter "data" and "query".
  9. Click the [...] button next to Configure JSON Tree to open the configuration interface.
  10. Right-click the node rootTag and select Add Sub-element from the contextual menu.
    In the dialog box that appears, type in data for the Data node.
    Click OK to close the window.
    Repeat this operation to define query as the Query node.
    Right-click the node data and select Set As Loop Element from the contextual menu.
    Warning: These nodes are mandatory for the update and upsert actions. They exist only to drive those actions and are not stored in the database.

  11. Select all the columns under the Schema list and drop them to the data node.
    In the window that appears, select Create as sub-element of target node.
    Click OK to close the window.
    Repeat this operation to drop the id column from the Schema list onto the query node.
  12. Right-click the node id under data and select Add Attribute from the contextual menu.
    In the dialog box that appears, type in type as the attribute name.
    Click OK to close the window.
    Right-click the node @type under id and select Set A Fix Value from the contextual menu.
    In the dialog box that appears, type in integer as the attribute value, so that the id values are stored as integers in the database.
    Click OK to close the window.
    Repeat this operation to set this attribute for the id node under query.
    Click OK to close the JSON Tree configuration interface.
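
The effect of the JSON tree configured above can be sketched in plain Python. The example builds one document per row with a data node and a query node, then applies it to an in-memory list standing in for the blog collection. It assumes that Upsert replaces the document matched by the query node with the content of the data node, or inserts it when nothing matches; the function names and the pre-existing document are hypothetical.

```python
def build_upsert_document(row):
    """Mirror the JSON tree: all columns under "data", only id under "query"."""
    return {
        "data": {
            "id": int(row["id"]),  # the type attribute fixes id as an integer
            "author": row["author"],
            "title": row["title"],
            "keywords": row["keywords"],
            "contents": row["contents"],
        },
        "query": {"id": int(row["id"])},
    }


def apply_upsert(collection, document):
    """Replace the first stored document matching query, else insert data."""
    query, data = document["query"], document["data"]
    for index, stored in enumerate(collection):
        if all(stored.get(key) == value for key, value in query.items()):
            collection[index] = dict(data)
            return "updated"
    collection.append(dict(data))
    return "inserted"


# Hypothetical prior state of the collection: record 3 with a different author.
blog = [
    {"id": 3, "author": "Andy", "title": "ELT Overview",
     "keywords": "ELT,Talend",
     "contents": "Talend, the big name in the ELT circle..."},
]

incoming = [
    {"id": "3", "author": "Anderson", "title": "ELT Overview",
     "keywords": "ELT,Talend",
     "contents": "Talend, the big name in the ELT circle..."},
    {"id": "4", "author": "Andy", "title": "Big Data Bang",
     "keywords": "Big Data,Talend",
     "contents": "Talend, the driving force for Big Data applications..."},
]

results = [apply_upsert(blog, build_upsert_document(row)) for row in incoming]
```

Only the content of the data node ends up in the list; the data and query wrappers themselves are discarded, in line with the warning above.
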
  13. Double-click tMongoDBInput to open its Basic settings view.
    Select the Use existing connection check box.
    In the Collection field, enter the name of the collection, namely blog.
    Click the [...] button next to Edit schema to open the schema editor.
    Click the [+] button to add five columns, namely id, author, title, keywords and contents. Set the type of id to Integer and the type of the other four columns to String.
    Click OK to close the editor.
    The columns now appear in the left part of the Mapping area.
    For the author, title, keywords and contents columns, enter their parent node, post, so that the data can be retrieved from the correct positions.
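
The role of the parent node in the mapping can be illustrated with a short sketch. It assumes a stored document in which author, title, keywords and contents sit inside a post sub-document while id stays at the top level; the sample document and the helper name are hypothetical and only show how a parent node tells the reader where to look for each column.

```python
# Column name -> parent node (None means the column is at the top level).
MAPPING = {
    "id": None,
    "author": "post",
    "title": "post",
    "keywords": "post",
    "contents": "post",
}


def extract_row(document, mapping):
    """Read each column from its parent node, or from the top level."""
    row = {}
    for column, parent in mapping.items():
        node = document.get(parent) if parent else document
        # A missing or non-document parent yields None for that column.
        row[column] = node.get(column) if isinstance(node, dict) else None
    return row


# Hypothetical document shape, assumed for illustration only.
sample = {
    "id": 1,
    "post": {
        "author": "Andy",
        "title": "Open Source Outlook",
        "keywords": "Open Source,Talend",
        "contents": "Talend, the leader of the open source world...",
    },
}

row = extract_row(sample, MAPPING)
```
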
  14. Double-click tLogRow to open its Basic settings view.
    In the Mode area, select Table (print values in cells of a table) for better display.