tCouchbaseOutput - 6.1

Talend Components Reference Guide

Warning

This component is available in the Palette of the Studio only if you have subscribed to one of the Talend solutions with Big Data.

tCouchbaseOutput Properties

Component family

Big Data / Couchbase

 

Function

tCouchbaseOutput inserts, updates, upserts or deletes documents in the Couchbase database. The documents are stored in the form of Key/Value pairs, where the Value can be JSON or binary data.

Purpose

This component allows you to perform actions on the JSON or binary documents stored in the Couchbase database, based on incoming flat data from a file, a database table, and so on.

Basic settings

Schema and Edit Schema

A schema is a row description. It defines the number of fields (columns) to be processed and passed on to the next component. The schema is either Built-In or stored remotely in the Repository.

Since version 5.6, both the Built-In mode and the Repository mode are available in any of the Talend solutions.

Click Edit schema to make changes to the schema. If the current schema is of the Repository type, three options are available:

  • View schema: choose this option to view the schema only.

  • Change to built-in property: choose this option to change the schema to Built-in for local changes.

  • Update repository connection: choose this option to change the schema stored in the repository and decide whether to propagate the changes to all the Jobs upon completion. If you just want to propagate the changes to the current Job, you can select No upon completion and choose this schema metadata again in the [Repository Content] window.

Click Sync columns to retrieve the schema from the previous component connected in the Job.

 

 

Built-In: You create and store the schema locally for this component only. Related topic: see Talend Studio User Guide.

 

 

Repository: You have already created the schema and stored it in the Repository. You can reuse it in various projects and Job designs. Related topic: see Talend Studio User Guide.

When the schema to be reused has default values that are integers or functions, ensure that these default values are not enclosed within quotation marks. If they are, you must remove the quotation marks manually.

For more details, see https://help.talend.com/display/KB/Verifying+default+values+in+a+retrieved+schema.

 

Use existing connection

Select this check box and in the Component List click the relevant connection component to reuse the connection details you already defined.

 

DB Version

List of database versions.

 

Data Bucket

Name of the data bucket in the Couchbase database.

 

Username and Password

Authentication credentials for the data bucket, not those for a server node.

To enter the password, click the [...] button next to the password field, and then in the pop-up dialog box enter the password between double quotes and click OK to save the settings.

 

URIs

URIs of server nodes in the Couchbase cluster, in the form of "http://127.0.0.1:8091/pools" or "http://localhost:8091/pools".
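As a small illustration of the URI form described above, the following sketch builds and checks node URIs. The helper names node_uri and is_pools_uri are hypothetical and are not part of Talend or Couchbase. Accepting only the http scheme is an assumption based on the two examples given here.

```python
from urllib.parse import urlparse


def node_uri(host, port=8091):
    """Build a node URI in the documented form (hypothetical helper)."""
    return f"http://{host}:{port}/pools"


def is_pools_uri(uri):
    """Check that a URI has the http://host:port/pools shape.

    Assumption: only the http scheme is accepted, as in the examples above.
    """
    parts = urlparse(uri)
    return parts.scheme == "http" and bool(parts.hostname) and parts.path == "/pools"


uris = [node_uri("127.0.0.1"), node_uri("localhost")]
for uri in uris:
    print(uri, is_pools_uri(uri))
```

Each URI produced this way corresponds to one line in the URIs table.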

 

Key

Schema field whose contents will be used as the ID of a document in the Couchbase database.

 

Value

Schema field whose contents will be saved in the Couchbase database as binary documents.

Available when Include JSON Document is not selected.

 

Action on data

The following operations are available:

Insert: insert data.

Update: update data.

Insert or Update: insert or update data.

Delete: delete data.
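The four actions can be pictured with a minimal in-memory model, where a Python dict stands in for the data bucket. This is a sketch only and does not use the Couchbase SDK. It assumes the usual key/value semantics: Insert fails if the key already exists, Update and Delete fail if the key is missing, and Insert or Update always writes. The function and exception names are hypothetical, and the component itself may handle these cases differently.

```python
class KeyExistsError(Exception):
    """Raised when inserting a key that is already present."""


class KeyMissingError(Exception):
    """Raised when updating or deleting a key that is absent."""


def apply_action(bucket, action, key, value=None):
    """Apply one action to a dict that stands in for a data bucket."""
    if action == "insert":
        if key in bucket:
            raise KeyExistsError(key)
        bucket[key] = value
    elif action == "update":
        if key not in bucket:
            raise KeyMissingError(key)
        bucket[key] = value
    elif action == "insert_or_update":
        # Writes whether or not the key already exists.
        bucket[key] = value
    elif action == "delete":
        if key not in bucket:
            raise KeyMissingError(key)
        del bucket[key]
    else:
        raise ValueError(f"unknown action: {action}")


bucket = {}
apply_action(bucket, "insert", "ELT Overview", {"author": "Andy"})
apply_action(bucket, "update", "ELT Overview", {"author": "Andy", "id": "3"})
apply_action(bucket, "insert_or_update", "Data Integration Overview", {"author": "Andy"})
apply_action(bucket, "delete", "ELT Overview")
print(sorted(bucket))
```

In this model, Insert or Update is the safe choice when you do not know whether a document with the same Key already exists.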

Include JSON Document

Select this check box for JSON configuration:

Configure JSON Tree: click the [...] button to open the interface for JSON tree configuration. For more information, see Configuring a JSON Tree.

Group by: click the [+] button to add lines and choose the input columns for grouping the records.

 

Die on error

This check box is cleared by default, which means that rows on error are skipped and the process completes for error-free rows.
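The difference between the two settings can be sketched as a simple write loop. This is an illustration, not the code generated by the Studio. The counter names are borrowed from the Global Variables of this component, and the assumption that each skipped row counts as rejected is ours. The functions write_rows and reject_empty_key are hypothetical.

```python
def write_rows(rows, write_one, die_on_error=False):
    """Write rows one by one, either skipping failures or stopping on the first."""
    counters = {"NB_LINE": 0, "NB_LINE_INSERTED": 0, "NB_LINE_REJECTED": 0}
    for row in rows:
        counters["NB_LINE"] += 1
        try:
            write_one(row)
            counters["NB_LINE_INSERTED"] += 1
        except Exception:
            if die_on_error:
                # Die on error selected: stop at the first failing row.
                raise
            # Die on error cleared: skip the row and carry on.
            counters["NB_LINE_REJECTED"] += 1
    return counters


def reject_empty_key(row):
    """Stand-in for a write that fails when the document key is empty."""
    if not row["title"]:
        raise ValueError("empty document key")


rows = [{"title": "ELT Overview"}, {"title": ""}, {"title": "Data Integration Overview"}]
result = write_rows(rows, reject_empty_key)
print(result)
```

With die_on_error=True, the same input raises on the second row and the third row is never processed.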

Advanced settings

Expire

Expiration value for a document. The default value is 0, which means the document is stored indefinitely.

The expiration time can be either a relative time (for example, 60 seconds) or an absolute time (for example, 31st December 2020, 12:00pm).
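The sketch below shows one way to compute both kinds of value. It relies on the Couchbase convention that values of up to 30 days (2592000 seconds) are read as relative offsets and larger values are read as absolute Unix timestamps. Whether the Expire field passes the number to the server unchanged is an assumption, and the helper names are hypothetical.

```python
from datetime import datetime, timezone

THIRTY_DAYS = 30 * 24 * 60 * 60  # 2592000 seconds


def relative_expiry(seconds):
    """Return a relative expiry, which must not exceed 30 days."""
    if not 0 <= seconds <= THIRTY_DAYS:
        raise ValueError("relative expiry must be between 0 and 30 days")
    return seconds


def absolute_expiry(when):
    """Return an absolute expiry as a Unix timestamp in seconds."""
    if when.tzinfo is None:
        raise ValueError("use a timezone-aware datetime")
    return int(when.timestamp())


print(relative_expiry(60))
# 31st December 2020, 12:00pm, taken here as UTC (an assumption).
print(absolute_expiry(datetime(2020, 12, 31, 12, 0, tzinfo=timezone.utc)))  # 1609416000
```

A value of 0 is also accepted by relative_expiry, which matches the default of storing the document indefinitely.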

 

tStatCatcher Statistics

Select this check box to collect the log data at the component level.

Global Variables

NB_LINE: the number of rows read by an input component or transferred to an output component. This is an After variable and it returns an integer.

NB_LINE_INSERTED: the number of rows inserted. This is an After variable and it returns an integer.

NB_LINE_REJECTED: the number of rows rejected. This is an After variable and it returns an integer.

ERROR_MESSAGE: the error message generated by the component when an error occurs. This is an After variable and it returns a string. This variable functions only if the Die on error check box, when the component has one, is cleared.

A Flow variable functions during the execution of a component while an After variable functions after the execution of the component.

To fill in a field or expression with a variable, press Ctrl + Space to access the variable list and choose the variable to use from it.

For further information about variables, see Talend Studio User Guide.

Usage

tCouchbaseOutput must be preceded by an input component. It wraps the incoming flat data into JSON documents for storage in the Couchbase database.

Log4j

If you are using a subscription-based version of the Studio, the activity of this component can be logged using the log4j feature. For more information on this feature, see Talend Studio User Guide.

For more information on the log4j logging levels, see the Apache documentation at http://logging.apache.org/log4j/1.2/apidocs/org/apache/log4j/Level.html.

Limitation

n/a

Scenario: Inserting documents to a data bucket in the Couchbase database

This scenario inserts three blog posts into a data bucket in the Couchbase database. The source records are flat data, so they are wrapped into JSON documents before being stored in the Couchbase database. Note that the values of the source field title, which is selected as the Key in the Basic settings of tCouchbaseOutput, are used as the document IDs in the Couchbase database.

Linking the components

  1. Drop tCouchbaseConnection, tFixedFlowInput, tCouchbaseOutput, and tCouchbaseClose onto the workspace.

  2. Link tCouchbaseConnection to tFixedFlowInput using the OnSubjobOk trigger.

  3. Link tFixedFlowInput to tCouchbaseOutput using a Row > Main connection.

  4. Link tFixedFlowInput to tCouchbaseClose using the OnSubjobOk trigger.

Configuring the components

  1. Double-click tCouchbaseConnection to open its Basic settings view.

  2. In the Data Bucket field, enter the name of the data bucket in the Couchbase database.

    In the Password field, enter the password for access to the data bucket.

    In the URIs table, click the [+] button to add lines as needed, where you can enter the URIs of the Couchbase server nodes.

  3. Double-click tFixedFlowInput to open its Basic settings view.

    Select Use Inline Content (delimited file) in the Mode area.

    In the Content field, enter the data to write to the Couchbase database, for example:

    1;Andy;Integration at any scale;Talend, the leader of the DI space...
    2;Andy;Data Integration Overview;Talend, the leading player in the DI field...
    3;Andy;ELT Overview;Talend, the big name in the ELT circle...

  4. Click the Edit schema button to open the schema editor.

  5. Click the [+] button to add four columns, namely id, author, title and contents, of the string type.

    Click OK to validate the setup and close the editor.

  6. Click tCouchbaseOutput to open its Basic settings view.

  7. Select the Use existing connection check box to reuse the connection.

  8. In the Key list, select the field title, whose values will be used as the IDs of the documents inserted into the Couchbase database.

  9. Select the Generate JSON Document check box and click the Configure JSON Tree button to open the JSON tree mapper.

  10. Press the Shift key to select all the fields in the Linker source area and drop them onto the rootTag node in the Link target part.

  11. In the pop-up box, select Create as sub-element of target node.

    Click OK to validate the setup and close the box.

  12. Right-click the id node in the Link target part and select Set as Loop Element from the contextual menu.

    Click OK to validate the setup and close the mapper.

Executing the Job

  1. Press F6 to save and run the Job.

  2. Go to the Couchbase web console and view the documents stored in the data bucket blog.

    The source records have been saved in the Couchbase database in the form of JSON documents.
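The transformation performed by this Job can be approximated outside the Studio as follows. The sketch splits the inline content on the semicolon separator, maps the values to the four schema columns, and uses the title column as the document ID. The function name to_documents is hypothetical. The flat JSON layout shown here is an assumption, since the exact structure of the stored documents depends on the JSON tree configured in the mapper.

```python
import json

CONTENT = """1;Andy;Integration at any scale;Talend, the leader of the DI space...
2;Andy;Data Integration Overview;Talend, the leading player in the DI field...
3;Andy;ELT Overview;Talend, the big name in the ELT circle..."""

COLUMNS = ["id", "author", "title", "contents"]


def to_documents(content, columns, key_column, separator=";"):
    """Map each delimited line to a JSON document keyed by one column."""
    documents = {}
    for line in content.splitlines():
        # Split into at most len(columns) values, all kept as strings.
        values = line.split(separator, len(columns) - 1)
        row = dict(zip(columns, values))
        documents[row[key_column]] = json.dumps(row)
    return documents


documents = to_documents(CONTENT, COLUMNS, key_column="title")
for doc_id, body in documents.items():
    print(doc_id, "->", body)
```

Because the title values are used as document IDs, two source rows with the same title would end up under the same ID.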