Configuring the process of writing data into HBase - 7.1

About this task

To configure the components that write data into HBase, proceed as follows:

Procedure

  1. On the Design workspace, double-click the tFixedFlowInput component to open its Component view.
  2. In this view, click the three-dot button next to Edit schema to open the schema editor.
  3. Click the plus button three times to add three rows and, in the Column column, rename them id, name and age respectively.
  4. In the Type column, click each row and select its data type from the drop-down list. In this scenario, id and age are Integer and name is String.
  5. Click OK to validate these changes and accept the propagation prompted by the pop-up dialog box.
  6. In the Mode area, select Use Inline Content (delimited file) to display the fields for editing.
  7. In the Content field, type in the delimited data to be written into HBase, using the semicolon ";" as the field separator. In this example, the data is:

    1;Albert;23
    2;Alexandre;24
    3;Alfred-Hubert;22
    4;Andre;40
    5;Didier;28
    6;Anthony;35
    7;Artus;32
    8;Catherine;34
    9;Charles;21
    10;Christophe;36
    11;Christian;67
    12;Danniel;54
    13;Elisabeth;58
    14;Emile;32
    15;Gregory;30

  8. Double-click tHBaseOutput to open its Component view.
    Note: If this component does not have the same schema as the preceding component, a warning icon appears. In this case, click the Sync columns button to retrieve the schema from the preceding component; the warning icon then disappears.
  9. Select the Use an existing connection check box and then select the connection you have configured earlier. In this example, it is tHBaseConnection_1.
  10. In the Table name field, type in the name of the table to be created in HBase. In this example, it is customer.
  11. In the Action on table field, select the action of interest from the drop-down list. In this scenario, select Drop table if exists and create. This way, if a table named customer already exists in HBase, it is disabled and deleted before the new table is created.
  12. Click the Advanced settings tab to open the corresponding view.
  13. In the Family parameters table, add two rows by clicking the plus button, rename them family1 and family2 respectively, and leave the other columns empty. These two column families will be created in HBase using the default family performance options.
    Note: The Family parameters table is available only when the action you have selected in the Action on table field is to create a table in HBase. For further information about this Family parameters table, see tHBaseOutput.

  14. In the Families table of the Basic settings view, enter in the Family name column the name of the family that each schema column belongs to. In this example, the id and the age columns belong to family1 and the name column to family2.
    Note: These column families must already exist in the HBase database you connect to; if they do not, define them in the Family parameters table of the Advanced settings view so that they are created at runtime.
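
To make the column-to-family mapping configured above more concrete, the following minimal Java sketch parses one line of the inline content according to the schema of this scenario (Integer id, String name, Integer age) and groups the values by column family. It is only an illustration: the class name HBaseRowSketch and the method toFamilies are hypothetical, the code does not use the HBase client API, and it is not the code that tHBaseOutput generates.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class HBaseRowSketch {

    // Column-to-family mapping as entered in the Families table of this scenario.
    static final Map<String, String> FAMILY_OF = new LinkedHashMap<>();
    static {
        FAMILY_OF.put("id", "family1");
        FAMILY_OF.put("name", "family2");
        FAMILY_OF.put("age", "family1");
    }

    /** Parses one "id;name;age" line and groups its values by column family. */
    static Map<String, Map<String, Object>> toFamilies(String line) {
        // Trim first so that stray spaces around a line do not break number parsing.
        String[] fields = line.trim().split(";");
        if (fields.length != 3) {
            throw new IllegalArgumentException("Expected 3 fields: " + line);
        }

        // Apply the schema of this scenario: Integer, String, Integer.
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("id", Integer.valueOf(fields[0]));
        row.put("name", fields[1]);
        row.put("age", Integer.valueOf(fields[2]));

        // Group each column under the family it was assigned to.
        Map<String, Map<String, Object>> byFamily = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : row.entrySet()) {
            byFamily
                .computeIfAbsent(FAMILY_OF.get(e.getKey()), f -> new LinkedHashMap<>())
                .put(e.getKey(), e.getValue());
        }
        return byFamily;
    }

    public static void main(String[] args) {
        // First three lines of the inline content used in this example.
        String content = "1;Albert;23\n2;Alexandre;24\n3;Alfred-Hubert;22";
        for (String line : content.split("\n")) {
            System.out.println(toFamilies(line));
        }
    }
}
```

For each input line, the sketch yields the id and age values under family1 and the name value under family2, which mirrors the Families table settings of this example.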