Component-specific settings for tHBaseOutput - Cloud - 7.3

Talend Job Script Reference Guide

Version: Cloud 7.3
Language: English
Product: Talend Big Data, Talend Big Data Platform, Talend Cloud, Talend Data Fabric, Talend Data Integration, Talend Data Management Platform, Talend Data Services Platform, Talend ESB, Talend MDM Platform, Talend Real-Time Big Data Platform
Module: Talend CommandLine, Talend Studio
Content: Design and Development > Designing Jobs
Last publication date: 2023-09-13

The following list describes the Job script functions and parameters that you can define in the setSettings {} function of the component. Each entry gives the function or parameter, its description, and whether it is mandatory.

USE_EXISTING_CONNECTION

Set this parameter to true and specify the name of the relevant connection component using the CONNECTION parameter to reuse the connection details you already defined.

Mandatory: No
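
For example, a minimal sketch of this pair inside setSettings {}, assuming a connection component named tHBaseConnection_1 (the name is illustrative):

    USE_EXISTING_CONNECTION : "true",
    CONNECTION : "tHBaseConnection_1",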

DISTRIBUTION

Specify a cluster distribution. Acceptable values:

  • APACHE
  • CLOUDERA
  • HORTONWORKS
  • MAPR
  • PIVOTAL_HD
  • CUSTOM

If you do not provide this parameter, the default cluster distribution is Amazon EMR.

Mandatory: No

HBASE_VERSION

Specify the version of the Hadoop distribution you are using. Acceptable values include:

  • For Amazon EMR:

    • EMR_5_5_0
    • EMR_5_0_0
    • EMR_4_6_0
  • For Apache:

    • APACHE_1_0_0
  • For Cloudera:

    • Cloudera_CDH5_10
    • Cloudera_CDH5_8
    • Cloudera_CDH5_7
    • Cloudera_CDH5_6
    • Cloudera_CDH5_5
  • For HortonWorks:

    • HDP_2_6
    • HDP_2_5
    • HDP_2_4
  • For MapR:

    • MAPR520
    • MAPR510
    • MAPR500
  • For Pivotal HD:

    • PIVOTAL_HD_2_0
    • PIVOTAL_HD_1_0_1

The default value is EMR_5_5_0.

Mandatory: No
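
As an illustration, a Hortonworks cluster could be declared with the following pair inside setSettings {}; the values shown are examples, not recommendations:

    DISTRIBUTION : "HORTONWORKS",
    HBASE_VERSION : "HDP_2_6",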

HADOOP_CUSTOM_VERSION

If you are using a custom cluster, use this parameter to specify the Hadoop version of that custom cluster, which is either HADOOP_1 (default) or HADOOP_2.

Mandatory: No

ZOOKEEPER_QUORUM

Type in the name or the URL of the Zookeeper service you use to coordinate the transactions between the Studio and your database.

Note that when you configure Zookeeper, you may need to explicitly define the path to the root znode that contains all the znodes created and used by your database, using the SET_ZNODE_PARENT and ZNODE_PARENT parameters.

Mandatory: Yes

ZOOKEEPER_CLIENT_PORT

Type in the number of the client listening port of the Zookeeper service you are using.

Mandatory: Yes

SET_ZNODE_PARENT

When needed, set this parameter to true and specify the path to the root znode using the ZNODE_PARENT parameter.

Mandatory: No
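
Putting the Zookeeper-related parameters together, a minimal sketch inside setSettings {} could look as follows; the host name, port, and znode path are placeholders, and the escaped quotes reflect that these fields hold quoted string values (the exact quoting may vary by field type):

    ZOOKEEPER_QUORUM : "\"localhost\"",
    ZOOKEEPER_CLIENT_PORT : "\"2181\"",
    SET_ZNODE_PARENT : "true",
    ZNODE_PARENT : "\"/hbase-unsecure\"",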

USE_KRB

If the database to be used is running with Kerberos security, set this parameter to true and then specify the principal names using the HBASE_MASTER_PRINCIPAL and HBASE_REGIONSERVER_PRINCIPAL parameters.

Mandatory: No

USE_KEYTAB

If you need to use a Kerberos keytab file to log in, set this parameter to true and specify the principal using the PRINCIPAL parameter and the access path to the keytab file using the KEYTAB_PATH parameter.

Mandatory: No
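
For a Kerberos-secured cluster, the related parameters could be combined as in the following sketch; the principal names and the keytab path are placeholders:

    USE_KRB : "true",
    HBASE_MASTER_PRINCIPAL : "\"hbase/_HOST@EXAMPLE.COM\"",
    HBASE_REGIONSERVER_PRINCIPAL : "\"hbase/_HOST@EXAMPLE.COM\"",
    USE_KEYTAB : "true",
    PRINCIPAL : "\"hbase-user@EXAMPLE.COM\"",
    KEYTAB_PATH : "\"/path/to/user.keytab\"",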

USE_MAPRTICKET

If this cluster is a MapR cluster of version 4.0.1 or later, you may need to set the MapR ticket authentication configuration by setting this parameter to true and providing the relevant information using the MAPRTICKET_CLUSTER, MAPRTICKET_DURATION, USERNAME, and MAPRTICKET_PASSWORD parameters. For more information, see the section about connecting to a security-enabled MapR cluster in the MapR documentation.

Mandatory: No
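
A possible sketch of the MapR ticket parameters inside setSettings {}; the cluster name, duration, and credentials are placeholders, and the value formats are assumptions to adapt to your environment and security practices:

    USE_MAPRTICKET : "true",
    MAPRTICKET_CLUSTER : "\"demo.mapr.com\"",
    MAPRTICKET_DURATION : "86400L",
    USERNAME : "\"mapr_user\"",
    MAPRTICKET_PASSWORD : "\"mapr_password\"",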

TABLE

Type in the name of the HBase table you need to write data into.

Mandatory: Yes

SET_TABLE_NS_MAPPING

If needed, set this parameter to true and use the TABLE_NS_MAPPING parameter to provide the string to be used to construct the mapping between an Apache HBase table and a MapR table.

Mandatory: No

TABLE_ACTION

Type in the action you need to take on the specified table. Accepted values:

  • NONE (default)
  • CREATE
  • DROP_CREATE
  • CREATE_IF_NOT_EXISTS
  • DROP_IF_EXISTS_AND_CREATE

Mandatory: No
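
For example, to write into a table named customers and create it only if it does not already exist (the table name is a placeholder):

    TABLE : "\"customers\"",
    TABLE_ACTION : "CREATE_IF_NOT_EXISTS",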

FAMILIES {}

Include in this function the following parameters to map the columns of the table to be used with the schema columns you have defined for the data flow to be processed.

  • SCHEMA_COLUMN: Type in the name of the schema column to be mapped.
  • FAMILY_COLUMN: Type in the column family you want to map the schema column with.

For further information, see the Column families section of the Apache HBase documentation.

Mandatory: Yes
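
A sketch of this mapping, assuming two schema columns mapped to a column family named f1, with one FAMILIES {} block per schema column (all names are placeholders):

    FAMILIES {
      SCHEMA_COLUMN : "id",
      FAMILY_COLUMN : "\"f1\""
    },
    FAMILIES {
      SCHEMA_COLUMN : "name",
      FAMILY_COLUMN : "\"f1\""
    },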

DIE_ON_ERROR

Set this parameter to true to stop the execution of the Job when an error occurs.

By default, this parameter is set to false.

Mandatory: No

USE_BATCH_MODE

Set this parameter to true to activate batch mode for data processing, and then use the BATCH_SIZE parameter to specify the number of records to be processed in each batch.

Mandatory: No
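
For instance, to stop the Job on errors and process records in batches of 1000 (the batch size is illustrative):

    DIE_ON_ERROR : "true",
    USE_BATCH_MODE : "true",
    BATCH_SIZE : "1000",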

HBASE_PARAMETERS {}

If you need to use custom configuration for your database, include in this function one or more sets of the following parameters to specify the property or properties to be customized. Then at runtime, the customized property or properties will override the corresponding ones used by the Studio.

  • PROPERTY: Type in the name of the property.
  • VALUE: Type in the new value of the property.

Mandatory: No
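
For example, to override a single HBase client property (the property name and value are illustrative, not recommendations):

    HBASE_PARAMETERS {
      PROPERTY : "\"hbase.client.write.buffer\"",
      VALUE : "\"4194304\""
    },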

FAMILY_PARAMETERS {}

Use one or more sets of the following parameters to type in the names and, if needed, the custom performance options of the column family or families to be created. These options are all attributes defined by the HBase data model; for further explanation, see the Apache HBase documentation.

  • FAMILY_NAME
  • FAMILY_INMEMORY
  • FAMILY_BLOCKCACHEENABLED
  • FAMILY_BLOOMFILTERTYPE
  • FAMILY_BLOCKSIZE
  • FAMILY_COMPACTIONCOMPRESSIONTYPE
  • FAMILY_COMPRESSIONTYPE
  • FAMILY_MAXVERSIONS
  • FAMILY_SCOPE
  • FAMILY_TIMETOLIVE

Mandatory: Yes
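
A sketch of one column family definition; only a few of the options listed above are set, and the value formats are assumptions to adapt to your cluster:

    FAMILY_PARAMETERS {
      FAMILY_NAME : "\"f1\"",
      FAMILY_INMEMORY : "true",
      FAMILY_MAXVERSIONS : "\"3\"",
      FAMILY_TIMETOLIVE : "\"86400\""
    },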

SET_MAPR_HOME_DIR

If the MapR configuration files have been moved to another location in the cluster, that is, if the MapR Home directory has changed, set this parameter to true and use the MAPR_HOME_DIR parameter to provide the new home directory.

Mandatory: No

SET_HADOOP_LOGIN

If the login module to be used in the mapr.login.conf file has been changed, set this parameter to true and use the HADOOP_LOGIN parameter to provide the module to be called from the mapr.login.conf file.

Mandatory: No

TSTATCATCHER_STATS

Set this parameter to true to gather the processing metadata at the Job level as well as at each component level.

By default, this parameter is set to false.

Mandatory: No

LABEL

Use this parameter to specify a text label for the component.

Mandatory: No
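
Putting a representative subset of these parameters together, a tHBaseOutput declaration could be sketched as follows. The surrounding addComponent {} and setComponentDefinition {} functions follow the general Job script structure described elsewhere in this guide, and every value below is a placeholder to replace with your own:

    addComponent {
      setComponentDefinition {
        TYPE: "tHBaseOutput",
        NAME: "tHBaseOutput_1",
        POSITION: 480, 160
      }
      setSettings {
        DISTRIBUTION : "HORTONWORKS",
        HBASE_VERSION : "HDP_2_6",
        ZOOKEEPER_QUORUM : "\"localhost\"",
        ZOOKEEPER_CLIENT_PORT : "\"2181\"",
        TABLE : "\"customers\"",
        TABLE_ACTION : "CREATE_IF_NOT_EXISTS",
        FAMILIES {
          SCHEMA_COLUMN : "id",
          FAMILY_COLUMN : "\"f1\""
        },
        FAMILIES {
          SCHEMA_COLUMN : "name",
          FAMILY_COLUMN : "\"f1\""
        },
        DIE_ON_ERROR : "false",
        LABEL : "write_customers_to_HBase"
      }
    }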