Set up an HCatalog database - 7.0

Big Data Job Examples

author
Talend Documentation Team
EnrichVersion
7.0
EnrichProdName
Talend Big Data
Talend Big Data Platform
Talend Data Fabric
Talend Open Studio for Big Data
Talend Real-Time Big Data Platform
EnrichPlatform
Talend Studio

Procedure

  1. Double-click the tHDFSDelete component, which is labelled HDFS_ClearResults in this example, to open its Basic settings view on the Component tab.
  2. Click the Property Type list box and select Repository, and then click the [...] button to open the [Repository Content] dialog box to use a centralized HDFS connection.
  3. Select the HDFS connection defined for connecting to the HDFS system and click OK.

    All the connection details are automatically filled in the respective fields.

  4. In the File or Directory Path field, specify the HDFS directory where the access log file will be stored, /user/hdp/weblog in this example.
  5. Double-click the first tHCatalogOperation component, which is labelled HCatalog_Create_DB in this example, to open its Basic settings view on the Component tab.
  6. Click the Property Type list box and select Repository, and then click the [...] button to open the [Repository Content] dialog box to use a centralized HCatalog connection.
  7. Select the HCatalog connection defined for connecting to the HCatalog database and click OK.

    All the connection details are automatically filled in the respective fields.

  8. From the Operation on list, select Database; from the Operation list, select Drop if exist and create.
  9. In the Option list of the Drop configuration area, select Cascade.
  10. In the Database location field, enter the location where the database file is to be created in HDFS, /user/hdp/weblog/weblogdb in this example.
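
For orientation, the effect of the two components configured above can be approximated outside Talend Studio. The following Python sketch only builds a comparable HDFS shell command and comparable HiveQL statements as text; it does not run them and does not claim to reproduce how tHDFSDelete or tHCatalogOperation work internally. The database name weblogdb is an assumption inferred from the location path, and the function names are hypothetical.

```python
def clear_results_command(path):
    """Build an HDFS shell command that removes a directory recursively."""
    # -r removes the directory tree; -f avoids an error if the path is absent.
    return ["hdfs", "dfs", "-rm", "-r", "-f", path]


def drop_and_create_database(db_name, location, cascade=True):
    """Return HiveQL statements for a drop-if-exists-then-create operation."""
    drop = f"DROP DATABASE IF EXISTS {db_name}"
    if cascade:
        # CASCADE also drops any tables that the database still contains.
        drop += " CASCADE"
    create = f"CREATE DATABASE {db_name} LOCATION '{location}'"
    return [drop, create]


if __name__ == "__main__":
    # Paths come from this example; "weblogdb" is an assumed database name.
    print(" ".join(clear_results_command("/user/hdp/weblog")))
    for stmt in drop_and_create_database("weblogdb", "/user/hdp/weblog/weblogdb"):
        print(stmt + ";")
```

Selecting Cascade in the Drop configuration area corresponds to the cascade flag in this sketch: without it, dropping a database that still contains tables would typically fail.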