
Converting a Hive table into an Iceberg table

The first step of this scenario is to convert the data from the customers_hive Hive table into an Iceberg table called customers_iceberg. The same actions are then repeated for the "marketing" tables with marketing_hive and marketing_iceberg.

About this task

For this task, the Converting subJob is used.

Procedure

  1. Optional: From the Basic settings view of tIcebergTable called Drop table, configure the parameters as follows:
    1. From the Connection drop-down list, select the connection component to be used. In this example it is tIcebergConnection_1.
    2. In the Table name field enter the name of the table to be removed. In this example it is customers_iceberg.
    3. From the Action on table drop-down list, select Drop if it exists to remove the table only if a table with the same name already exists.
      Note: This step is necessary only if a table with the same name already exists and you want to remove it in order to create a new one.
  2. From the Basic settings view of tIcebergTable called Create customers_iceberg, configure the parameters as follows to create the Iceberg table from an existing Hive table:
    1. From the Connection drop-down list, select the connection component to be used. In this example it is tIcebergConnection_1.
    2. In the Table name field enter the name of the table to be created. In this example it is "customers_iceberg".
    3. From the Action on table drop-down list, select Create if it does not exist to create the Iceberg table.
    4. Select the Create as select checkbox, and then in the As select query field, enter the SELECT query to be performed. In this example it is "SELECT * FROM customers_hive", which selects all the data from the customers_hive table.
    5. Select the format of your data from the drop-down list. In this example it is AVRO.
    6. Leave the other parameters as is.
  3. Optional: From the Basic settings view of tIcebergRow called v2 format, configure the parameters as follows to upgrade the Iceberg table to format version 2:
    1. From the Connection drop-down list, select the connection component to be used. In this example it is tIcebergConnection_1.
    2. In the Sql query field, enter the SQL query to perform. In this example, it is "ALTER TABLE customers_iceberg SET TBLPROPERTIES ('format-version' = '2')", which sets the format-version table property so that the table uses version 2 of the Iceberg format.
  4. Execute the subJob by clicking the Run button from the Run tab.
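
Each of the components configured above comes down to one SQL statement. As a rough sketch, the Python helper below assembles statements of that shape for any pair of tables, which can be handy when repeating the procedure for marketing_hive and marketing_iceberg. The function name, its parameters, and the exact form of the CREATE statement (STORED BY ICEBERG STORED AS ...) are assumptions based on Hive's Iceberg syntax; the SQL that tIcebergTable actually generates may differ. Only the SELECT and ALTER TABLE queries come from this example.

```python
import re

# Assumption: table names are plain identifiers (letters, digits, underscores).
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def conversion_statements(hive_table, iceberg_table, file_format="AVRO", format_version=2):
    """Return SQL statements mirroring the Converting subJob steps.

    Hypothetical helper for illustration only: it builds strings and does
    not connect to any database.
    """
    for name in (hive_table, iceberg_table):
        # Rejects names such as "customers-hive", where a hyphen was typed
        # instead of an underscore.
        if not _IDENTIFIER.match(name):
            raise ValueError(f"invalid table name: {name!r}")

    # Step 1 (optional): remove a leftover table with the same name.
    drop = f"DROP TABLE IF EXISTS {iceberg_table}"
    # Step 2: create the Iceberg table from the Hive table (assumed syntax).
    create = (
        f"CREATE TABLE IF NOT EXISTS {iceberg_table} "
        f"STORED BY ICEBERG STORED AS {file_format} "
        f"AS SELECT * FROM {hive_table}"
    )
    # Step 3 (optional): set the Iceberg format version.
    upgrade = (
        f"ALTER TABLE {iceberg_table} "
        f"SET TBLPROPERTIES ('format-version' = '{format_version}')"
    )
    return [drop, create, upgrade]


if __name__ == "__main__":
    pairs = [
        ("customers_hive", "customers_iceberg"),
        ("marketing_hive", "marketing_iceberg"),
    ]
    for hive_table, iceberg_table in pairs:
        for statement in conversion_statements(hive_table, iceberg_table):
            print(statement + ";")
```

Running the script prints the three statements for the customers tables, followed by the same three for the marketing tables.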

Results

The customers_iceberg Iceberg table is created.
You can verify that the Iceberg table was created by checking its properties in your database. In this example, Hue is used to view the Iceberg table properties.
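
If you prefer a scripted check to a visual one, the verification can be sketched as follows. This assumes you have already fetched the table properties as key/value pairs, for example with a SHOW TBLPROPERTIES customers_iceberg query. The function name and the expected values (table_type set to ICEBERG, and format-version set to 2 when the optional v2 format step was run) are assumptions to adapt to your environment.

```python
def missing_iceberg_properties(rows, expected=None):
    """Return the expected properties that are absent or different in rows.

    rows: iterable of (key, value) pairs describing the table properties.
    The result maps each mismatching key to the value actually found
    (None when the key is absent). An empty dict means every check passed.
    """
    if expected is None:
        # Assumed defaults; adjust them to what your catalog reports.
        expected = {"table_type": "ICEBERG", "format-version": "2"}
    actual = {str(key).strip(): str(value).strip() for key, value in rows}
    return {
        key: actual.get(key)
        for key, wanted in expected.items()
        if actual.get(key, "").upper() != wanted.upper()
    }


# Illustrative sample values, not real output from the example cluster.
sample_rows = [
    ("table_type", "ICEBERG"),
    ("format-version", "2"),
    ("write.format.default", "avro"),
]
print(missing_iceberg_properties(sample_rows))  # prints {}
```

If the optional v2 format step was skipped, pass an expected dictionary containing only the table_type entry.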
