In this step, you will retrieve the table schema of interest from the connected Hive database.
In the Repository view, right-click the Hive connection of interest and select
Retrieve schema from the contextual menu. In the wizard that opens, click Next
to view and filter the tables in that Hive database.
Expand the nodes of the database tables you need and select the columns
to be retrieved, then click Next to open a new view of the wizard that lists
the selected table schemas. Select any of them to display its details in the
Schema area on the right side of the wizard.
Warning: If a default value in your source database table is a function or an expression rather than a string, remove any single quotation marks enclosing that default value in the retrieved schema. Otherwise, creating database tables from this schema can produce unexpected results. For more information, see Verifying default values in a retrieved schema.
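The check described in the warning above can be sketched in code. The following Python snippet is only an illustration and is not part of the Studio: the function name unquote_expression_default, the keyword list, and the "looks like a function call" heuristic are all assumptions made for this example. It removes the enclosing single quotation marks from a default value only when the quoted content looks like a function call or a known SQL keyword, and leaves ordinary string defaults untouched.

```python
import re

# Assumption: a quoted default is treated as an expression when its content
# looks like a function call, e.g. now(), or is one of these SQL keywords.
_FUNCTION_CALL = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*\s*\(.*\)$")
_EXPRESSION_KEYWORDS = {"CURRENT_TIMESTAMP", "CURRENT_DATE"}


def unquote_expression_default(default: str) -> str:
    """Strip enclosing single quotes if the default looks like an expression.

    Hypothetical helper for illustration; plain string defaults are
    returned unchanged.
    """
    is_quoted = (
        len(default) >= 2 and default.startswith("'") and default.endswith("'")
    )
    if is_quoted:
        inner = default[1:-1]
        if _FUNCTION_CALL.match(inner) or inner.upper() in _EXPRESSION_KEYWORDS:
            return inner
    return default


if __name__ == "__main__":
    for value in ["'now()'", "'CURRENT_TIMESTAMP'", "'hello'", "0"]:
        print(value, "->", unquote_expression_default(value))
```

A heuristic like this can only flag likely candidates. Whether a given default is really an expression depends on the source table, so each flagged value still needs to be reviewed in the Schema area.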
Modify the selected schema if needed. In the Schema area, you can rename the
schema and customize its structure according to your needs.
The toolbar allows you to add, remove, or move columns in your schema.
To discard the modifications you made to the selected schema and restore its default schema, click Retrieve schema. Note that all your changes to the schema will be lost if you click this button.
Click Finish to complete the Hive table
schema retrieval. All the retrieved schemas are displayed under the related Hive
connection in the Repository view.
If you need to further edit a schema, right-click the schema and select Edit Schema from the contextual menu to open this wizard again and make your modifications.
Warning: If you modify the schemas, ensure that the data type in the Type column is correctly defined.
If you select the Repository option from the Hadoop cluster list to reuse the details of an established Hadoop connection, the created Hive connection is classified under both the Hadoop cluster node and the Db connection node.
Otherwise, if you select the None option from the Hadoop cluster list in order to enter the Hadoop connection properties yourself, the created Hive connection appears under the Db connection node only.