Cloudera Impala Hadoop HiveQL DDL - Import - 7.1

Talend Data Catalog Bridges

author
Talend Documentation Team
EnrichVersion
7.1
EnrichProdName
Talend Big Data Platform
Talend Data Fabric
Talend Data Management Platform
Talend Data Services Platform
Talend MDM Platform
Talend Real-Time Big Data Platform
EnrichPlatform
Talend Data Catalog

Bridge Specifications

Vendor: Cloudera
Tool Name: Impala Hadoop Hive Database
Tool Version: 0.13
Tool Web Site: http://www.cloudera.com/products/apache-hadoop/impala.html
Supported Methodology: [Relational Database] Data Store (Physical Data Model), (Expression Parsing) via SQL TXT File
Data Profiling
Multi-Model Harvesting
Incremental Harvesting
Remote Repository Browsing for Model Selection

SPECIFICATIONS
Tool: Cloudera Impala Hadoop Hive Database version 0.13 via SQL TXT File
See http://www.cloudera.com/products/apache-hadoop/impala.html
Metadata: [Relational Database] Data Store (Physical Data Model), (Expression Parsing)
Bridge: DdlScriptApacheHiveQLImport.ClouderaImpala version 11.0.0

OVERVIEW
WARNING: This database DDL SQL script import bridge is designed only for the DDL statements that create tables, views, etc., and is therefore limited. Whenever possible, use the dedicated live database import via JDBC instead, which generates a complete and detailed data flow lineage integrating all transformations with stored procedures, views, etc. (which may have been created by many such DDL SQL scripts).
The purpose of this HiveQL DDL script import bridge is to detect and parse all the SQL statements embedded in the script in order to generate the exact scope (data models) of the desired database.
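
For illustration, a minimal HiveQL DDL script of the kind this bridge parses might look as follows (the database, table, and view names are hypothetical):

CREATE DATABASE IF NOT EXISTS sales_db;

-- Table definition: imported as a table with its columns and data types
CREATE TABLE sales_db.orders (
    order_id    INT,
    customer_id INT,
    amount      DECIMAL(10,2),
    order_date  TIMESTAMP
)
STORED AS PARQUET;

-- View definition: the SELECT expression is parsed to link the view to its source table
CREATE VIEW sales_db.large_orders AS
SELECT order_id, customer_id, amount
FROM sales_db.orders
WHERE amount > 1000;

The bridge parses such statements to build the physical data model; statements other than DDL (for example DML or ad hoc queries) are outside its scope, as noted in the warning above.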


Bridge Parameters

Each parameter is listed below with its description, type, possible values, default value, and scope (where defined).

File
Description: Select a file that contains the DDL scripts to import.
Type: FILE
Values: *.sql, *.hql, *.ddl
Scope: Mandatory

Default schema
Description: The default schema name will be applied only to objects that do not have a schema qualifier defined.
Type: STRING
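
For example, assuming the Default schema parameter is set to a hypothetical value such as sales_db, an unqualified table definition like the following is imported as sales_db.customers, while tables created with an explicit schema qualifier keep their own schema:

CREATE TABLE customers (
    customer_id INT,
    name        STRING
);
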
Schemas
Description: You can specify a list of database schemas to import.
When the list is empty all available schemas are imported.
The list can have one or more schema names separated by semicolons (e.g. schema1; schema2).
You can specify schema name patterns using the SQL LIKE expression syntax.
Patterns support inclusions and exclusions.
Here is an example of inclusion syntax, "A%; %B; %C%; D" that tries to get schema names that:
- start with A or
- end with B or
- contain C or
- equal D

Note: when a pattern has special characters, such as spaces, enclose it in single quote marks (e.g. "'two wo%'; onewo%;").

To exclude a pattern, prefix it with NOT. Here is an example of exclusion syntax, "A%; %B; NOT %SYS; NOT 'SYS%'"
that contributes to the following SQL filter: "where (name like A% or name like %B) and (name not like %SYS) and (name not like 'SYS%')"
Type: STRING

Tables, Views
Description: You can specify a list of database tables to import.
When the list is empty all available tables are imported.
The list can have one or more table names separated by semicolons (e.g. table1; table2).
You can specify table name patterns using the SQL LIKE expression syntax.
Patterns support inclusions and exclusions.
Here is an example of inclusion syntax, "A%; %B; %C%; D" that tries to get table names that:
- start with A or
- end with B or
- contain C or
- equal D

Note: when a pattern has special characters, such as spaces, enclose it in single quote marks (e.g. "'two wo%'; onewo%;").

To exclude a pattern, prefix it with NOT. Here is an example of exclusion syntax, "A%; %B; NOT %SYS; NOT 'SYS%'" that
contributes to the following SQL filter: "where (name like A% or name like %B) and (name not like %SYS) and (name not like 'SYS%')"
Type: STRING

Miscellaneous
Description: Specify miscellaneous options, each identified by an -option name followed by a value if required:

GENERAL OPTIONS
-m <Java Memory's maximum size>
1G by default on a 64-bit JRE, or as set in conf/conf.properties, e.g.
-m 8G
-m 2500M

-j <Java Runtime Environment command line options>
This option must be the last one in the Miscellaneous parameter as all the text after -j is passed "as is" to the JRE, e.g.
-j -Dname=value -Xms1G

-jre <Java Runtime Environment full path name>
It can be an absolute path to javaw.exe on Windows or a link/script path on Linux, e.g.
-jre "c:\Program Files\Java\jre1.8.0_211\bin\javaw.exe"

-v <Environment variable value>
None by default, e.g.
-v var1=value1 -v var2="value2 with spaces"

-model.name <model name>
Override the model name, e.g.
-model.name "My Model Name"

-prescript <script name>
The script must be located in the bin directory, and have .bat or .sh extension.
The script path must not include any parent directory symbol (..).
The script should return exit code 0 to indicate success, or another value to indicate failure.
For example:
-prescript "script.bat"

SQL DDL OPTIONS
-s <pathname>
Path to the key/value pair file that is used to preprocess the script files and replace shell parameters with their actual values (an illustrative example follows this parameter's option list). The path can be enclosed in double quotes if it contains spaces or any special characters. The colon character (:) must be escaped with '\' if it is part of a parameter name. The file (referred to by the pathname) must have the following format:
key1:value1
"key 2":value2
\:key3withcolon:value3
E.g.
-s J:\MIMB\map_of_shell_parameters.txt

-e <encoding>
This value will be used to load text from the specified script files. By default, UTF-8 will be used. E.g.
-e UTF-16
-e UTF-16BE
-e US-ASCII
-e ISO-8859-1

-p <warehouse path>
Warehouse path. /user/hive/warehouse by default. E.g.
-p /user/hive/temp
Type: STRING
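
As an illustration of the -s option above, the following sketch shows a hypothetical shell parameter map file and a DDL fragment before and after preprocessing. The ${...} placeholder form shown here is only an assumption; use whatever shell parameter notation actually appears in your scripts.

Contents of a hypothetical map file (e.g. map_of_shell_parameters.txt):
db_name:sales_db
warehouse_dir:/user/hive/warehouse/sales_db

Script fragment before preprocessing:
CREATE DATABASE IF NOT EXISTS ${db_name}
LOCATION '${warehouse_dir}';

Script fragment after preprocessing:
CREATE DATABASE IF NOT EXISTS sales_db
LOCATION '/user/hive/warehouse/sales_db';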


Bridge Mapping

Mapping information is not available