Retrieving table schemas - Cloud - 7.3

Talend Studio User Guide

Version
Cloud
7.3
Language
English
Product
Talend Big Data
Talend Big Data Platform
Talend Cloud
Talend Data Fabric
Talend Data Integration
Talend Data Management Platform
Talend Data Services Platform
Talend ESB
Talend MDM Platform
Talend Real-Time Big Data Platform
Module
Talend Studio
Content
Design and Development
Last publication date
2024-02-13
Warning: If you are working on an SVN or Git managed project while the Manual lock option is selected in Talend Administration Center, be sure to lock manually your connection in the Repository tree view before retrieving or updating table schemas for it. Otherwise the connection is read-only and the Finish button of the wizard is not operable. For more information about locking and unlocking a project item and on different lock types, see Working collaboratively on project items.

Procedure

  1. To retrieve table schemas from the database connection you have just set up, right-click the connection item from the Repository tree view and select Retrieve schema from the contextual menu.
    A new wizard opens up where you can filter and show different objects (tables, views and synonyms) in your database connection, select tables of interest, and define table schemas.
  2. Define a filter to filter databases objects according to your need. For details, see Filtering database objects.
    Click Next to open a view that lists your filtered database objects. The list offers all the databases with all their tables present on the database connection that meet you filter conditions.
    If no database is visible on the list, click Check connection to verify the database connection.
  3. Select one or more tables on the list to load them onto your repository file system. Your repository schemas will be based on these tables.
  4. Click Next. On the next window, four setting panels help you define the schemas to create. Modify the schemas if needed.
    Warning: Avoid using any Java reserved keyword as a schema column name.
    Make sure the data type in the Type column is correctly defined.
    For more information regarding Java data types, including date pattern, see Java API Specification.
    Below are the commonly used Talend data types:
    • Object: a generic Talend data type that allows processing data without regard to its content, for example, a data file not otherwise supported can be processed with a tFileInputRaw component by specifying that it has a data type of Object.

    • List: a space-separated list of primitive type elements in an XML Schema definition, defined using the xsd:list element.

    • Dynamic: a data type that can be set for a single column at the end of a schema to allow processing fields as VARCHAR(100) columns named either as ‘Column<X>’ or, if the input includes a header, from the column names appearing in the header. For more information, see Dynamic schema.

    • Document: a data type that allows processing an entire XML document without regarding to its content.

    Warning: If your source database table contains any default value that is a function or an expression rather than a string, be sure to remove the single quotation marks, if any, enclosing the default value in the end schema to avoid unexpected results when creating database tables using this schema.
    Tip: If you find a certain data type of the database not yet supported by Talend , you can edit the mapping file for that database to enable conversion between the database data type and the corresponding Talend data type. For more information, see Type mapping.
    By default, the schema displayed on the Schema panel is based on the first table selected in the list of schemas loaded (left panel). You can change the name of the schema and according to your needs, you can also customize the schema structure in the schema panel.
    The tool bar allows you to add, remove or move columns in your schema. In addition, you can load an XML schema from a file or export the current schema as XML.
    To retrieve a schema based on one of the loaded table schemas, select the database table schema name in the drop-down list and click Retrieve schema. Note that the retrieved schema then overwrites any current schema and does not retain any custom edits.
    When done, click Finish to complete the database schema creation. All the retrieved schemas are displayed in the Table schemas sub-folder under the relevant database connection node.
    Now you can drag and drop any table schema of the database connection from the Repository tree view onto the design workspace as a new database component or onto an existing component to reuse the metadata. For more information, see Using centralized metadata in a Job and Setting a repository schema in a Job.