Selecting tables and defining table schemas - 7.1

Talend Big Data Studio User Guide

Author: Talend Documentation Team
Version: 7.1
Product: Talend Big Data
Task: Design and Development
Platform: Talend Studio

About this task

Once you have the filtered list of database objects, do the following to load the schemas of the desired objects into your Repository:

Procedure

  1. Select one or more database objects in the list and click Next to open a new view in the wizard where you can see the schemas of the selected objects.
    Note: If no schema is visible in the list, click the Check connection button below the list to verify the database connection status.

  2. Modify the schemas if needed.
    Make sure the data type in the Type column is correctly defined.
    For more information regarding Java data types, including date patterns, see Java API Specification.
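
    As a rough illustration of what a date pattern means for a Date column, the sketch below parses and re-formats a value with java.text.SimpleDateFormat. It assumes the pattern follows the SimpleDateFormat syntax described in the Java API Specification; the class name, the method name, the patterns and the sample values are all made up for this example and are not part of Talend Studio.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

// Illustrative only: class name, method name, patterns and values are assumptions.
class DatePatternSketch {

    // Parses text using inPattern, then formats the result using outPattern.
    // Throws IllegalArgumentException if the text does not match inPattern.
    static String reformat(String text, String inPattern, String outPattern) {
        SimpleDateFormat in = new SimpleDateFormat(inPattern);
        in.setLenient(false); // reject impossible dates instead of rolling them over
        try {
            Date parsed = in.parse(text);
            return new SimpleDateFormat(outPattern).format(parsed);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Value does not match pattern " + inPattern, e);
        }
    }

    public static void main(String[] args) {
        // dd = day of month, MM = month, yyyy = year
        System.out.println(reformat("25-12-2018", "dd-MM-yyyy", "yyyy/MM/dd"));
    }
}
```

    A value that does not fit the pattern, such as 31-02-2019 with dd-MM-yyyy, is rejected in this sketch because lenient parsing is switched off.
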
    Below are the commonly used Talend data types:
    • Object: a generic Talend data type that allows processing data without regard to its content. For example, a data file not otherwise supported can be processed with a tFileInputRaw component by specifying that it has a data type of Object.

    • List: a space-separated list of primitive type elements in an XML Schema definition, defined using the xsd:list element.

    • Dynamic: a data type that can be set for a single column at the end of a schema to allow processing fields as VARCHAR(100) columns named either ‘Column<X>’ or, if the input includes a header, after the column names appearing in the header. For more information, see Dynamic schema.

    • Document: a data type that allows processing an entire XML document without regard to its content.
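
    To make the List description above more concrete, the following minimal sketch splits the space-separated lexical form of an xsd:list of integers into its items. It only illustrates the XML Schema notion of a list value; XsdListSketch and parseIntList are hypothetical names, and this is not how Talend Studio itself handles the List type.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: XsdListSketch and parseIntList are hypothetical names.
class XsdListSketch {

    // Splits the lexical form of an xsd:list of integers on whitespace.
    static List<Integer> parseIntList(String lexical) {
        List<Integer> items = new ArrayList<>();
        String trimmed = lexical.trim();
        if (trimmed.isEmpty()) {
            return items; // an empty lexical form is an empty list
        }
        for (String token : trimmed.split("\\s+")) {
            items.add(Integer.valueOf(token));
        }
        return items;
    }

    public static void main(String[] args) {
        System.out.println(parseIntList("10 20  30")); // prints [10, 20, 30]
    }
}
```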

    Warning: If your source database table contains any default value that is a function or an expression rather than a string, be sure to remove the single quotation marks, if any, enclosing the default value in the end schema to avoid unexpected results when creating database tables using this schema. For more information, see Verifying default values in a retrieved schema.
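
    The effect described in the warning above can be sketched as follows. The example assumes a retrieved default value of 'CURRENT_TIMESTAMP' and builds a generic SQL column definition from it; the class, the helper methods, the column name and the SQL fragment are hypothetical and do not reproduce the statement Talend Studio generates.

```java
// Illustrative only: names and the SQL fragment are assumptions for this sketch.
class DefaultValueSketch {

    // Removes one pair of enclosing single quotation marks, if present.
    static String stripEnclosingQuotes(String value) {
        if (value.length() >= 2 && value.startsWith("'") && value.endsWith("'")) {
            return value.substring(1, value.length() - 1);
        }
        return value;
    }

    // Builds a generic column definition with a DEFAULT clause.
    static String columnDdl(String name, String type, String defaultValue) {
        return name + " " + type + " DEFAULT " + defaultValue;
    }

    public static void main(String[] args) {
        String retrieved = "'CURRENT_TIMESTAMP'";
        // Quoted: the database would treat the default as a string literal.
        System.out.println(columnDdl("created_at", "TIMESTAMP", retrieved));
        // Unquoted: the database would evaluate the default as a function.
        System.out.println(columnDdl("created_at", "TIMESTAMP", stripEnclosingQuotes(retrieved)));
    }
}
```

    With the quotation marks kept, the default is the literal text CURRENT_TIMESTAMP rather than the current time, which is the kind of unexpected result the warning refers to.
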
    Tip: If you find a certain data type of the database not yet supported by Talend, you can edit the mapping file for that database to enable conversion between the database data type and the corresponding Talend data type. For more information, see Type mapping.
    By default, the schema displayed on the Schema panel is based on the first table selected in the list of schemas loaded (left panel). You can change the name of the schema as needed. You can also customize the schema structure in the Schema panel.
    The toolbar allows you to add, remove or move columns in your schema. In addition, you can load an XML schema from a file or export the current schema as XML.
    To retrieve a schema based on one of the loaded table schemas, select the DB table schema name in the drop-down list and click Retrieve schema. Note that the retrieved schema then overwrites any current schema and does not retain any custom edits.
    When done, click Finish to complete the database schema creation. All the retrieved schemas will be saved in the corresponding schema folders under the relevant database connection node.
    Now you can drag and drop any table schema of the database connection from the Repository tree view onto the design workspace as a new database component or onto an existing component to reuse the metadata. For more information, see Using centralized metadata in a Job and Setting a repository schema in a Job.