Setting up a generic schema from an XML file - Cloud - 8.0

Talend Studio User Guide

Version: Cloud, 8.0
Language: English
Product: Talend Big Data, Talend Big Data Platform, Talend Cloud, Talend Data Fabric, Talend Data Integration, Talend Data Management Platform, Talend Data Services Platform, Talend ESB, Talend MDM Platform, Talend Real-Time Big Data Platform
Module: Talend Studio
Content: Design and Development
Last publication date: 2024-02-29

About this task

Warning:

The source XML file must be either a schema exported from Talend Studio or an XML file with the same tree structure. No other kind of XML file can be used to create a generic schema.

To create a generic schema from a source XML file, proceed as follows:

Procedure

  1. Right-click Generic schemas in the Repository tree view, and select Create generic schema from xml.
  2. In the dialog box that appears, select the source XML file that contains the schema and click Open.
  3. In the schema creation wizard that appears, enter a Name for the schema or keep the default one (metadata), and optionally add a Comment.
    The schema structure from the source file is displayed in the Schema panel. You can customize the columns in the schema as needed.
    Use the toolbar to add, remove, or move columns in your schema.
    Warning: Avoid using any Java reserved keyword as a schema column name.
    Make sure the data type in the Type column is correctly defined.
    For more information regarding Java data types, including date pattern, see Java API Specification.
    Below are the commonly used Talend data types:
    • Object: a generic Talend data type that allows processing data without regard to its content. For example, a data file that is not otherwise supported can be processed with a tFileInputRaw component by specifying that it has a data type of Object.

    • List: a space-separated list of primitive type elements in an XML Schema definition, defined using the xsd:list element.

    • Dynamic: a data type that can be set for a single column at the end of a schema. It allows fields to be processed as VARCHAR(100) columns, named either ‘Column<X>’ or, if the input includes a header, after the column names in that header. For more information, see Dynamic schema.

    • Document: a data type that allows processing an entire XML document without regard to its content.

  4. Click Finish to complete the generic schema creation. The created schema is displayed under the relevant Generic schemas node.
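
The warning in step 3 about Java reserved keywords can be checked before you edit the schema. The following is a minimal sketch and not part of Talend Studio: the class ColumnNameCheck and the method reservedColumnNames are hypothetical names chosen for illustration, and the column names are sample values. It relies only on the standard JDK method javax.lang.model.SourceVersion.isKeyword, which also reports the literals true, false, and null.

```java
import java.util.List;
import java.util.stream.Collectors;
import javax.lang.model.SourceVersion;

// Hypothetical helper (not a Talend API): flags schema column names
// that collide with Java reserved keywords or literals.
final class ColumnNameCheck {

    // Returns the subset of columnNames that the JDK treats as a
    // keyword or literal, in the original order.
    static List<String> reservedColumnNames(List<String> columnNames) {
        return columnNames.stream()
                .filter(SourceVersion::isKeyword)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Sample column names, assumed for illustration only.
        List<String> columns = List.of("id", "class", "name", "default", "int");
        System.out.println(reservedColumnNames(columns)); // prints [class, default, int]
    }
}
```

Any name this check returns should be renamed in the Schema panel, for example class to class_name. The check is case-sensitive, so it does not flag a name such as Class.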