
Finalizing the end schema

About this task

The generated schema displays the columns selected from the XML file. You can refine the schema further in this step.

File - Step 5 of 5 dialog box.

Procedure

  1. If needed, rename the metadata in the Name field (metadata, by default), add a Comment, and make further modifications, for example:
    • Redefine the columns by editing the relevant fields.

    • Add or delete a column using the add and remove buttons.

    • Change the order of the columns using the upward and downward buttons.

    Warning: Avoid using any Java reserved keyword as a schema column name.
    Make sure the data type in the Type column is correctly defined.
    For more information regarding Java data types, including date patterns, see the Java API Specification.
    Below are the commonly used Talend data types:
    • Object: a generic Talend data type that allows processing data without regard to its content. For example, a data file not otherwise supported can be processed with a tFileInputRaw component by specifying that it has a data type of Object.

    • List: a space-separated list of primitive type elements in an XML Schema definition, defined using the xsd:list element.

    • Dynamic: a data type that can be set for a single column at the end of a schema to allow processing fields as VARCHAR(100) columns named either as ‘Column<X>’ or, if the input includes a header, from the column names appearing in the header. For more information, see Dynamic schema.

    • Document: a data type that allows processing an entire XML document without regard to its content.

  2. If the XML file which the schema is based on has been changed, click the Guess button to generate the schema again. Note that if you have customized the schema, the Guess feature does not retain these changes.
  3. Click Finish. The new file connection, along with its schema, appears under the File XML node in the Repository tree view.
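
The warning in step 1 about Java reserved keywords can be checked outside the wizard before you type the column names. The following is a minimal sketch, not part of Talend Studio: the class name ColumnNameCheck, the method findReservedNames, and the sample column names are hypothetical. It relies only on the standard javax.lang.model.SourceVersion.isKeyword method, which reports Java keywords and the literals true, false, and null.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.lang.model.SourceVersion;

// Hypothetical helper, not a Talend API: flags candidate schema column
// names that collide with Java reserved keywords.
class ColumnNameCheck {

    // Returns the subset of columnNames that are Java keywords or literals.
    static List<String> findReservedNames(List<String> columnNames) {
        List<String> reserved = new ArrayList<>();
        for (String name : columnNames) {
            if (SourceVersion.isKeyword(name)) {
                reserved.add(name);
            }
        }
        return reserved;
    }

    public static void main(String[] args) {
        // Sample column names, chosen for illustration only.
        List<String> names = Arrays.asList("id", "class", "order_date", "int", "new");
        System.out.println(findReservedNames(names)); // prints [class, int, new]
    }
}
```

Any name the sketch reports should be renamed in the schema, for example class to class_name.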

Results

Now you can drag and drop the file connection or any of its schemas from the Repository tree view onto the design workspace as a new tFileInputXML or tExtractXMLField component, or onto an existing component to reuse the metadata. For further information about how to use the centralized metadata in a Job, see Using centralized metadata in a Job and Setting a repository schema in a Job.

To modify an existing file connection, right-click it from the Repository tree view, and select Edit file xml to open the file metadata setup wizard.

To add a new schema to an existing file connection, right-click the connection from the Repository tree view and select Retrieve Schema from the contextual menu.

To edit an existing file schema, right-click the schema from the Repository tree view and select Edit Schema from the contextual menu.
