Validating data flows against an XSD file - 7.3

XML validation

Version
7.3
Language
English
Product
Talend Big Data
Talend Big Data Platform
Talend Data Fabric
Talend Data Integration
Talend Data Management Platform
Talend Data Services Platform
Talend ESB
Talend MDM Platform
Talend Real-Time Big Data Platform
Module
Talend Studio
Content
Data Governance > Third-party systems > XML components > XML validation components
Data Quality and Preparation > Third-party systems > XML components > XML validation components
Design and Development > Third-party systems > XML components > XML validation components
Last publication date
2024-02-21

This scenario describes a Job that validates an XML column in the input file ShipOrder.csv against the XSD reference file ShipOrder.xsd and then outputs valid rows into the delimited file ShipOrder_Valid.csv and invalid rows and error messages into the delimited file ShipOrder_Invalid.csv. For a similar use case that validates an XML file, see Validating XML files.

For more technologies supported by Talend, see Talend components.

The content of the input file ShipOrder.csv that includes the XML column ShipOrder to be validated is as follows:

ID;ShipOrder
000001;<shiporder orderid="000001"><orderperson>George Bush</orderperson><shipto><name>John Adams</name><address>Oxford Street</address></shipto><item><title>Empire Burlesque</title><note>Special Edition</note><quantity>1</quantity><price>10.90</price></item></shiporder>
000002;<shiporder orderid="000002"><orderperson>Judy Liu</orderperson><shipto><name>Jack Liu</name><address>Wangfujing Street</address></shipto><item><title>Hide Your Heart</title><quantity>1</quantity><price>9.90</price></item></shiporder>
000003;<shiporder><orderperson>Peter Qian</orderperson><shipto><name>Thomas Wang</name><address>Wangfujing Street</address></shipto><item><title>The Power of Habit</title><quantity>1</quantity><price>8.99</price></item></shiporder>

The content of the XSD reference file ShipOrder.xsd is as follows:

<?xml version="1.0" encoding="ISO-8859-1" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
 <xs:element name="shiporder">
  <xs:complexType>
   <xs:sequence>
    <xs:element name="orderperson" type="xs:string"/>
    <xs:element name="shipto">
     <xs:complexType>
      <xs:sequence>
       <xs:element name="name" type="xs:string"/>
       <xs:element name="address" type="xs:string"/>
      </xs:sequence>
     </xs:complexType>
    </xs:element>
    <xs:element name="item" maxOccurs="unbounded">
     <xs:complexType>
      <xs:sequence>
       <xs:element name="title" type="xs:string"/>
       <xs:element name="note" type="xs:string" minOccurs="0"/>
       <xs:element name="quantity" type="xs:positiveInteger"/>
       <xs:element name="price" type="xs:decimal"/>
      </xs:sequence>
     </xs:complexType>
    </xs:element>
   </xs:sequence>
   <xs:attribute name="orderid" type="xs:string" use="required"/>
  </xs:complexType>
 </xs:element>
</xs:schema>