Data Format - 7.2

Talend Data Mapper User Guide

Talend Documentation Team
Talend Big Data Platform
Talend Data Fabric
Talend Data Management Platform
Talend Data Services Platform
Talend MDM Platform
Talend Real-Time Big Data Platform
Talend Studio

The data format describes how a data type is concretely encoded in the data. The default data format chooses the best format for the representation. For example, in the EDI representation, a date element that uses the default format takes its data format from the date qualifier in input documents. The representation specifies the byte order for the entire structure, so you do not need to specify a byte order data format on individual elements. However, you can specify a byte order on a specific element to override the one set by the representation.
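
As an illustration of what the byte order setting changes, the following minimal Python sketch encodes the same integer in both byte orders. It is not Talend code: the sample value and the `read_uint32` helper are hypothetical, and the `byte_order` parameter only stands in for the structure-wide default that an individual element may override.

```python
import struct

value = 0x12345678  # hypothetical sample integer

# Big endian: most significant byte first.
big = struct.pack(">I", value)
# Little endian: least significant byte first (Intel-style).
little = struct.pack("<I", value)

print(big.hex())     # 12345678
print(little.hex())  # 78563412


def read_uint32(data: bytes, byte_order: str = "big") -> int:
    """Decode a 4-byte unsigned integer.

    byte_order plays the role of the structure-wide default; passing a
    different value mimics overriding it for one element (assumption).
    """
    return int.from_bytes(data[:4], byteorder=byte_order)


# The same four bytes decode correctly only with the matching byte order.
print(read_uint32(big))               # 305419896
print(read_uint32(little, "little"))  # 305419896
```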

The following data formats are defined (organized by the data types to which they apply):
  • Integers
    • Default - Numbers are encoded as specified in the representation.

    • Little Endian (PC) - Numbers are encoded in binary in little endian byte format, as used by Intel processors.

    • Big Endian - Numbers are encoded in binary in big endian byte format.

  • Decimal
    • Default - Numbers are encoded as specified in the representation.

    • Character - Numbers are encoded as ASCII characters.

    • Packed Decimal - Numbers are encoded in packed decimal.

    • Zoned Decimal - Numbers are encoded in zoned decimal.

  • Float/Double
    • Default - Values are encoded using the IEEE 754 standard.

    • IBM Floating Point - Values are encoded using the IBM 360 mainframe standard.

  • Date and Date/Time
    • Default - Dates are encoded as specified in the representation.

    • The remaining date and date/time formats use CC for century, YY for year of century, MM for numeric month of the year, MMM for text month (first 3 characters of the month in English), WW for week of year, W for week of month, DD for day of month, and DDD for day of year. Other special designators are as noted. The week of year (WW) follows the ISO 8601 rules for week conversion. The date/time formats also use the abbreviations below for the Time type.

  • Time
    • Default - Encoded according to the rules for the representation.

    • The remaining time formats use HH for hour, MM for minute, SS for second, and DD for tenth of second.

  • Binary
    • Default - Specifies that the data appears in binary format.

    • Base64 - Encoded into characters with base-64 encoding. (Not currently supported)

    • Hex - Encoded into characters with hex encoding. (Not currently supported)

  • String
    • String (null term) - A string whose value ends with a byte of 0.
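
The packed and zoned decimal formats listed above can be sketched as follows. This is a minimal Python illustration, not Talend's implementation, and it assumes the common mainframe conventions, which this section does not spell out: packed decimal stores two digits per byte with the sign in the final half-byte, and zoned decimal stores one digit per byte with the sign in the upper half of the last byte (0xD or 0xB meaning negative). The function names and the `scale` parameter are hypothetical.

```python
from decimal import Decimal

NEGATIVE_SIGNS = {0x0B, 0x0D}  # assumed sign codes; anything else is positive


def _to_decimal(digits, negative, scale):
    """Build a Decimal from digit values, a sign flag, and implied decimals."""
    if any(d > 9 for d in digits):
        raise ValueError("invalid digit nibble")
    number = int("".join(str(d) for d in digits) or "0")
    if negative:
        number = -number
    return Decimal(number).scaleb(-scale)


def decode_packed_decimal(data: bytes, scale: int = 0) -> Decimal:
    """Two digits per byte; the last half-byte holds the sign (assumption)."""
    if not data:
        raise ValueError("empty input")
    nibbles = []
    for b in data:
        nibbles.append(b >> 4)
        nibbles.append(b & 0x0F)
    sign = nibbles.pop()
    return _to_decimal(nibbles, sign in NEGATIVE_SIGNS, scale)


def decode_zoned_decimal(data: bytes, scale: int = 0) -> Decimal:
    """One digit per byte; the last byte's upper half holds the sign (assumption)."""
    if not data:
        raise ValueError("empty input")
    digits = [b & 0x0F for b in data]
    sign = data[-1] >> 4
    return _to_decimal(digits, sign in NEGATIVE_SIGNS, scale)


print(decode_packed_decimal(bytes.fromhex("12345C"), scale=2))  # 123.45
print(decode_packed_decimal(bytes.fromhex("12345D")))           # -12345
print(decode_zoned_decimal(bytes.fromhex("F1F2F3D4")))          # -1234
```

Packed decimal needs roughly half the bytes of zoned decimal for the same number of digits, which is why both appear in mainframe record layouts.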

When converting dates from an encoding with no century to an encoding that requires a century, the Java standard rules for deriving the century are followed: the two-digit year is placed in the 100-year window that begins 80 years before the current date, that is, between 80 years before and 20 years after the current date.
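
The century rule can be sketched as follows. This is a simplified Python illustration rather than the actual Java logic: the function name `expand_two_digit_year` is hypothetical, and the sketch compares years only, whereas the Java rules also compare the month and day when the two-digit year falls exactly on the window boundary.

```python
from datetime import date


def expand_two_digit_year(yy: int, today: date) -> int:
    """Place a two-digit year in the 100-year window starting 80 years
    before `today` (year-level simplification, see lead-in)."""
    if not 0 <= yy <= 99:
        raise ValueError("expected a two-digit year")
    window_start = today.year - 80
    century = window_start - window_start % 100
    year = century + yy
    if year < window_start:
        year += 100  # too old for the window, so use the next century
    return year


reference = date(1997, 1, 1)  # fixed reference date for a repeatable result
print(expand_two_digit_year(12, reference))  # 2012
print(expand_two_digit_year(64, reference))  # 1964
```

With a reference date in 1997 the window runs from 1917 to 2016, so 12 becomes 2012 and 64 becomes 1964.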