Character Sets - Teradata Vantage

Teradata® VantageCloud Lake

Deployment: VantageCloud
Edition: Lake
Product: Teradata Vantage
Published: January 2023
Last edition: 2024-12-11

XML documents declare their encoding in the XML declaration. The XML type implementation parses XML and stores it in the database with the character data encoded in UTF-8. XML documents transferred from the client to the server in text format are expected to be encoded in UTF-8; the server-side XML type implementation ignores the encoding specified in the XML declaration. Similarly, XML type values transferred from the server to the client in text format are encoded in UTF-8.
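
Because the server expects UTF-8 for text-format transfers and ignores the declared encoding, a client holding a document in another encoding would need to transcode it before sending. The following is a minimal client-side sketch in Python, not part of any Teradata API. The function name `to_utf8_xml_text` is hypothetical, and the caller is assumed to know the source encoding. Rewriting the declaration is optional tidying, since the server ignores it.

```python
import re

# Matches the encoding pseudo-attribute inside a leading XML declaration.
_DECL_ENCODING = re.compile(
    r'^(<\?xml[^>]*?encoding\s*=\s*)(["\'])[^"\']*\2'
)


def to_utf8_xml_text(raw: bytes, source_encoding: str) -> bytes:
    """Transcode an XML document to UTF-8 for transfer in text format.

    Hypothetical helper: source_encoding is a Python codec name supplied
    by the caller (for example "latin-1" or "utf-16").
    """
    text = raw.decode(source_encoding)
    # Drop a leading byte order mark if the decoder left one in place.
    text = text.lstrip("\ufeff")
    # Optional: make the declaration agree with the bytes actually sent.
    text = _DECL_ENCODING.sub(r"\g<1>\g<2>UTF-8\g<2>", text, count=1)
    return text.encode("utf-8")
```

For example, a Latin-1 document declared as `encoding="ISO-8859-1"` would be returned as UTF-8 bytes with the declaration rewritten to `encoding="UTF-8"`.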

Xerces supports the following encodings out of the box:
  • ASCII
  • UTF-8
  • UTF-16 (big/little endian)
  • UCS4 (big/little endian)
  • EBCDIC code pages IBM037, IBM1047 and IBM1140
  • ISO-8859-1 (Latin1)
  • Windows-1252
In the base case, documents can be loaded and returned as either of the following types:
  • XML type, because Xerces supports UTF-8 out of the box
  • VARCHAR/CLOB, because the database handles the transcoding to UTF-8/UTF-16
    CLOB LATIN/UTF16 is supported only on the Block File System on the primary cluster. It is not available on the Object File System.

Documents loaded as BLOBs must use one of the supported encodings listed above.
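
A client could check a document's declared encoding against the supported list before loading it as a BLOB. The sketch below is illustrative only: the helper names are hypothetical, the alias spellings in `SUPPORTED` are assumptions rather than a confirmed list of names Xerces accepts, and the check reads only the declaration text, not the actual bytes.

```python
import re
from typing import Optional

# Encodings named in the list above, normalized to upper case with
# hyphens, underscores, and spaces removed. Alias spellings are assumptions.
SUPPORTED = {
    "ASCII", "UTF8", "UTF16", "UTF16BE", "UTF16LE",
    "UCS4", "IBM037", "IBM1047", "IBM1140",
    "ISO88591", "LATIN1", "WINDOWS1252",
}

# Captures the value of the encoding pseudo-attribute in the XML declaration.
_DECL = re.compile(r'^<\?xml[^>]*?encoding\s*=\s*["\']([^"\']+)["\']')


def normalize(name: str) -> str:
    """Reduce an encoding name to a comparable form, e.g. 'utf-8' -> 'UTF8'."""
    return re.sub(r"[-_\s]", "", name).upper()


def declared_encoding(xml_head: str) -> Optional[str]:
    """Return the encoding named in the XML declaration, or None if absent."""
    match = _DECL.match(xml_head)
    return match.group(1) if match else None


def is_supported(name: str) -> bool:
    """Report whether the encoding name appears in the supported list."""
    return normalize(name) in SUPPORTED
```

A document whose declaration names an encoding outside this set, such as Shift_JIS, would fail the check and could be transcoded to UTF-8 before loading.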