Processing Large XML Documents

Teradata® VantageCloud Lake

Deployment: VantageCloud
Edition: Lake
Product: Teradata Vantage
Published: January 2023
Language: English (United States)
Last Update: 2024-04-03

The XML type can hold values up to 2 GB in size; however, operations such as XSLT and XQuery are supported only on smaller documents that fit within the available memory. Methods that can be evaluated in one pass are supported on documents of all sizes. These methods allow a streaming implementation in which the XML document is traversed in one direction, without significant memory being needed to hold state information. They include the methods for parsing and validation, and a constrained implementation of XMLEXTRACT that evaluates the query on subtrees of the large XML document tree, so that the large document can be processed as a stream.
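The one-pass idea can be illustrated outside the database. The sketch below is a generic Python illustration, not the Teradata implementation and not the XMLEXTRACT syntax. It assumes a hypothetical document made of repeated `order` subtrees, and the helper name `stream_extract` is invented for this example. Each subtree is evaluated as soon as its closing tag is read and is then cleared, so memory use stays roughly proportional to one subtree rather than to the whole document.

```python
import io
import xml.etree.ElementTree as ET


def stream_extract(source, subtree_tag, child_path):
    """Yield the text at child_path for every subtree_tag element, in one forward pass.

    Hypothetical helper for illustration only; it is not a Teradata API.
    """
    # iterparse reads the input incrementally and reports each element
    # when its closing tag has been seen ("end" event).
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == subtree_tag:
            # Evaluate the small query against this subtree only.
            yield elem.findtext(child_path)
            # Drop the subtree's children, text, and attributes so they can be
            # reclaimed; the emptied element itself stays attached to its parent.
            elem.clear()


# Hypothetical sample document: a root element with repeated <order> subtrees.
doc = io.BytesIO(
    b"<orders>"
    b"<order><id>1</id><total>9.50</total></order>"
    b"<order><id>2</id><total>20.00</total></order>"
    b"</orders>"
)
totals = list(stream_extract(doc, "order", "total"))
print(totals)
```

The design choice mirrors the constraint described above: the query is restricted to what can be answered from a single subtree, which is what makes forward-only traversal possible.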

Operations such as XQuery and XSLT cannot be evaluated in a streamed manner. The documents on which they operate must be loaded into memory, and for large XML documents this can consume a significant amount of memory. For example, XSLT must load both the document and the stylesheet into memory and requires about 10 times as much memory as the document size. Similarly, the XQuery data model instance representing an XML document can occupy 20 to 30 times as much memory as the document occupies on disk. Therefore, these operations can be supported only on smaller documents that fit within the memory constraints.
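The multipliers above lend themselves to a quick back-of-the-envelope check. The following Python sketch is only a rough planning aid: the function name `estimate_memory_bytes` and the 50 MiB sample size are assumptions made for illustration, and the multipliers are the approximate figures quoted in the text (about 10 times for XSLT, 20 to 30 times for XQuery), not measured values.

```python
MIB = 1024 * 1024

# Approximate multipliers taken from the text above: (low, high) per operation.
MEMORY_MULTIPLIERS = {
    "xslt": (10, 10),    # about 10 times the document size
    "xquery": (20, 30),  # 20 to 30 times the on-disk document size
}


def estimate_memory_bytes(doc_bytes, operation):
    """Return a (low, high) estimate of the memory needed, in bytes.

    Hypothetical helper for illustration; actual usage depends on the document.
    """
    low, high = MEMORY_MULTIPLIERS[operation]
    return doc_bytes * low, doc_bytes * high


# Assumed example: a 50 MiB document.
doc_size = 50 * MIB
xslt_low, xslt_high = estimate_memory_bytes(doc_size, "xslt")
xquery_low, xquery_high = estimate_memory_bytes(doc_size, "xquery")

print(f"XSLT:   about {xslt_low // MIB} MiB")
print(f"XQuery: {xquery_low // MIB} to {xquery_high // MIB} MiB")
```

Under these assumptions, a 50 MiB document would need roughly 500 MiB for XSLT and 1,000 to 1,500 MiB for XQuery, which shows how quickly such operations can exceed a memory limit.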

You can use the XML_MemoryLimit setting with Teradata Active System Management (ASM) to limit the number of concurrent invocations of large-memory XML functions and methods.

Recommendation: If you expect document sizes to be large relative to the XML_MemoryLimit setting, consider XML shredding as an alternative way to store the XML content so that the data can be queried efficiently. Also consider XML shredding when query performance is important, because queries on larger documents perform worse than queries on smaller ones.

Related Information

Topic: Reference
XML shredding process: XML Shredding Based on a Schema
Teradata ASM and workload management