Processing Large XML Documents

XML Data Type

Deployment: VantageCloud, VantageCore
Edition: Enterprise, IntelliFlex, VMware
Product: Analytics Database, Teradata Vantage
Release Number: 17.20
Published: June 2022
Language: English (United States)
Last Update: 2023-10-30
Product Category: Teradata Vantage™

The XML type can hold values up to 2 GB in size. However, operations such as XSLT and XQuery are supported only on documents small enough to be processed in memory. Methods that can be evaluated in a single pass are supported on documents of all sizes. These methods use a streaming implementation in which the XML document is traversed in one direction, without requiring significant memory to hold state information. They include the methods for parsing and validation, and a constrained implementation of XMLEXTRACT that evaluates the query on subtrees of the large XML document tree, so that the large document can be processed as a stream.
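The one-pass, subtree-at-a-time idea can be illustrated outside the database. The following is a minimal sketch in Python using the standard library's `xml.etree.ElementTree.iterparse`. It is an analogy only, not the implementation used by the XML type or XMLEXTRACT, and the function name `stream_extract` and the sample document are hypothetical.

```python
import io
import xml.etree.ElementTree as ET


def stream_extract(source, tag, child):
    """Yield the text of <child> from each <tag> subtree, one subtree at a time."""
    # iterparse reads the input incrementally instead of building the full tree first.
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag == tag:
            # The subtree for this element is complete, so it can be queried now.
            yield elem.findtext(child)
            # Discard finished subtrees so memory use does not grow with document size.
            root.clear()


# Hypothetical sample; the loop is the same for a much larger document.
SAMPLE = (
    b"<orders>"
    b"<order><id>1</id><total>9.50</total></order>"
    b"<order><id>2</id><total>20.00</total></order>"
    b"</orders>"
)

totals = list(stream_extract(io.BytesIO(SAMPLE), "order", "total"))
print(totals)  # ['9.50', '20.00']
```

Because each completed subtree is queried and then discarded, the state held at any moment is bounded by the largest subtree rather than by the whole document.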

Operations such as XQuery and XSLT cannot be evaluated in a streamed manner; the documents on which they operate must be loaded into memory. For large XML documents, this can consume a significant amount of memory. For example, XSLT needs to load both the document and the stylesheet into memory, and requires about 10 times the document size in memory. Similarly, the XQuery data model instance representing an XML document can occupy 20 to 30 times the size of the document on disk. Therefore, these operations are supported only on documents small enough to fit within the memory constraints.
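The multipliers above lend themselves to a quick back-of-the-envelope check. The sketch below, in Python, applies the quoted factors (about 10 times for XSLT, 20 to 30 times for XQuery) to a document size and compares the result with a memory limit. The function names, the 50 MB document, and the 1 GB limit are hypothetical, and the limit is expressed in bytes purely for illustration; this does not reflect the actual unit or default of any database setting.

```python
MB = 1024 * 1024

# Multipliers quoted in the text; actual usage varies with document structure.
XSLT_FACTOR = 10
XQUERY_FACTOR_LOW = 20
XQUERY_FACTOR_HIGH = 30


def estimate_memory(doc_bytes):
    """Return rough in-memory footprints, in bytes, for a document of doc_bytes."""
    return {
        "xslt": doc_bytes * XSLT_FACTOR,
        "xquery_low": doc_bytes * XQUERY_FACTOR_LOW,
        "xquery_high": doc_bytes * XQUERY_FACTOR_HIGH,
    }


def fits_within(doc_bytes, limit_bytes):
    """Check the estimates against a limit, using the worst case for XQuery."""
    est = estimate_memory(doc_bytes)
    return {
        "xslt": est["xslt"] <= limit_bytes,
        "xquery": est["xquery_high"] <= limit_bytes,
    }


# Hypothetical example: a 50 MB document against a 1 GB (1024 MB) limit.
estimate = estimate_memory(50 * MB)
verdict = fits_within(50 * MB, 1024 * MB)
print(estimate["xslt"] // MB, estimate["xquery_high"] // MB)  # 500 1500
print(verdict)  # {'xslt': True, 'xquery': False}
```

In this example the XSLT estimate (about 500 MB) fits under the assumed limit, while the worst-case XQuery estimate (about 1500 MB) does not.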

You can specify the maximum amount of memory allowed for such operations using the XML_MemoryLimit DBS Control field setting. An error results if the XML processing operation requires more memory than allowed by this field.

You can also use the XML_MemoryLimit setting in conjunction with Teradata Active System Management (ASM) to limit the number of concurrent invocations of memory-intensive XML functions and methods. Using the Workload Designer portlet in Teradata Viewpoint, you can place filters, throttles, and classification criteria on XML functions and methods. This controls the amount of memory available to XML operations and prevents runaway queries from using up system resources and degrading performance.

Recommendation: If you expect document sizes to be large relative to the XML_MemoryLimit setting, consider XML shredding as an alternative method of storing the XML content, so that the data can be queried efficiently. Also consider XML shredding if query performance is important, because queries on larger documents generally perform worse than queries on smaller documents.
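To make the idea of shredding concrete, the following minimal sketch in Python breaks a small XML document into relational rows and then answers a question with plain SQL instead of an XML query. It uses the standard library's `sqlite3` and `xml.etree.ElementTree` as stand-ins. It is not the database's shredding facility; the table layout, the function name `shred_orders`, and the sample data are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample document.
SAMPLE = """<orders>
  <order id="1"><customer>Ann</customer><total>9.50</total></order>
  <order id="2"><customer>Bob</customer><total>20.00</total></order>
  <order id="3"><customer>Ann</customer><total>5.25</total></order>
</orders>"""


def shred_orders(xml_text, conn):
    """Store each <order> element as one row in a relational table."""
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)"
    )
    rows = [
        (int(order.get("id")), order.findtext("customer"), float(order.findtext("total")))
        for order in ET.fromstring(xml_text).iter("order")
    ]
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    return len(rows)


conn = sqlite3.connect(":memory:")
count = shred_orders(SAMPLE, conn)

# Once shredded, the data is queried relationally; no XML is parsed at query time.
ann_total = conn.execute(
    "SELECT SUM(total) FROM orders WHERE customer = ?", ("Ann",)
).fetchone()[0]
print(count, ann_total)  # 3 14.75
```

The parsing cost is paid once, when the document is shredded. Later queries touch only the rows and columns they need, which is why shredding can help when documents are large relative to the memory available for XML operations.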