Processing Large XML Documents

Teradata Vantage™ XML Data Type

brand: Software, Teradata Vantage
prodname: Teradata Database, Teradata Vantage NewSQL Engine
vrm_release: 16.20
category: Programming Reference
featnum: B035-1140-162K

The XML type can accommodate values up to 2 GB in size; however, operations such as XSLT and XQuery are supported only on smaller documents that fit within the available memory. Methods that can be evaluated in one pass are supported on documents of all sizes. These methods allow a streaming implementation in which the XML document is traversed in one direction without significant memory being needed to hold state information. They include the methods for parsing and validation, and a constrained implementation of XMLEXTRACT that evaluates the query on subtrees of the large XML document tree, which allows the large document to be processed as a stream.
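The one-pass, subtree-at-a-time idea can be illustrated outside the database. The following is a minimal sketch in Python, not Teradata's implementation of XMLEXTRACT: it assumes a hypothetical document of <order> elements containing <qty> values, and the function names stream_subtrees and total_quantity are made up for the example. It reads the input in one forward direction, evaluates a small query on each completed subtree, and then discards that subtree.

```python
import io
import xml.etree.ElementTree as ET


def stream_subtrees(source, tag):
    """Yield each completed <tag> subtree in one forward pass over source."""
    # iterparse reads the input incrementally rather than requiring a full parse up front.
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag == tag:
            yield elem
            # Detach processed subtrees from the root so they can be freed.
            root.clear()


def total_quantity(source):
    """Evaluate a small query (sum of <qty>) on one <order> subtree at a time."""
    total = 0
    for order in stream_subtrees(source, "order"):
        total += sum(int(q.text) for q in order.iter("qty"))
    return total


# Hypothetical sample document used only for illustration.
sample = b"<orders><order><qty>2</qty></order><order><qty>5</qty><qty>1</qty></order></orders>"
result = total_quantity(io.BytesIO(sample))
print(result)  # 8
```

Because each subtree is released after it is processed, memory use depends on the size of the largest subtree rather than on the size of the whole document, which is the property that makes streamed evaluation practical for very large inputs.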

Operations such as XQuery and XSLT cannot be evaluated in a streamed manner: the documents on which they operate must be loaded into memory. For large XML documents, this can consume a significant amount of memory. For example, XSLT must load both the document and the stylesheet into memory and requires about 10 times as much memory as the size of the document. Similarly, the XQuery data model instance representing an XML document can occupy 20 to 30 times as much memory as the document occupies on disk. Therefore, these operations are supported only on smaller documents that fit within the memory constraints.

You can specify the maximum amount of memory allowed for such operations using the XML_MemoryLimit DBS Control field setting. An error results if the XML processing operation requires more memory than allowed by this field.
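As a rough planning aid, the multipliers quoted above can be turned into a quick estimate of whether a document of a given size is likely to fit within a memory budget. The sketch below is in Python and is only back-of-the-envelope arithmetic: the function names are hypothetical, the 1 GB budget is an arbitrary example rather than a Teradata default, and the units in which XML_MemoryLimit is expressed are not assumed here.

```python
# Multipliers quoted in the text above; actual usage depends on the document.
XSLT_FACTOR = 10
XQUERY_FACTOR_RANGE = (20, 30)


def estimate_memory_bytes(doc_bytes):
    """Return rough in-memory footprints for XSLT and XQuery on one document."""
    low, high = XQUERY_FACTOR_RANGE
    return {
        "xslt": doc_bytes * XSLT_FACTOR,
        "xquery_low": doc_bytes * low,
        "xquery_high": doc_bytes * high,
    }


def fits(doc_bytes, limit_bytes):
    """Report, per operation, whether the worst-case estimate stays within limit_bytes."""
    est = estimate_memory_bytes(doc_bytes)
    return {
        "xslt": est["xslt"] <= limit_bytes,
        "xquery": est["xquery_high"] <= limit_bytes,
    }


MB = 1024 * 1024
estimate = estimate_memory_bytes(50 * MB)
check = fits(50 * MB, 1024 * MB)  # hypothetical 1 GB budget, not a Teradata default
print(estimate, check)
```

With these example numbers, a 50 MB document is estimated at about 500 MB for XSLT, which fits the budget, and up to about 1500 MB for XQuery, which does not.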

You can also use the XML_MemoryLimit setting in conjunction with Teradata Active System Management (ASM) to limit the number of concurrent invocations of memory-intensive XML functions and methods. Using the Workload Designer portlet in Teradata Viewpoint, you can place filters, throttles, and classification criteria on XML functions and methods. These controls limit the amount of memory available to XML operations and prevent runaway queries from using up system resources and degrading performance.

Recommendation: If you expect documents to be large relative to the XML_MemoryLimit setting, consider XML shredding as an alternative way to store the XML content so that the data can be queried efficiently. Also consider XML shredding if query performance is important, because queries on larger documents generally perform worse than queries on smaller documents.
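The general idea behind shredding, mapping XML content to relational rows once so that later queries run against tables instead of the XML, can be sketched as follows. This is a generic Python illustration using an in-memory SQLite table, not Teradata's shredding facility; the sample document, the order_items table, and the shred function are all hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample document used only for illustration.
doc = """<orders>
  <order id="1"><item sku="A1" qty="2"/><item sku="B7" qty="1"/></order>
  <order id="2"><item sku="A1" qty="5"/></order>
</orders>"""


def shred(xml_text):
    """Flatten each <item> into an (order_id, sku, qty) row."""
    root = ET.fromstring(xml_text)
    return [
        (int(order.get("id")), item.get("sku"), int(item.get("qty")))
        for order in root.iter("order")
        for item in order.iter("item")
    ]


rows = shred(doc)

# Store the rows relationally; later queries no longer touch the XML.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE order_items (order_id INTEGER, sku TEXT, qty INTEGER)")
conn.executemany("INSERT INTO order_items VALUES (?, ?, ?)", rows)
qty_by_sku = dict(
    conn.execute("SELECT sku, SUM(qty) FROM order_items GROUP BY sku ORDER BY sku")
)
print(qty_by_sku)  # {'A1': 7, 'B7': 1}
```

The XML is parsed once at load time, and every subsequent aggregate or lookup is an ordinary relational query whose cost does not depend on the size of the original document.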