XML shredding is the process of extracting values from XML documents to populate tables in the database. One way to define the mapping from the XML document's tree structure to the database's tabular structure is to use an annotated schema. An alternative is to use an XSLT stylesheet-based mapping.
For efficient memory use, streamed shredding does not load the entire document into memory. XSLT-based shredding loads the document in memory, giving it more flexibility than one-pass processing over the document.
XSLT-based shredding is accomplished using shredding stored procedures:
- XSLT_SHRED - used for single XML document shredding
- XSLT_SHRED_BATCH - used to shred multiple XML documents
Perform these typical tasks to shred XML documents.
- Load the XML document to be shredded into an XML column of a staging table.The XML document must conform to W3C XML standards. The following is an example of such an XML document.
<<?xml version="1.0"?> <Root> <predictixOfferMessage> <offer> <offerId>1000002</offerId> </offer> <mediaBlock> <mediaBlockId>90000000010002</mediaBlockId> </mediaBlock> </predictixOfferMessage> </Root>
- Define an SQL query that returns a result set with two columns:
- An ID column
- The XML column that contains the XML document to be shredded
- Create an XSLT stylesheet that defines how the XML document is shred to the target tables.
- Store the XSLT stylesheet so that it can be easily referenced; for example, store it in a stylesheet repository table.
- Call the XSLT_SHRED_BATCH stored procedure with the following arguments:
- The SQL query you defined in step 2
- The XSLT stylesheet you saved in step 4
The XML documents returned by the SQL query are shredded based on the mapping in the XSLT stylesheet. This results in the target tables being populated with data from these documents. NULL is inserted into target columns in some cases (for example, where the corresponding elements in the XML are empty or missing). NULL is stored if the given element path is missing in the input XML document.