The two common commercial uses for relational database management systems are online transaction processing (OLTP) and data warehousing (DW). The access patterns of these two approaches are very different and they make very different demands on the underlying SQL Engine.
The next few topics examine the different access patterns of OLTP and DW processing. Once the two styles of processing have been compared, you should readily recognize just how dramatically different they are.
OLTP
Consider the following simple example: A customer, Mr. Brown, orders a calendar. The graphic diagrams the actions taken against the database.
The transaction flow is outlined in the following process stages:
- A new order, 7326, is opened on the Order table.
- The remaining columns for the order are filled out, including the Customer number for Mr. Brown, 3, and the date the order was placed, April 5.
- A new shipping order, with order number 7326 and item number 3509, is opened on the Order Item Shipped table.
- The remaining columns for the shipper are filled out, including the quantity shipped, 1, and the date the order was shipped, April 5.
- The transaction is complete.
Three key aspects of this transaction deserve your attention:
- Only a few of many possible tables were accessed.
- None of the accessed tables, some of which might have billions of rows, were scanned.
- Very little I/O processing was required to complete the transaction.
These three attributes are characteristic of OLTP environments. The requirements are simple and involve only a quick-in, quick-out approach to the database. Many relational database engines are designed specifically for maximum efficiency with these kinds of workloads.
Data Warehousing
Consider the following simple decision support example using the same database to emphasize the contrast in processing requirements: A business analyst needs to find the answer to a specific question about pencil customers so he can target them for a promotion. The natural language query the analyst performs is this: which customers placed the majority of their orders for pencils in the first quarter of this year? The graphic diagrams the actions taken against the database.
The processing flow is outlined in the following process stages:
- The SQL Engine interrogates the Item table for the item number for pencils, which is 3421.
- The engine next scans the Order Item Shipped table looking for all orders associated with item number 3421.
- The particular row examined in this example shipped 0 pencils, indicating that the order was back ordered.
- The engine accesses the Order Item Backordered table row having an order and item number matching that examined in stage 3 and finds that 144 pencils were back ordered.
- The engine must now locate the customer who placed this order because the whole point of the query was to identify customers who had ordered pencils in the first quarter of the year.
To do this, the engine must access the Order table, where it finds that customer number 2 placed the order in question.
- The engine accesses the Customer table to identify the name of the customer associated with customer number 2, which is James.
- The large number of tables that had to be accessed in order to answer it.
This complex access path through many tables is a hallmark of decision support analysis. And remember: this is an extremely simple example!
- The tables were not just touched lightly in response to the query, but required massive searches, and sometimes multiple scans are required.
- Once the data required to answer the query has been gathered, it must be processed further using aggregation, joins, sorts, conditional requirements, and so on.
All this processing is heavily resource-intensive. The following table presents a summary contrast of DW and OLTP processing.
OLTP Transaction | Attribute | DW Query |
---|---|---|
No | Multiple tables scanned | Yes |
No | Large volumes of data examined | Yes |
No | Processing-intensive | Yes |
No | Response time is a function of database size | Yes |