Users query their databases not because they need an answer as an end result; rather, they want guidance for performing a business action that will have optimal decisive effects. The actions taken cause your bottom line to change, not the answers that informed those actions.
A query can be simple or it can be complex. The answers provided by simple queries tend to be more expected than not, while the answers provided by complex queries tend to produce far less certain answers that are that much richer for resolving the questions they pose.
Consider the following case example from the financial industry. A bank offers its customers a financial instrument that a competitor has decided to offer “without charge.” This challenge must be responded to quickly or the bank will lose many of its customers who are currently paying what they see as an unnecessary fee for the financial instrument under discussion.
Suppose the bank responds to this situation by asking a simple question such as “which current customers use this product”? The most likely response to this information, which is merely a list of current customers, would be to eliminate the fee for the product. While this action is likely to forestall the erosion of the customer base, it also has the negative effect of reducing profits for the bank.
Suppose, instead, that the bank asks a more sophisticated question such as “which current customers using this product would remain profitable clients even if the fee were eliminated”? The bank now knows not only which consumers of its financial instrument are profitable for reasons other than their consumption of that product, but also knows which consumers do not otherwise contribute to the bottom line. The latter customer set can be released to the competition, which, by the way, almost certainly does not know that the new customers it is luring away from our bank are not profitable.
The impact of this more complex query is profit maximization, and its example illustrates very clearly the value of complex over simple queries.
Relationship Between Query Complexity and the Value of Its Answer
A simple query typically accesses only a few tables, as illustrated by the following graphic.
The simplicity of such a query maps directly into the simplicity of the answer it returns. In other words, simple queries tend to deliver low-value answers which, in turn, enable low-value actions.
More complex queries, on the other hand, investigate the multivariate relationships among many tables in search of the high-value answers that come from mining the many interrelationships among the tables accessed. The following graphic illustrates the concept of a moderately complex query. The example winds its way through four individual tables, accessing one of them several times. A more realistic complex query could easily access hundreds of tables, including as many as 128 of them in a single join!
Query complexity exerts a processing an I/O burden that many commercial relational database management systems are not capable of handling, which is the principal reason that most data warehouse vendors advocate the use of summary data.
The relationship between query complexity and the extent of detail in the database is direct and profound. While summary data is often good enough to answer simple queries, it cannot deliver the sorts of answers that more complex queries seek. This is an extremely important concept to understand before designing your databases, because you need to provide the level of detail in the data that can deliver answers to the types of questions you will be asking.
Valuable Information and Time
The value of information is often inversely proportional to the length of time required to derive it.
As the following graphic illustrates, the more sophisticated the analysis, the less expected the answers obtained. More explicitly, this principle can be stated as the more complex the query, the more likely that heretofore unknown information hidden in the data is revealed.
Sophisticated explorations of the data universe might take longer to produce results, but when those results are finally produced, they are invaluable to the business. A quick response is a minimal requirement for simple queries, but such queries rarely provide a business-critical response, and designing a database to ensure nothing but quick responses is the quickest path to failure for your data warehouse project. The parallel architecture built into Vantage ensures that all queries are answered in an optimal time frame.
Your data warehouse can be a source of unimaginable information richness, but only if it is designed with the thought that any possible question, no matter how involved or abstract, should be possible to answer as readily as a simple query.