17.10 - Scalability Considerations for Tactical Queries - Advanced SQL Engine - Teradata Database

Teradata Vantage™ - Database Design

Product: Advanced SQL Engine, Teradata Database
Release Number: 17.10
Release Date: July 2021
Content Type: User Guide
Publication ID: B035-1094-171K
Language: English (United States)

Scalability Is a Relative Concept

In traditional decision support applications, setting the stage for scalable performance with Vantage involves distributing the work equally across all the AMPs in the system. As more nodes are added to a system, the number of AMPs increases proportionally, and the effort to process a complex query is spread across more parallel units, reducing response times. The following plot indicates linear scalability, graphing the total workload accomplished as a function of the number of nodes in the system. Strategic queries run proportionally faster as more nodes are added to a system.


Linear scalability

Spreading the data evenly across all AMPs remains the goal of designing the database to support tactical queries. But to achieve the type of scalability that best supports tactical queries, you need to localize the work so that each query accesses only a few rows and uses the fewest resources possible. This means accessing the smallest number of AMPs possible to perform an operation, ideally only one. The following figure indicates that when tactical queries access only a few AMPs, more of them can run concurrently.


Strategic and tactical query comparison
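
The difference in AMP usage can be illustrated with a toy model. The following Python sketch is not Vantage code: the system size `NUM_AMPS`, the MD5-based stand-in hash, and the function names are all hypothetical, chosen only to show that a lookup on the key that determines row placement involves one AMP, while a full table scan involves every AMP.

```python
import hashlib

NUM_AMPS = 8  # hypothetical system size, for illustration only


def amp_for(key: str) -> int:
    """Map a key to an AMP using a stand-in hash (not the hashing Vantage uses)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % NUM_AMPS


def amps_for_key_lookup(key: str) -> set:
    """A lookup on the distribution key needs only the AMP that owns that key."""
    return {amp_for(key)}


def amps_for_full_scan() -> set:
    """A full table scan needs every AMP, because rows are spread across all of them."""
    return set(range(NUM_AMPS))


print(len(amps_for_key_lookup("customer-42")))  # 1
print(len(amps_for_full_scan()))                # 8
```

In this model, the seven AMPs not touched by the keyed lookup remain free to serve other tactical queries at the same time.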

Effect of Data Volume Growth on Tactical Query Response Times

Query times for complex decision support queries are directly affected by increases in data volume. Because a large portion of a table, if not all of it, might need to be read, doubling the size of that table can also double the effort required to process the work. On the other hand, a query that accesses one or a few rows in the database is not affected by changes to the data volume. In most cases, a tactical query accesses one or a few rows whether the table those rows are in is 1 GB or 10 TB in size.

Effect of Growth in Concurrent Users on Tactical Query Response Times

Raising the level of concurrent users doing the same work in traditional decision support tends to slow response times for the current users because more demands are now being made on the same resources, and all are spread out across all nodes and AMPs in the system. In contrast, increasing the number of users performing tactical queries that are very localized and limited in their resource use boosts overall throughput up to the point of system saturation.

Effect of Configuration Expansion on Tactical Query Response Times

For a complex query, adding nodes translates to a proportional decrease in response time because the work is distributed across a larger number of AMPs. At the same time, the response time for a query that accesses a single AMP, as many tactical queries do, is not affected by the number of AMPs in the configuration.

There is one benefit gained by single-AMP tactical queries when nodes are added to a configuration: more of them can be performed at the same time and still deliver short turnaround times.

Consider the following contrived example. Assume you have a table with a cardinality of 1 million rows, and that each node in your configuration can read 100 rows per second. If you run a complex strategic query that involves a full table scan of this table, the response time for the query decreases in inverse proportion to the number of nodes, as illustrated by the data in the following table:

Number of Nodes    Number of Rows per Node    Response Time (seconds)
              1                  1,000,000                     10,000
             10                    100,000                      1,000
            100                     10,000                        100
            200                      5,000                         50

This performance enhancement occurs because each node has fewer rows as more nodes are added, so each node can perform its portion of the scan faster.
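
The arithmetic behind the full table scan figures can be reproduced with a short Python sketch. It uses only the numbers from the contrived example (1 million rows, 100 rows read per second per node) and assumes the rows are spread perfectly evenly; the constant and function names are illustrative.

```python
# Contrived model from the example: every node reads 100 rows per second and
# the table's 1,000,000 rows are spread evenly across the nodes.
TABLE_ROWS = 1_000_000
ROWS_PER_SECOND_PER_NODE = 100


def full_scan_seconds(nodes: int) -> float:
    """Response time of a full table scan when each node scans its share in parallel."""
    rows_per_node = TABLE_ROWS / nodes
    return rows_per_node / ROWS_PER_SECOND_PER_NODE


for nodes in (1, 10, 100, 200):
    print(f"{nodes:>4} nodes: {full_scan_seconds(nodes):>8,.0f} s")
```

The loop prints 10,000, 1,000, 100, and 50 seconds for 1, 10, 100, and 200 nodes.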

On the other hand, if your application is performing single- or few-AMP tactical queries, adding nodes does not shorten the response time. However, it does increase the number of tactical queries that can be performed in a given interval of time, as indicated by the data in the following table:

Number of Nodes    Response Time (seconds)    Throughput (requests/second)
              1                       0.01                             100
             10                       0.01                           1,000
            100                       0.01                          10,000
            200                       0.01                          20,000
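
The throughput figures follow from the same kind of simple model. The Python sketch below assumes, as the table implies, that each request takes 0.01 seconds regardless of configuration size and that each node completes one request at a time; that serial-per-node assumption and the names used are illustrative, not a description of how Vantage schedules work.

```python
# Contrived model: a single-AMP tactical query always takes 0.01 seconds,
# and each node is assumed to complete one such request at a time.
SINGLE_AMP_RESPONSE_SECONDS = 0.01


def tactical_throughput(nodes: int) -> float:
    """Requests per second the configuration can complete under this model."""
    requests_per_second_per_node = 1 / SINGLE_AMP_RESPONSE_SECONDS
    return nodes * requests_per_second_per_node


for nodes in (1, 10, 100, 200):
    print(f"{nodes:>4} nodes: {SINGLE_AMP_RESPONSE_SECONDS} s per request, "
          f"{tactical_throughput(nodes):>8,.0f} requests/s")
```

Response time stays constant at 0.01 seconds in every row, while throughput grows linearly with the number of nodes.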