MAPS Architecture | Database Design | Teradata Vantage - Maps - Advanced SQL Engine

MAPS Architecture | Database Design | Teradata Vantage - Maps - Advanced SQL Engine - Teradata Database

Database Design

Product

Advanced SQL Engine

Teradata Database

Release Number

17.10

Published

July 2021

Language

English (United States)

Last Update

2021-07-27

dita:mapPath

kko1591750222108.ditamap

dita:ditavalPath

kko1591750222108.ditaval

dita:id

B035-1094

lifecycle

Product Category

Teradata Vantage™

Under Vantage MAPS Architecture, every table uses a map that specifies which AMPs store the rows of the table. The map associates each AMP with a set of hash values. For each row of a table with a PI or PA, the values of the index columns are combined and hashed. Based on the map, the resulting hash value determines which AMP receives the row.

Tables and join indexes are assigned a map either explicitly or by default when they are created. For tables with fallback, the primary and fallback copies of the row are stored on different AMPs in the map. Secondary index tables use the same map as their base (indexed) table.

Vantage uses two types of maps to track which rows of a table belong on which AMP:

Contiguous map: This is a type of map that includes all AMPs within a specified range. Vantage creates contiguous maps during a system initialization or reconfiguration. Contact Teradata Support Center personnel for help with defining contiguous maps using the Configuration (config) and Reconfiguration (reconfig) utilities.

Sparse map: This is a type of map that includes a subset of AMPs from a contiguous map. Vantage provides two sparse maps you can use, and you can also create new sparse maps if you have appropriate database privileges.

For more information on a creating a sparse map, see Teradata Vantage™ - SQL Data Definition Language Syntax and Examples, B035-1144.

Use the DBC.MapsVX view to see a listing of the maps that are available to you. Users with appropriate permissions can create new maps.

In the cases of small tables, data distribution using a large contiguous map may be inefficient and introduce potential skew to the system.

Consider a table that has only 10 rows, and that uses a 1000-AMP contiguous map. A query requiring a full-table scan would require all 1,000 AMPs to attempt to read their rows. The 990 AMPs that store none of the table rows would consume and waste system resources only to determine that they have no rows to process.
Also, consider a table that has 10000 rows on this map. While rows may be evenly distributed, each AMP would only have about 10 rows; this may be inefficient to have 1,000 AMPs reading a small amount of data and processing it.

In these cases, you can specify in a CREATE or ALTER statement that a small table or join index use a sparse map that includes only a subset of the AMPs in a contiguous map.

Sparse Maps and Table Colocation

Rows from tables that have the same PI or PA and that share the same contiguous map are distributed to the same AMPs. This is called table colocation. Colocation provides a performance advantage when tables are joined on their PI or PA columns, because the join processing for corresponding rows happens within the same AMP. This avoids the row replication and redistribution steps that would be required if the corresponding rows from the different tables in the join were stored on different AMPs.

For tables using sparse maps, there is an additional requirement for joins to take advantage of colocation.

To create a sparse map, you specify the parent contiguous map, and the number of AMPs to be included in the new sparse map. The AMPs in the sparse map will be a subset of the AMPs in the parent contiguous map. The specific AMP subset is variable; each table that uses the same sparse map may have its rows distributed to a different subset of AMPs. For example, if many small tables use the same single-AMP sparse map, the data from these tables will not all be distributed to the same AMP. This helps to avoid a skewed data distribution among the AMPs in the parent contiguous map.

Because tables using the same multi-AMP sparse map may not be stored on the same subset of AMPs, even if these tables have the same PI or PA, their corresponding rows would not benefit from colocation during joins. You can force rows of frequently-joined tables to be distributed to the same subset of AMPs by specifying a colocation name when you associate the tables with the sparse map. The colocation name forces tables that use the same sparse map to be stored on the same subset of AMPs, providing the same performance improvement for joins that tables using the same contiguous maps have.

Tools for Managing Maps

Some tables remain fixed in size after they are created and loaded, while others may change frequently. A table should use a map that has a number of AMPs appropriate to the table size, leading to an even distribution of the table rows among the AMPs in the map. Each AMP in the map should have a nearly equal number of rows to process. Tables that use contiguous maps are managed by Vantage. Contiguous maps can be changed and created only during a system reconfiguration. However, tables that use sparse maps should be periodically reassessed to ensure they use a map appropriate for the current table size and anticipated growth.

Teradata provides tools to help you manage maps. These tools can help you evaluate whether tables and maps are appropriately matched and move tables to different maps as necessary. They are especially useful when you expand your database system storage and add AMPs because these tools can migrate existing tables to new maps that take advantage of the new storage and processing capacity. You can use these tools to avoid a full system reconfiguration process, which could require a significant maintenance window to take the system offline and redistribute the table rows among the new AMPs.

Teradata Viewpoint includes portlets that provide a web-based graphical interface to perform many map management tasks. For more information about these portlets, see Teradata® Viewpoint User Guide, B035-2206.
Teradata also provides views and SQL procedures that allow you to manage maps from an SQL command line tool, such as Teradata Studio or BTEQ. The tools consist of two main groups of related procedures:
- Advisor procedures analyze user tables, optionally using information from DBQL-logged query plans, and make recommendations about moving tables to new maps.
- Mover procedures take the output of Advisor procedures and use it to move tables to new maps. This process generates ALTER TABLE statements to assign different maps to tables, and can use multiple concurrent worker sessions to both speed the process and to maintain database availability for other tasks during the changes with a minimal impact on system performance.
For more information on the Advisor and Mover tools and SQL procedures, see Teradata Vantage™ - Database Administration, B035-1093 and Teradata Vantage™ - SQL Operators and User-Defined Functions, B035-1210.
To determine what maps are currently defined on the system, query the DBC.MapsV or DBC.MapsVX views. For more information on Data Dictionary views, see Teradata Vantage™ - Data Dictionary, B035-1092.