Data Distribution Considerations for Choosing a PI or PA

Teradata® VantageCloud Lake

Deployment: VantageCloud
Edition: Lake
Product: Teradata Vantage
Published: January 2023
Language: English (United States)
Last Update: 2024-04-03

A map associates a unique set of hash values with each AMP on which the table data is stored. For a primary index (PI) or primary AMP index (PA), the map also determines which AMP receives a table row, based on a hash value calculated from the values of the index columns in that row. See Maps.
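The relationship between index values, hash values, and AMPs can be illustrated with a toy model. The following Python sketch is not the Vantage hashing algorithm: the MD5 stand-in hash, the 16 buckets, the four AMP names, and the helper names `row_hash` and `amp_for_row` are all assumptions chosen for illustration.

```python
import hashlib

# Toy sizes, assumed for illustration only.
NUM_BUCKETS = 16
AMPS = ["AMP0", "AMP1", "AMP2", "AMP3"]

# The "map": every hash bucket is owned by exactly one AMP,
# so each AMP ends up with its own unique set of hash buckets.
BUCKET_TO_AMP = {bucket: AMPS[bucket % len(AMPS)] for bucket in range(NUM_BUCKETS)}


def row_hash(index_values):
    """Stand-in hash over the index column values (hypothetical, not Vantage's)."""
    key = "|".join(str(v) for v in index_values).encode("utf-8")
    return int.from_bytes(hashlib.md5(key).digest()[:4], "big")


def amp_for_row(index_values):
    """Return the AMP that would store a row with these index column values."""
    bucket = row_hash(index_values) % NUM_BUCKETS
    return BUCKET_TO_AMP[bucket]


if __name__ == "__main__":
    # Rows with the same index values always land on the same AMP.
    for customer_id in (1001, 1002, 1003, 1001):
        print(customer_id, "->", amp_for_row((customer_id,)))
```

In this model, the AMP that receives a row depends only on the index column values, which is why the choice of index columns controls how rows are spread across AMPs.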

When the database receives a request, such as a query or data insertion, each AMP processes its own portion of the table data, and the AMPs work in parallel. If one or more AMPs have significantly more or less data to process than the others, the database is said to have a skewed data distribution, and performance suffers. A request cannot complete until the AMPs with the most data have finished their work, which reduces the efficiency and benefits of parallel processing. By choosing an appropriate PI or PA, you can help ensure that table rows are distributed evenly among the AMPs defined by the map that the table uses, and take full advantage of Vantage parallel processing.
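The effect of index choice on skew can be sketched with a small simulation. This is a simplified model, not Vantage behavior: the four AMPs, the MD5 stand-in hash, the sample columns, and the helpers `amp_for`, `rows_per_amp`, and `skew_ratio` are assumptions made for illustration. The skew ratio here is defined as the largest AMP row count divided by the average row count per AMP, so 1.0 means a perfectly even distribution.

```python
import hashlib
from collections import Counter

NUM_AMPS = 4  # toy value, assumed for illustration


def amp_for(value):
    """Map an index value to an AMP number with a stand-in hash (not Vantage's)."""
    digest = hashlib.md5(str(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % NUM_AMPS


def rows_per_amp(index_values):
    """Count how many rows each AMP would store, including AMPs with no rows."""
    counts = Counter(amp_for(v) for v in index_values)
    return [counts.get(amp, 0) for amp in range(NUM_AMPS)]


def skew_ratio(counts):
    """Largest AMP row count divided by the average row count per AMP."""
    return max(counts) / (sum(counts) / len(counts))


# Candidate index 1: a unique identifier (many distinct values).
unique_ids = range(10_000)
# Candidate index 2: a status column with only two distinct values.
statuses = ["OPEN" if i % 2 == 0 else "CLOSED" for i in range(10_000)]

even = rows_per_amp(unique_ids)
skewed = rows_per_amp(statuses)

if __name__ == "__main__":
    print("unique id :", even, "skew ratio", round(skew_ratio(even), 2))
    print("status    :", skewed, "skew ratio", round(skew_ratio(skewed), 2))
```

Because the status column has only two distinct values, at most two of the four AMPs in this model can receive rows, so the busiest AMP holds at least twice the average. The unique identifier spreads rows nearly evenly, which is the property to look for in a PI or PA candidate.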