17.05 - Usage Notes - Advanced SQL Engine - Teradata Database

Teradata Vantage™ - SQL Data Manipulation Language

Product
Advanced SQL Engine
Teradata Database
Release Number
17.05
Release Date
January 2021
Content Type
Programming Reference
Publication ID
B035-1146-175K
Language
English (United States)

DISTINCT and GROUP BY

For cases when the DISTINCT is semantically equivalent to GROUP BY, the optimizer makes a cost-based decision to eliminate duplicates either by way of a combined sort and duplicate elimination or an aggregation step.

DISTINCT Operator and UDTs

If you specify the DISTINCT operator and the select list specifies a UDT column, the database applies the ordering functionality of that UDT to achieve distinctness.

Depending on the ordering definition, the same query can return different results, similar to how the results of performing case-insensitive DISTINCT processing on character strings can vary. Issuing such a query twice can return different results.
For a UDT example, consider a structured UDT named CircleUdt composed of these attributes:
  • x INTEGER
  • y INTEGER
  • r INTEGER

Suppose that the external type for CircleUdt is a character string containing the ASCII form of the x, y, and r components. The ordering functionality is defined to map and compare the r components of two instance of CircleUdt.

These circles are considered to be equal because their r values are both 9:
  • NEW CircleUdt('1,1,9'), where x=1, y=1, and r=9
  • NEW CircleUdt('6,10,9')

The system sometimes returns the '1,1,9' result and at other times returns the '6,10,9' result as part of the DISTINCT result set.

DISTINCT Operator and Large Objects

You cannot include LOB columns in the select list of a SELECT request if you also specify DISTINCT.

You can specify LOB columns with DISTINCT if you CAST them to an appropriate data type, as documented in this table:

IF the LOB column has this data type … THEN you can reference it with DISTINCT if you CAST it to this data type …
BLOB
  • BYTE
  • VARBYTE
CLOB
  • CHARACTER
  • VARCHAR

SQL Elements That Cannot Be Used With a DISTINCT Operator

The following SQL elements cannot be specified with a request that also specifies a DISTINCT operator.
  • WITH request modifier
  • TOP n operator
  • The recursive statement of a recursive query

    However, you can specify DISTINCT within a nonrecursive seed statement in a recursive query.

Unexpected Row Length Errors With the DISTINCT Operator

Before performing the sort operation required to eliminate duplicates, the database creates a sort key and appends it to the rows to be sorted. If the length of this temporary data structure exceeds the system row length limit of 64 KB, the operation fails and the system returns an error. Depending on the situation, the error message text states that:
  • A data row is too long.
  • Maximum row length exceeded in database_object_name.

For explanations of these messages, see Teradata Vantage™ - Database Messages, B035-1096.

The BYNET only looks at the first 4,096 bytes of the sort key created to sort the specified fields, so if the column the sort key is based on is greater than 4,096 bytes, the key is truncated and rows in the set can either be identified as duplicates when they are not or identified as unique when they are duplicates.