Rules and Restrictions for the ORDER BY Clause
The rules and restrictions for the use of ORDER BY are:
- Each column_name you specify in an ORDER BY clause must be the name of a column in a relation referenced in the SELECT expression list. The columns named do not have to match the columns in the SELECT expression list.
You can also order on a column_name_alias.
The maximum number of column names you can specify is 64.
You cannot specify BLOB or CLOB columns in the column name list because you cannot sort on LOB values.
- You cannot specify case sensitivity for a CLOB column.
For details on defining case sensitivity for a column, see SQL Data Types and Literals .
- The column position you specify in an ORDER BY clause must be an unsigned integer that refers to the sequential position of a column in the SELECT expression list. The column position cannot be passed as a parameter; attempts to do so cause it to be interpreted as an expression, and the data to be sorted by a constant.
You cannot specify BLOB or CLOB columns in the column position list because you cannot sort on LOB values.
- If an ORDER BY specification references a UDT column, then Teradata Database uses the defined ordering functionality for that UDT to order the result set.
- You can specify a scalar subquery as a column expression in the ORDER BY clause of a query.
- You can specify an ORDER BY clause before or after any WITH clause. Also see “ORDER BY Clause”.
When you specify both WITH and ORDER BY clauses in the same SELECT request, they function together as follows:
- The WITH clause defines the major sort key.
- The ORDER BY clause defines the minor sort key.
This is true regardless of the structure of the query or the number of WITH clauses. See Example: Combining WITH and ORDER BY Clauses.
- If you have multiple ON clauses, you cannot specify ORDER BY as the only option in an ON clause. You must combine it with a PARTITION BY (ANY), HASH BY, or DIMENSION clause.
- If you specify an ORDER BY or LOCAL ORDER BY clause with PARTITION BY or HASH BY input, then the following is required:
- All inputs must have the same number of ORDER BY columns.
- The data types of the columns must be the same type or matched using implicit cast.
Unexpected Results
This section describes some unexpected results that can occur when using the ORDER BY clause. They are:
- Unexpected row length errors.
- Unexpected sort order when querying DATE data type columns. See Unexpected Sort Order When Querying DATE Data Type Columns.
Unexpected Row Length Errors: Sorting and Default Sort Order
Before performing the sort operation that orders the rows to be returned, Teradata Database creates a sort key and appends it to the rows. If the length of the sort key exceeds the system row maximum length of 1 MB, the operation returns an error. Depending on the situation, the error message text is one of the following.
- A data row is too long.
- Maximum row length exceeded in database_object_name.
For explanations of these messages, see Messages .
There are several possible reasons why a data row plus BYNET sort key might unexpectedly exceed the spool row size limit of 1 MB, even without including any updates.
- The value for the MaxDecimal DBS Control field for your system might have been changed to a larger value, which could increase the number of bytes stored for each value in a DECIMAL column to as much as 38 bytes where it might previously have been only 1, 2, 4, or 8 bytes. See Database Design and SQL Data Types and Literals .
- Your site has upgraded from a system with a 32-bit aligned row architecture to a system with a 64-bit architecture. This upgrade expands the size of data, both on disk and in memory, for a number of data types. The expansion is due to byte alignment restrictions that require padding to ensure alignment on 8-byte boundaries. These restrictions do not exist for 32-bit architectures. Therefore, the same row can be within the spool row 1 MB size boundary on a 32-bit system, but exceed that limit on a 64-bit system.
Note that this issue is not relevant if your system uses the packed64 row architecture rather than the aligned row format architecture. For details, see Database Design .
By default, nulls sort before all other data in a column. In other words, nulls sort low. For example, suppose you want to sort the following set of values: 5, 3, null, 1, 4, and 2.
The values sort as follows:
null 1 2 3 4 5
By default, the result values of a character expression are sorted in ascending order, using the collating sequence in effect for the session.
Unexpected Sort Order When Querying DATE Data Type Columns
Using the ORDER BY clause to query columns with the DATE data type may yield results in which the order of the returned values seem to be out of order.
The factors that contribute to this type of unexpected result are:
- DATE time values are stored by the system on disk in UTC time, not local time.
- UTC time values are used to sort the column values, not the local time values.
- DATE time values are affected by the system time zone setting, which can be set using TimeZoneString or the DBSControl general field 16.
- DATE time values that cross the date boundary are adjusted based on the time zone offset for the system time zone (the time zone offset varies depending on the time zone).
For possible workarounds, see Workarounds for Unexpected Sort Order When Querying DATE.
Example of Unexpected Sort Order
This example shows a result set of TIME data type column values that do not seem to be sorted in the order specified in the ORDER BY clause. The local time values inserted into the column are followed by the values returned by the query.
The time zone in this example is Pacific Time, which has a +8 hour time zone offset.
CREATE TABLE table_1 (column_1 INTEGER, column_2 TIME); INSERT INTO table_1 VALUEs ( 0, TIME'00:00:00.000000'); INSERT INTO table_1 VALUEs ( 1, TIME'01:00:00.000000'); INSERT INTO table_1 VALUEs ( 2, TIME'02:00:00.000000'); INSERT INTO table_1 VALUEs ( 3, TIME'03:00:00.000000'); INSERT INTO table_1 VALUEs ( 4, TIME'04:00:00.000000'); INSERT INTO table_1 VALUEs ( 5, TIME'05:00:00.000000'); INSERT INTO table_1 VALUEs ( 6, TIME'06:00:00.000000'); INSERT INTO table_1 VALUEs ( 7, TIME'07:00:00.000000'); INSERT INTO table_1 VALUEs ( 8, TIME'08:00:00.000000'); INSERT INTO table_1 VALUEs ( 9, TIME'09:00:00.000000'); INSERT INTO table_1 VALUEs (10, TIME'10:00:00.000000'); INSERT INTO table_1 VALUEs (11, TIME'11:00:00.000000'); INSERT INTO table_1 VALUEs (12, TIME'12:00:00.000000'); INSERT INTO table_1 VALUEs (13, TIME'13:00:00.000000'); INSERT INTO table_1 VALUEs (14, TIME'14:00:00.000000'); INSERT INTO table_1 VALUEs (15, TIME'15:00:00.000000'); INSERT INTO table_1 VALUEs (16, TIME'16:00:00.000000'); INSERT INTO table_1 VALUEs (17, TIME'17:00:00.000000'); INSERT INTO table_1 VALUEs (18, TIME'18:00:00.000000'); INSERT INTO table_1 VALUEs (19, TIME'19:00:00.000000'); INSERT INTO table_1 VALUEs (20, TIME'20:00:00.000000'); INSERT INTO table_1 VALUEs (21, TIME'21:00:00.000000'); INSERT INTO table_1 VALUEs (22, TIME'22:00:00.000000'); INSERT INTO table_1 VALUEs (23, TIME'23:00:00.000000');
This SELECT statement is used to query the time values in the TIME data type column (column_2).
SELECT * FROM table_1 ORDER BY column_2; column_1 column_2 -------- --------------- 16 16:00:00.000000 17 17:00:00.000000 18 18:00:00.000000 19 19:00:00.000000 20 20:00:00.000000 21 21:00:00.000000 22 22:00:00.000000 23 23:00:00.000000 0 00:00:00.000000 1 01:00:00.000000 2 02:00:00.000000 3 03:00:00.000000 4 04:00:00.000000 5 05:00:00.000000 6 06:00:00.000000 7 07:00:00.000000 8 08:00:00.000000 9 09:00:00.000000 10 10:00:00.000000 11 11:00:00.000000 12 12:00:00.000000 13 13:00:00.000000 14 14:00:00.000000 15 15:00:00.000000
Because the sort order is based on UTC time (which is not visible in the column), the time values do not agree with the resulting sort order. Rows 16 through 23 appear at the top of the list because they belong at the top of the list in terms of UTC time.
The UTC equivalents for the returned time values are:
Returned Values UTC Equivalents (local time + 8 hours)
column_1 column_2 -------- ------------- 16 16:00:00.000000 00:00:00.000000 17 17:00:00.000000 01:00:00.000000 18 18:00:00.000000 02:00:00.000000 19 19:00:00.000000 03:00:00.000000 20 20:00:00.000000 04:00:00.000000 21 21:00:00.000000 05:00:00.000000 22 22:00:00.000000 06:00:00.000000 23 23:00:00.000000 07:00:00.000000 0 00:00:00.000000 08:00:00.000000 1 01:00:00.000000 09:00:00.000000 2 02:00:00.000000 10:00:00.000000 3 03:00:00.000000 11:00:00.000000 4 04:00:00.000000 12:00:00.000000 5 05:00:00.000000 13:00:00.000000 6 06:00:00.000000 14:00:00.000000 7 07:00:00.000000 15:00:00.000000 8 08:00:00.000000 16:00:00.000000 9 09:00:00.000000 17:00:00.000000 10 10:00:00.000000 18:00:00.000000 11 11:00:00.000000 19:00:00.000000 12 12:00:00.000000 20:00:00.000000 13 13:00:00.000000 21:00:00.000000 14 14:00:00.000000 22:00:00.000000 15 15:00:00.000000 23:00:00.000000
Workarounds for Unexpected Sort Order When Querying DATE
These workarounds can be used to get expected results:
- Use INTERVAL
- Use CAST to CHAR or VARCHAR
- Use TIMESTAMP instead of TIME
Use INTERVAL
SELECT column_1, column_2, column_2 - interval '8' hour FROM table_1 ORDER BY 3; column_1 column_2 (column_2- 8) -------- -------------- -------------- 0 00:00:00.000000 16:00:00.000000 1 01:00:00.000000 17:00:00.000000 2 02:00:00.000000 18:00:00.000000 3 03:00:00.000000 19:00:00.000000 4 04:00:00.000000 20:00:00.000000 5 05:00:00.000000 21:00:00.000000 6 06:00:00.000000 22:00:00.000000 7 07:00:00.000000 23:00:00.000000 8 08:00:00.000000 00:00:00.000000 9 09:00:00.000000 01:00:00.000000 10 10:00:00.000000 02:00:00.000000 11 11:00:00.000000 03:00:00.000000 12 12:00:00.000000 04:00:00.000000 13 13:00:00.000000 05:00:00.000000 14 14:00:00.000000 06:00:00.000000 15 15:00:00.000000 07:00:00.000000 16 16:00:00.000000 08:00:00.000000 17 17:00:00.000000 09:00:00.000000 18 18:00:00.000000 10:00:00.000000 19 19:00:00.000000 11:00:00.000000 20 20:00:00.000000 12:00:00.000000 21 21:00:00.000000 13:00:00.000000 22 22:00:00.000000 14:00:00.000000 23 23:00:00.000000 15:00:00.000000
Use CAST to CHAR or VARCHAR
SELECT column_1, column_2, CAST(column_2 AS CHAR(15)) FROM table_1 ORDER BY 3; column_1 column_2 column_2 -------- ------------ --------------- 0 00:00:00.000000 00:00:00.000000 1 01:00:00.000000 01:00:00.000000 2 02:00:00.000000 02:00:00.000000 3 03:00:00.000000 03:00:00.000000 4 04:00:00.000000 04:00:00.000000 5 05:00:00.000000 05:00:00.000000 6 06:00:00.000000 06:00:00.000000 7 07:00:00.000000 07:00:00.000000 8 08:00:00.000000 08:00:00.000000 9 09:00:00.000000 09:00:00.000000 10 10:00:00.000000 10:00:00.000000 11 11:00:00.000000 11:00:00.000000 12 12:00:00.000000 12:00:00.000000 13 13:00:00.000000 13:00:00.000000 14 14:00:00.000000 14:00:00.000000 15 15:00:00.000000 15:00:00.000000 16 16:00:00.000000 16:00:00.000000 17 17:00:00.000000 17:00:00.000000 18 18:00:00.000000 18:00:00.000000 19 19:00:00.000000 19:00:00.000000 20 20:00:00.000000 20:00:00.000000 21 21:00:00.000000 21:00:00.000000 22 22:00:00.000000 22:00:00.000000 23 23:00:00.000000 23:00:00.000000
Use TIMESTAMP instead of TIME
CREATE TABLE table_2 (column_1 INTEGER, column_2 TIMESTAMP); INSERT INTO table_2 SELECT * FROM table_1; SELECT * FROM table_2 ORDER BY column_2; column_1 column_2 -------- -------------------------- 0 2011-11-07 00:00:00.000000 1 2011-11-07 01:00:00.000000 2 2011-11-07 02:00:00.000000 3 2011-11-07 03:00:00.000000 4 2011-11-07 04:00:00.000000 5 2011-11-07 05:00:00.000000 6 2011-11-07 06:00:00.000000 7 2011-11-07 07:00:00.000000 8 2011-11-07 08:00:00.000000 9 2011-11-07 09:00:00.000000 10 2011-11-07 10:00:00.000000 11 2011-11-07 11:00:00.000000 12 2011-11-07 12:00:00.000000 13 2011-11-07 13:00:00.000000 14 2011-11-07 14:00:00.000000 15 2011-11-07 15:00:00.000000 16 2011-11-07 16:00:00.000000 17 2011-11-07 17:00:00.000000 18 2011-11-07 18:00:00.000000 19 2011-11-07 19:00:00.000000 20 2011-11-07 20:00:00.000000 21 2011-11-07 21:00:00.000000 22 2011-11-07 22:00:00.000000 23 2011-11-07 23:00:00.000000
Specifying Collation
Different languages use different collation sequences, and you might need to alter your system collation to ensure that the ORDER BY clauses you specify in your queries sort correctly.
You can specify the default collation for a user using the CREATE USER or MODIFY USER statement, or you can specify collation for the duration of a session using the SET SESSION COLLATION statement. Otherwise, collation for a session is determined by the logon client system. The direction of the sort can be controlled by including the DESC (descending) or ASC (ascending) sort option in the SQL request.
Teradata Database supports ASCII, EBCDIC, and MULTINATIONAL collating sequences. If MULTINATIONAL is in effect, your collation will be one of the European (diacritical) or Kanji sort sequences described in International Sort Orders.
The following topics explain the results of ORDER BY as affected by whether character string expressions have the CASESPECIFIC or NOTCASESPECIFIC attribute.
Japanese Character Sort Order Considerations
If character strings are to be sorted NOT CASESPECIFIC, which is the default in Teradata session mode, only lowercase simple letters (for example, the characters a … z in the Latin alphabet) are converted to uppercase before a comparison or sorting operation is done.
The important thing to understood is that case differences are ignored in NOT CASESPECIFIC collations. In other words, the case of the characters is not critical for NOT CASESPECIFIC collation.
Any non-Latin single-byte character, any multibyte character, and any byte indicating a transition between single-byte characters and multibyte characters is excluded from this function.
If the character strings are to be sorted CASESPECIFIC, which is the default in ANSI session mode, then the case of the characters is critical for collation.
For Kanji1 character data in CASESPECIFIC mode, the letters in any alphabet that uses casing, such as Latin, Greek, or Cyrillic, are considered to be matched only if they are the identical letters and have the same case.
The system does not consider FULLWIDTH and HALFWIDTH characters to be matched whether in CASESPECIFIC or NOT CASESPECIFIC mode.
Teradata Database uses one of four server character sets to support Japanese characters:
- Unicode
- KanjiSJIS
- Kanji1
- Graphic
If you are at a Japanese character site and you want to ensure that sorting of character data is as close as possible to that of the client, you should set your session collation (see “SET SESSION COLLATION” in SQL Data Definition Language ) as follows:
- For character data stored on the Teradata platform as KanjiSJIS or Unicode, the best way to order the session character set is to use the CHARSET_COLL collation.
For character data stored on the Teradata platform as either KanjiSJIS or Unicode, the CHARSET_COLL collation provides the collation that is closest to sorting on the client.
- The JIS_COLL collation also provides an adequate collation, and has the added virtue of providing the same collation no matter what the session character set is.
- The CHARSET_COLL and JIS_COLL collation sequences are not designed to support Kanji1 character data.
For Kanji1 character data, the ASCII collation provides a collation similar to that of the client, assuming the session character set is KanjiSJIS_0S, KanjiEUC_0U, or something very similar. However, ASCII collation does not sort like the client if the data is stored on the Teradata platform using the ShiftJIS or Unicode server character sets.
- Users under the KATAKANAEBCDIC, KANJIEBCDIC5026_0I, or KANJIEBCDIC5035_0I character sets who want to store their character data on the Teradata platform as Kanji1, and who want to collate in the session character set should install either KATAKANAEBCDIC, KANJIEBCDIC5026_0I, or KANJIEBCDIC5035_0I, respectively, at start-up time, and use MULTINATIONAL collation.
Note: Each character set requires a different definition for its MULTINATIONAL collation to collate properly.
Teradata Database handles character data collation for the different Japanese server character sets as follows:
- Under the KanjiEUC character set, the ss3 0x8F is converted to 0xFF. This means that a user-defined KanjiEUC codeset 3 are not ordered properly with respect to other KanjiEUC code sets. The order of other KanjiEUC code sets is proper (that is, ordering is the same as the binary ordering on the client system).
ASCII collation collates Kanji1 data in binary order, but handles case matching of single-byte or HALFWIDTH Latin characters (see International Character Set Support for details). This matches collation on the client for KanjiSJIS_0S sessions, is close for KanjiEUC_0U, and is only reasonably close for double-byte data for KanjiEBCDIC sessions. Note that KanjiEUC_0U puts code set 3 after code set 1 rather than before code set 1 as KanjiEBCDIC does.
- For Kanji1 data, characters identified as multibyte characters remain in the client encoding and are collated based on their binary values. This explains why ASCII collation works for double-byte characters in KanjiEBCDIC sessions.
Note that it is not true that multibyte Kanji1 characters remain in the client encoding for KanjiEUC_0U sessions.
For details, see International Character Set Support and SQL Data Types and Literals.
International Sort Orders
MULTINATIONAL collation is enabled at the user level via the COLLATION option of the CREATE USER or MODIFY USER statement. If a collation is not specified, the default is HOST; that is, the standard sort ordering of the client system (EBCDIC for IBM clients, ASCII for all others).
You can override the default at the session level by issuing the SET SESSION COLLATION statement.
When MULTINATIONAL collation is in effect, the default collation sequence is determined by the collation setting installed at system start-up. Also see International Character Set Support .
Each international sort sequence is defined by program constants and no check is made to ensure that collation is compatible with the character set of the current session. The choice of sequence is controlled by your database administrator. The programmed character sequences cannot be changed.
European Sort Order
The sort order uses a two-level comparison that involves the following rules.
- All lowercase letters are first mapped to their uppercase equivalents unless CASESPECIFIC is specified in the ORDER BY clause or was defined for the column being accessed.
In ANSI session mode, the default is CASESPECIFIC and you must explicitly specify NOT CASESPECIFIC to change this.
- All diacritical forms of the same letter are given the value of the base letter; that is, Ä is given the value of A, Ö is given the value of O, and so forth.
- If two strings produce the same value, the characters are further ordered according to their sort position in their diacritical equivalence class (see the table below).
- Unless the query specifies the DESC sort option, collation is in ascending order.
For details, see SQL Data Types and Literals .
When these rules are applied, the words “abbey,” “Active,” and “adage” are returned in this order,
abbey Active adage
and the names Muller, Handl, Böckh, Mueller, Händl, Bohr, Bock, and Müller are ordered as:
Bock Böckh Bohr Handl Händl Mueller Muller Müller
Equivalence classes and the ordering of diacritical characters in each class are shown in this table. The listed classes are those with characters that have diacritical forms.
European Sort Order | |||||
---|---|---|---|---|---|
A | C | E | I | N | O |
a A à À á Á â Â ã Ã ä Ä |
c C ç Ç |
e E è È é É ê Ê ë Ë |
i I í Ì ì Í î Î ï Ï |
n N ñ Ñ |
o O ò Ò ó Ó ô Ô õ Õ ö Ö o O |
S | U | Y | AE | O slash | A ring |
s S β |
u U ù Ù ú Ú û Û (U tilde) ü Ü |
y Y ÿ Ÿ |
æ Æ | ø Ø | å Å |