UNION Operator
Purpose
Combines two or more SELECT results tables into a single result.
Syntax
where:
Syntax element … |
Specifies … |
query_expression_1 |
a complete SELECT statement to be unioned with query_expression_2. For details, see “Syntax for query_expression” on page 1148. |
ALL |
that duplicate rows are to be retained for the UNION. |
query_expression_2 |
a complete SELECT statement to be unioned with query_expression_1. For details, see “Syntax for query_factor” on page 1147. |
ANSI Compliance
This is ANSI SQL:2011 compliant.
Valid UNION Operations
Besides simple queries, UNION can be used within the following operations:
Note: You cannot use the HASH BY or LOCAL ORDER BY clauses in derived tables with set operators.
UNION ALL is the only valid set operator in a WITH RECURSIVE clause or CREATE RECURSIVE VIEW statement that defines a recursive query.
Unsupported Operations
UNION cannot be used within the following:
Description of a UNION Operation
Each query connected by UNION is performed to produce a result consisting of a set of rows. The union must include the same number of columns from each table in each SELECT statement (more formally, they must be of the same degree), and the data types of these columns should be compatible. All the result sets are then combined into a single result set that has the data type of the columns specified in the first SELECT statement in the union. For an example, see “Example 6: Effect of the Order of SELECT Statements on Data Type” on page 1174.
UNION and NULL
When you specify an explicit NULL for any union operation, its data type is INTEGER. For an example, see “Example 5: Effect of Explicit NULLs on Data Type of a UNION” on page 1173.
On the other hand, column data defined as NULL has neither value nor data type and evaluates like any other null in a scalar expression.
Duplicate Rows
Unless the ALL option is used, duplicate rows are eliminated from each result set and from the final result.
If the ALL option is used, duplicate rows are retained for the applicable result set.
You can specify the ALL option for each UNION operator in the query to retain every occurrence of duplicate rows in the final result.
Unexpected Row Length Errors: Sorting Rows for UNION
Before performing the sort operation used to check for duplicates in some union operations, Teradata Database creates a sort key and appends it to the rows to be sorted. If the length of this temporary data structure exceeds the system limit of 64 KB, the operation fails and returns an error to the requestor. Depending on the situation, the message text is one of the following:.
See Messages for explanations of these messages.
Example 1
To select the name, project, and the number of hours spent by employees assigned to project OE1‑0001, plus the names of employees not assigned to a project, the following query could be used:
SELECT Name, Proj_Id, Hours
FROM Employee,Charges
WHERE Employee.Empno = Charges.Empno
AND Proj_Id IN ('OE1‑0001')
UNION
SELECT Name, NULL (CHAR (8)), NULL (DECIMAL (4,2))
FROM Employee
WHERE Empno NOT IN
(SELECT Empno
FROM Charges);
This query returns the following rows:
Name |
Project Id |
Hours |
Aguilar J |
? |
? |
Brandle B |
? |
? |
Chin M |
? |
|
Clements D |
? |
? |
Kemper R |
|
|
Marston A |
? |
? |
Phan A |
? |
? |
Regan R |
? |
? |
Russell S |
? |
? |
Smith T |
|
|
Watson L |
|
|
Inglis C |
0E1-0001 |
30.0 |
Inglis C |
0E1-001 |
30.5 |
Leidner P |
0E1-001 |
10.5 |
Leidner P |
0E1-001 |
23.0 |
Moffit H |
0E1-001 |
12.0 |
Moffit H |
0E1-001 |
35.5 |
In this example, null expressions are used in columns 2 and 3 of the second SELECT statement. The null expressions are used as place markers so that both SELECT statements in the query contain the same number of expressions.
Example 2
To determine the department number and names of all employees in departments 500 and 600, the UNION operator could be used as follows:
SELECT DeptNo, Name
FROM Employee
WHERE DeptNo = 500
UNION
SELECT DeptNo, Name
FROM Employee
WHERE DeptNo = 600 ;
This query returns the following rows:
DeptNo |
Name |
500 |
Carter J |
500 |
Inglis C |
500 |
Marston A |
500 |
Omura H |
500 |
Reed C |
500 |
Smith T |
500 |
Watson L |
600 |
Aguilar J |
600 |
Kemper R |
600 |
Newman P |
600 |
Regan R |
The same results could have been returned with a simpler query, such as the following:
SELECT Name, DeptNo
FROM Employee
WHERE (DeptNo = 500)
OR (DeptNo = 600);
The advantage to formulating the query using the UNION operator is that if the DeptNo column is the primary index for the Employee table, then using the UNION operator guarantees that the basic selects are prime key operations. There is no guarantee that a query using the OR operation will make use of the primary index.
Example 3
In addition, the UNION operator is useful if you must merge lists of values taken from two or more tables.
For example, if departments 500 and 600 had their own Employee tables, the following query could be used to select data from two different tables and merge that data into a single list:
SELECT Name, DeptNo
FROM Employee_dept_500
UNION
SELECT Name, DeptNo
FROM Employee_dept_600 ;
Example 4
Suppose you want to know the number of man-hours charged by each employee who is working on a project. In addition, suppose you also wanted the result to include the names of employees who are not working on a project.
To do this, you would have to perform a union operation as illustrated in the following example.
SELECT Name, Proj_Id, Hours
FROM Employee, Charges
WHERE Employee.EmpNo = Charges.EmpNo
UNION
SELECT Name, Null (CHAR(8)), Null (DECIMAL(4,2)),
FROM Employee
WHERE EmpNo NOT IN
(SELECT EmpNo
FROM Charges
)
UNION
SELECT Null (VARCHAR(12)), Proj_Id, Hours
FROM Charges
WHERE EmpNo NOT IN
(SELECT EmpNo
FROM Employee
);
The first portion of the statement joins the Employee table with the Charges table on the EmpNo column. The second portion accounts for the employees who might be listed in the Employee table, but not the Charges table. The third portion of the statement accounts for the employees who might be listed in the Charges table and not in the Employee table. This ensures that all the information asked for is included in the response.
UNION Operator and the Outer Join
“Example 4” on page 1172 does not illustrate an outer join. That operation returns all rows in the joined tables for which there is a match on the join condition and rows from the “left” join table, or the “right” join table, or both tables for which there is no match. Moreover, non-matching rows are extended with NULLs.
It is possible, however, to achieve an outer join using inner joins and the UNION operator, though the union of any two inner joins is not the equivalent of an outer join.
The following example shows how to achieve an outer join using two inner joins and the UNION operator. Notice how the second inner join uses NULLs.
SELECT Offering.CourseNo, Offerings.Location, Enrollment.EmpNo
FROM Offerings, Enrollment
WHERE Offerings.CourseNo = Enrollment.CourseNo
UNION
SELECT Offerings.CourseNo, Offerings.Location, NULL
FROM Offerings, Enrollment
WHERE Offerings.CourseNo <> Enrollment.CourseNo;
The above UNION operation returns results equivalent to the results of the left outer join example shown above.
O.CourseNo |
O.Location |
E.EmpNo |
C100 |
El Segundo |
235 |
C100 |
El Segundo |
668 |
C200 |
Dayton |
? |
C400 |
El Segundo |
? |
Example 5: Effect of Explicit NULLs on Data Type of a UNION
Set operator results evaluate to the data type of the columns defined in the first SELECT statement in the operation. When a column in the first SELECT is defined as an explicit NULL, the data type of the result is not intuitive.
Consider the following two examples, which you might intuitively think would evaluate to the same result but do not.
In the first, an explicit NULL is selected as a column value.
SELECT 'p', NULL
FROM TableVM
UNION
SELECT 'q', 145.87
FROM TableVM;
BTEQ returns the result as follows.
'p' Null
--- -----------
p ?
q 145
The expected value for the second row of the Null column probably differs from what you might expect—a decimal value of 145.87.
What if the order of the two SELECTs in the union is reversed?
SELECT 'q', 145.87
FROM TableVM
UNION
SELECT 'p', NULL
FROM TableVM;
BTEQ returns the result as follows.
'q' 145.87
--- -----------
p ?
q 145.87
The value for q is now reported as its true data type—DECIMAL—and without truncation. Why the difference?
In the first union example, the explicit NULL is specified for the second column in the first SELECT statement. The second column in the second SELECT statement, though specified as a DECIMAL number, evaluates to an integer because in this context, NULL, though having no value, does have the data type INTEGER, and that type is retained for the result of the union.
The second union example carries the data type for the value 145.87—DECIMAL—through to the result.
You can confirm the unconverted data type for NULL and 145.87 by performing the following SELECT statement.
SELECT TYPE(NULL), TYPE(145.87)
BTEQ returns the result as follows.
Type(Null) Type(145.87)
----------------- ----------------------
INTEGER DECIMAL(5,2)
Example 6: Effect of the Order of SELECT Statements on Data Type
The result of any UNION is always expressed using the data type of the selected value of the first SELECT. This means that SELECT A UNION SELECT B does not always return the same result as SELECT B UNION SELECT A unless you explicitly convert the output data type to ensure the same result in either case.
Consider the following complex unioned queries:
SELECT MIN(X8.i1)
FROM t8 X8
LEFT JOIN t1 X1 ON X8.i1=X1.i1
AND X8.i1 IN
(SELECT COUNT(*)
FROM t8 X8
LEFT JOIN t1 X1 ON X8.i1=X1.i1
AND X8.i1 = ANY
(SELECT COUNT(*)
FROM t7 X7
WHERE X7.i1 = ANY
(SELECT AVG(X1.i1)
FROM t1 X1)))
UNION
SELECT AVG(X4.i1)
FROM t4 X4
WHERE X4.i1 = ANY
(SELECT (X8.i1)
FROM t1 X1
RIGHT JOIN t8 X8 ON X8.i1=X1.i1
AND X8.i1 = IN
(SELECT MAX(X8.i1)
FROM t8 X8
LEFT JOIN t1 X1 ON X8.i1=X1.i1
AND
(SELECT (X4.i1)
FROM t6 X6
RIGHT JOIN t4 X4 ON X6.i1=i1))));
The result is the following report.
Minimum(i1)
-----------
-2
0
You might intuitively expect that reversing the order of the queries on either side of the UNION would produce the same result. Because the data types of the selected value of the first SELECT can differ, this is not always true, as the following query on the same database demonstrates.
SELECT AVG(X4.i1)
FROM t4 X4
WHERE X4.i1 = ANY
(SELECT (X8.i1)
FROM t1 X1
RIGHT JOIN t8 X8 ON X8.i1 = X1.i1
AND X8.i1 = ANY
(SELECT MAX(X8.i1)
FROM t8 X8
LEFT JOIN t1 X1 ON X8.i1 = X1.i1
AND
(SELECT (X4.i1)
FROM t6 X6
RIGHT JOIN t4 X4 ON X6.i1 = i
)
)
)
UNION
SELECT MIN(X8.i1)
FROM t8 X8
LEFT JOIN t1 X1 ON X8.i1 = X1.i1
AND X8.i1 IN
(SELECT COUNT(*)
FROM t8 X8
LEFT JOIN t1 X1 ON X8.i1 = X1.i1
AND X8.i1 = ANY
(SELECT COUNT(*)
FROM t7 X7
WHERE X7.i1 = ANY
(SELECT AVG(X1.i1)
FROM t1 X1
)
);
The result is the following report.
Average(i1)
-----------
-2
1
The actual average is < 0.5. Why the difference when the order of SELECTs in the UNION is reversed? The following table explains the seemingly paradoxical results.
WHEN the first SELECT specifies this function … |
The result data type is … |
AND the value returned as the result is … |
AVG |
REAL |
1 |
MIN |
INTEGER |
truncated to 0 |