15.00 - UNION Operator - Teradata Database

Teradata Database SQL Functions, Operators, Expressions, and Predicates

Product
Teradata Database
Release Number
15.00
Content Type
Programming Reference
Publication ID
B035-1145-015K
Language
English (United States)

UNION Operator

Purpose  

Combines two or more SELECT results tables into a single result.

Syntax  

where:

 

Syntax element …

Specifies …

query_expression_1

a complete SELECT statement to be unioned with query_expression_2.

For details, see “Syntax for query_expression” on page 1148.

ALL

that duplicate rows are to be retained for the UNION.

query_expression_2

a complete SELECT statement to be unioned with query_expression_1.

For details, see “Syntax for query_factor” on page 1147.

ANSI Compliance

This is ANSI SQL:2011 compliant.

Valid UNION Operations

Besides simple queries, UNION can be used within the following operations:

  • Derived tables
  • Note: You cannot use the HASH BY or LOCAL ORDER BY clauses in derived tables with set operators.

  • Subqueries
  • INSERT … SELECT statements
  • Non-recursive CREATE VIEW statements
  • UNION ALL is the only valid set operator in a WITH RECURSIVE clause or CREATE RECURSIVE VIEW statement that defines a recursive query.

    Unsupported Operations

    UNION cannot be used within the following:

  • SELECT AND CONSUME statements.
  • WITH RECURSIVE clause (unless the ALL option is also specified)
  • CREATE RECURSIVE VIEW statements (unless the ALL option is also specified)
  • Description of a UNION Operation

    Each query connected by UNION is performed to produce a result consisting of a set of rows. The union must include the same number of columns from each table in each SELECT statement (more formally, they must be of the same degree), and the data types of these columns should be compatible. All the result sets are then combined into a single result set that has the data type of the columns specified in the first SELECT statement in the union. For an example, see “Example 6: Effect of the Order of SELECT Statements on Data Type” on page 1174.

    UNION and NULL

    When you specify an explicit NULL for any union operation, its data type is INTEGER. For an example, see “Example 5: Effect of Explicit NULLs on Data Type of a UNION” on page 1173.

    On the other hand, column data defined as NULL has neither value nor data type and evaluates like any other null in a scalar expression.

    Duplicate Rows

    Unless the ALL option is used, duplicate rows are eliminated from each result set and from the final result.

    If the ALL option is used, duplicate rows are retained for the applicable result set.

    You can specify the ALL option for each UNION operator in the query to retain every occurrence of duplicate rows in the final result.

    Unexpected Row Length Errors: Sorting Rows for UNION

    Before performing the sort operation used to check for duplicates in some union operations, Teradata Database creates a sort key and appends it to the rows to be sorted. If the length of this temporary data structure exceeds the system limit of 64 KB, the operation fails and returns an error to the requestor. Depending on the situation, the message text is one of the following:.

  • A data row is too long.
  • Maximum row length exceeded in database_object_name.
  • See Messages for explanations of these messages.

    Example  1

    To select the name, project, and the number of hours spent by employees assigned to project OE1‑0001, plus the names of employees not assigned to a project, the following query could be used:

       SELECT Name, Proj_Id, Hours
       FROM Employee,Charges 
       WHERE Employee.Empno = Charges.Empno 
       AND Proj_Id IN ('OE1‑0001')
       UNION
       SELECT Name, NULL (CHAR (8)), NULL (DECIMAL (4,2)) 
       FROM Employee 
       WHERE Empno NOT IN
       (SELECT Empno 
       FROM Charges);

    This query returns the following rows:

     

    Name

    Project Id

    Hours

    Aguilar J

    ?

    ?

    Brandle B

    ?

    ?

    Chin M

    ?

     

    Clements D

    ?

    ?

    Kemper R

     

     

    Marston A

    ?

    ?

    Phan A

    ?

    ?

    Regan R

    ?

    ?

    Russell S

    ?

    ?

    Smith T

     

     

    Watson L

     

     

    Inglis C

    0E1-0001

    30.0

    Inglis C

    0E1-001

    30.5

    Leidner P

    0E1-001

    10.5

    Leidner P

    0E1-001

    23.0

    Moffit H

    0E1-001

    12.0

    Moffit H

    0E1-001

    35.5

    In this example, null expressions are used in columns 2 and 3 of the second SELECT statement. The null expressions are used as place markers so that both SELECT statements in the query contain the same number of expressions.

    Example  2

    To determine the department number and names of all employees in departments 500 and 600, the UNION operator could be used as follows:

       SELECT DeptNo, Name 
       FROM Employee 
       WHERE DeptNo = 500
       UNION
       SELECT DeptNo, Name 
       FROM Employee 
       WHERE DeptNo = 600 ; 

    This query returns the following rows:

     

    DeptNo

    Name

    500

    Carter J

    500

    Inglis C

    500

    Marston A

    500

    Omura H

    500

    Reed C

    500

    Smith T

    500

    Watson L

    600

    Aguilar J

    600

    Kemper R

    600

    Newman P

    600

    Regan R

    The same results could have been returned with a simpler query, such as the following:

       SELECT Name, DeptNo 
       FROM Employee 
       WHERE (DeptNo = 500) 
       OR (DeptNo = 600); 

    The advantage to formulating the query using the UNION operator is that if the DeptNo column is the primary index for the Employee table, then using the UNION operator guarantees that the basic selects are prime key operations. There is no guarantee that a query using the OR operation will make use of the primary index.

    Example  3

    In addition, the UNION operator is useful if you must merge lists of values taken from two or more tables.

    For example, if departments 500 and 600 had their own Employee tables, the following query could be used to select data from two different tables and merge that data into a single list:

       SELECT Name, DeptNo 
       FROM Employee_dept_500 
       UNION 
       SELECT Name, DeptNo 
       FROM Employee_dept_600 ; 

    Example  4

    Suppose you want to know the number of man-hours charged by each employee who is working on a project. In addition, suppose you also wanted the result to include the names of employees who are not working on a project.

    To do this, you would have to perform a union operation as illustrated in the following example.

       SELECT Name, Proj_Id, Hours 
       FROM Employee, Charges 
       WHERE Employee.EmpNo = Charges.EmpNo
       UNION
       SELECT Name, Null (CHAR(8)), Null (DECIMAL(4,2)), 
       FROM Employee 
       WHERE EmpNo NOT IN 
       (SELECT EmpNo 
       FROM Charges
       )
       UNION
       SELECT Null (VARCHAR(12)), Proj_Id, Hours 
       FROM Charges 
       WHERE EmpNo NOT IN 
       (SELECT EmpNo 
       FROM Employee
       );

    The first portion of the statement joins the Employee table with the Charges table on the EmpNo column. The second portion accounts for the employees who might be listed in the Employee table, but not the Charges table. The third portion of the statement accounts for the employees who might be listed in the Charges table and not in the Employee table. This ensures that all the information asked for is included in the response.

    UNION Operator and the Outer Join

    “Example 4” on page 1172 does not illustrate an outer join. That operation returns all rows in the joined tables for which there is a match on the join condition and rows from the “left” join table, or the “right” join table, or both tables for which there is no match. Moreover, non-matching rows are extended with NULLs.

    It is possible, however, to achieve an outer join using inner joins and the UNION operator, though the union of any two inner joins is not the equivalent of an outer join.

    The following example shows how to achieve an outer join using two inner joins and the UNION operator. Notice how the second inner join uses NULLs.

       SELECT Offering.CourseNo, Offerings.Location, Enrollment.EmpNo
       FROM Offerings, Enrollment
       WHERE Offerings.CourseNo = Enrollment.CourseNo
       UNION
       SELECT Offerings.CourseNo, Offerings.Location, NULL
       FROM Offerings, Enrollment
       WHERE Offerings.CourseNo <> Enrollment.CourseNo;

    The above UNION operation returns results equivalent to the results of the left outer join example shown above.

     

    O.CourseNo

    O.Location

    E.EmpNo

    C100

    El Segundo

    235

    C100

    El Segundo

    668

    C200

    Dayton

    ?

    C400

    El Segundo

    ?

    Example  5: Effect of Explicit NULLs on Data Type of a UNION

    Set operator results evaluate to the data type of the columns defined in the first SELECT statement in the operation. When a column in the first SELECT is defined as an explicit NULL, the data type of the result is not intuitive.

    Consider the following two examples, which you might intuitively think would evaluate to the same result but do not.

    In the first, an explicit NULL is selected as a column value.

       SELECT 'p', NULL
       FROM TableVM
       UNION
       SELECT 'q', 145.87
       FROM TableVM;

    BTEQ returns the result as follows.

       'p'         Null
       ---  -----------
       p              ?
       q            145

    The expected value for the second row of the Null column probably differs from what you might expect—a decimal value of 145.87.

    What if the order of the two SELECTs in the union is reversed?

       SELECT 'q', 145.87
       FROM TableVM
       UNION
       SELECT 'p', NULL
       FROM TableVM;

    BTEQ returns the result as follows.

       'q'       145.87
       ---  -----------
       p              ?
       q         145.87

    The value for q is now reported as its true data type—DECIMAL—and without truncation. Why the difference?

    In the first union example, the explicit NULL is specified for the second column in the first SELECT statement. The second column in the second SELECT statement, though specified as a DECIMAL number, evaluates to an integer because in this context, NULL, though having no value, does have the data type INTEGER, and that type is retained for the result of the union.

    The second union example carries the data type for the value 145.87—DECIMAL—through to the result.

    You can confirm the unconverted data type for NULL and 145.87 by performing the following SELECT statement.

       SELECT TYPE(NULL), TYPE(145.87)

    BTEQ returns the result as follows.

       Type(Null)         Type(145.87)
       -----------------  ----------------------
       INTEGER            DECIMAL(5,2)

    Example  6: Effect of the Order of SELECT Statements on Data Type

    The result of any UNION is always expressed using the data type of the selected value of the first SELECT. This means that SELECT A UNION SELECT B does not always return the same result as SELECT B UNION SELECT A unless you explicitly convert the output data type to ensure the same result in either case.

    Consider the following complex unioned queries:

       SELECT MIN(X8.i1)
       FROM t8 X8
       LEFT JOIN t1 X1 ON X8.i1=X1.i1
       AND X8.i1 IN
       (SELECT COUNT(*) 
       FROM t8 X8
       LEFT JOIN t1 X1 ON X8.i1=X1.i1
       AND X8.i1 = ANY
       (SELECT COUNT(*)
       FROM t7 X7
       WHERE X7.i1 = ANY
       (SELECT AVG(X1.i1)
       FROM t1 X1)))
       UNION
       SELECT AVG(X4.i1)
       FROM t4 X4
       WHERE X4.i1 = ANY
       (SELECT (X8.i1)
       FROM t1 X1
       RIGHT JOIN t8 X8 ON X8.i1=X1.i1
       AND X8.i1 = IN
       (SELECT MAX(X8.i1)
       FROM t8 X8
       LEFT JOIN t1 X1 ON X8.i1=X1.i1
       AND
       (SELECT (X4.i1)
       FROM t6 X6
       RIGHT JOIN t4 X4 ON X6.i1=i1))));
       

    The result is the following report.

       Minimum(i1)
       -----------
                -2
                 0

    You might intuitively expect that reversing the order of the queries on either side of the UNION would produce the same result. Because the data types of the selected value of the first SELECT can differ, this is not always true, as the following query on the same database demonstrates.

       SELECT AVG(X4.i1)
       FROM t4 X4
       WHERE X4.i1 = ANY
       (SELECT (X8.i1)
       FROM t1 X1
       RIGHT JOIN t8 X8 ON X8.i1 = X1.i1
       AND X8.i1 = ANY
       (SELECT MAX(X8.i1)
       FROM t8 X8
       LEFT JOIN t1 X1 ON X8.i1 = X1.i1
       AND
       (SELECT (X4.i1)
       FROM t6 X6
       RIGHT JOIN t4 X4 ON X6.i1 = i
       )
       )
       )
       UNION
       SELECT MIN(X8.i1)
       FROM t8 X8
       LEFT JOIN t1 X1 ON X8.i1 = X1.i1
       AND X8.i1 IN
       (SELECT COUNT(*) 
       FROM t8 X8
       LEFT JOIN t1 X1 ON X8.i1 = X1.i1
       AND X8.i1 = ANY
       (SELECT COUNT(*)
       FROM t7 X7
       WHERE X7.i1 = ANY
       (SELECT AVG(X1.i1)
       FROM t1 X1
       )
       );

    The result is the following report.

       Average(i1)
       -----------
                -2
                 1

    The actual average is < 0.5. Why the difference when the order of SELECTs in the UNION is reversed? The following table explains the seemingly paradoxical results.

     

    WHEN the first SELECT specifies this function …

    The result data type is …

    AND the value returned as the result is …

    AVG

    REAL

    1

    MIN

    INTEGER

    truncated to 0