Definition of the Outer Join - Teradata Database

SQL Data Manipulation Language

Product

Teradata Database

Release Number

15.10

Language

English (United States)

Last Update

2018-10-06

dita:id

B035-1146

lifecycle

Product Category

Teradata® Database

The outer join is an extension of the inner join. The difference between outer joins and inner joins is the existence of an nonmatching row set in the outer join result.

Outer joins of two or more tables perform an inner join of those tables according to a specified join condition and also return rows from the left join table, the right join table, or both, that do not match the inner join condition, extending the results rows with nulls in the nonmatching fields.

The nonmatching rows are for those rows in the outer table that do not have a match in the inner table, while the matching rows in the outer join result are the rows that satisfy the join condition, which are exactly the same as those in the inner join result.

The outer join is loosely defined as the algebraic UNION ALL of its components. Note that because UNION ALL permits duplicates, it is not a relational operator in the strict sense of the term.

where:

Syntax element …	Specifies …
left_table	the table reference that appears to the left of the join type keywords.
right_table	the table reference that appears to the right of the join type keywords.
join_condition	the columns on which the join is made separated by the comparison operator that specifies the comparison type for the join. The join conditions of an ON clause define the rows in the left table that take part in the match to the right table. At least one join condition is required in the ON clause for each table in the outer join. You can include multiple join conditions in an ON clause by using the Boolean AND, OR, and NOT operators.

For example, consider the following SELECT statement:

     SELECT offerings.course_no, offerings.location, enrollment.emp_no

     FROM offerings

     LEFT OUTER JOIN enrollment

     ON offerings.course_no = employee.course_no;

where:

This variable …	Corresponds to this syntax element …
offerings	left_table
enrollment	right_table
offerings.course_no=employee.course_no	join_condition

You cannot specify a SAMPLE clause in a subquery used as an ON clause predicate.

Assume you have the data presented in tables t1 and t2:

t1		t2
FK		PK
x1	y1	x2	y2
1	1	1	1
2	2	3	3
3	3	4	4
4	4	5	5

An inner join of these tables is specified by the following query:

     SELECT x1, y1, x2, y2

     FROM t1, t2

     WHERE x1=x2;

x1 --	y1 --	x2 --	y2 --
1	1	1	1
3	3	3	3
4	4	4	4

An left outer join of these tables is specified by the following query:

     SELECT x1, y1, x2, y2

     FROM t1 LEFT OUTER JOIN t2 ON x1=x2;

The result of this query is as follows:

x1 --	y1 --	x2 --	y2 --
1	1	1	1
2	2	?	?
3	3	3	3
4	4	4	4

In the original table definitions, x2 is the primary key of t2. See “Definition of the Outer Join” on page 236. Assume the following referential integrity constraint is defined on t1:

     FOREIGN KEY (x1) REFERENCES t2 (x2);

This constraint requires each of the x1 (foreign key) values in t1 to also exist as the primary key for one, and only one, row in table t2 as column x2. Therefore the second row in t1 cannot be valid, which makes the nonmatching row in the outer join result disappear from the outcome. See “Simple Outer Join Example” on page 237. The outer join result converts into the inner join result on the same tables because the referential integrity constraint guarantees there can be no unmatched rows in the join result.

Mathematically, the foreign key‑primary key relationship implies that the values in the foreign key column set form a proper subset of the set of values in the primary key column set, expressed mathematically as S_FK⊂ S_PK, where S_FK represents the foreign key value set and S_PK represents the primary key value set.

Therefore, an equijoin between the parent table primary key column set and a child table foreign key column set always leaves the unmatched row set empty so that an outer join is equivalent to the corresponding inner join. This property is extremely valuable for query optimization using join indexes. For more information, see SQL Data Definition Language and Database Design.

Refer to the following table abstractions and Venn diagram.

This diagram is identical to one “Components of an Inner Join” on page 230; however, the way the components of this diagram are combined is very different.

Define table_A as the row set contained by sections 1 and 2. Refer to table_A as left_table.

Define table_B as the row set contained by sections 1 and 3. Refer to table_B as right_table.

Define the components of a 2-table outer join as follows:

Component	Define as …
1	the inner join of the 2 tables as defined by an ON clause with all join conditions applied. These rows also provide the matching rows of an outer join.
2	all rows from left_table not included in section 1 extended with nulls for each nonmatching column from left_table. These are the unmatched rows in a left outer join.
3	all rows from right_table not included in section 1 extended with nulls for each nonmatching column selected from right_table. These are the unmatched rows in a right outer join.

The sections to be unioned are defined by the type of outer join:

This outer join type …	Is defined as …
LEFT	Section 1 `UNION ALL` Section 2
RIGHT	Section 1 `UNION ALL` Section 3
FULL	Section 1 `UNION ALL` Section 2 `UNION ALL` Section 3

For each type of outer join, you can think of the operation as assembling the proper components using the UNION ALL operator. Note that because UNION ALL permits duplicates, it is not a relational operator in the strict sense of the term.

The terms inner table and outer table, used frequently when writing about outer joins, are defined in the following table:

Term

Definition

Inner table

The inner table is the table that contributes only matched rows to the outer join result.

For an inner join, both tables are inner tables.

Outer table

The outer table is the table that contributes unmatched rows to the outer join result.

In this description, the word unmatched refers to the rows in the left or right (or both) table that are not part of the inner join rows because they have no matching columns, so they are extended with nulls in the results table.

This terminology also applies to the result of nested joins and spools.

Consider the following nested outer join:

     (table_A

     LEFT OUTER JOIN

     (table_B

     RIGHT OUTER JOIN table_C ON join_condition

     ON join_condition

     FULL OUTER JOIN table_D ON join_condition

What are the inner and outer tables of this outer join?

Beginning with the most deeply nested join and working outward, these relationships apply:

1 table_c is an outer table with respect to table_b

2 table_a is an outer table with respect to the nested join (table_b RIGHT OUTER JOIN table_c ON join_condition)

3 (table_a LEFT OUTER JOIN (table_b RIGHT OUTER JOIN table_c ON join_condition) ON join_condition is an outer table for the full outer join

4 table_d is an outer table for the full outer join

When you construct an outer join, choose the projected column list carefully to ensure the results are useful and interpretable. In general, you should project the column from the same side as the outer join. “Practical Example of a Right Outer Join” on page 250 right outer joins on CourseNo. Therefore, project out the right join column, which is Enrollment.CourseNo.

Similarly, in the full outer join in “Practical Example of a Full Outer Join” on page 252, the example projects both CourseNo columns for this same reason.

Nulls are a fundamental component of the report produced by a query that contains an outer join. The key feature of the outer join is that in returning rows from the outer table set, the report extends the rows that have no matching values with nulls, as if these unknown “values” came from the corresponding table.

For example, if you want to list courses offered by customer education for which employees have registered and also to include in that list those courses for which no one signed up.

The practical outer join examples in “Left Outer Join” on page 247, “Right Outer Join” on page 249, and “Full Outer Join” on page 251 use these three tables.

offerings shows customer education courses currently being offered and their location. With respect to courses being offered, you can think of offerings as a subset of courses.

offerings
course_no	beginning_dates	location
C100	01/05/2006	El Segundo
C200	07/02/2006	Dayton
C400	10/07/2006	El Segundo

enrollment shows employees who have registered for courses, some of which may not be offered.

enrollment
emp_no	course_no
236	C100
236	C300

courses lists all courses developed by customer education, some of which are not currently being offered (for example, C300).

courses
course_no	name
C100	Recovery Planning
C200	Software Architecture
C300	Teradata Basics
C400	Introduction to Java Programming

Note that the nulls reported for an outer table set do not represent missing information, but the empty set, which is not the same thing. For a description of the various uses of nulls in the SQL language, see Database Design.

The following SELECT statement is an example of a left outer join of two tables, table_a and table_b:

     SELECT column_expression

     FROM table_a

     LEFT OUTER JOIN table_b ON join_conditions

     WHERE join_condition_exclusions;

The left table in the above example means the table specified to the left of the keywords OUTER JOIN in the FROM clause. The right table means the table specified to the right of the keywords OUTER JOIN.

The keyword LEFT indicates the source of the nonmatching rows returned in the result of the left outer join.

In addition to performing an inner join of two or more tables according to a join condition, a left outer join, as in the example above, returns nonmatching rows from its left table (table_a) and extends them with nulls.

A right outer join returns nonmatching rows from its right table and extends them with nulls.

A full outer join returns nonmatching rows from both of the join tables and extends them with nulls.

The reserved word OUTER is optional so that the preceding SELECT statement could also be written as follows:

     SELECT ...

     FROM table_A

     LEFT JOIN table_B ON join_condition;

You can specify a scalar subquery as an operand of a scalar predicate in the ON clause of an outer join specification.

The following example specifies a scalar subquery (SELECT AVG(price)…) in its ON clause.

     SELECT category, title, COUNT(*)

     FROM movie_titles AS t2 LEFT OUTER JOIN transactions AS txn

     ON (SELECT AVG(price)

         FROM movie_titles AS t1

         WHERE t1.category = t2.category)<(SELECT AVG(price)

                                           FROM movie_titles)

         AND t2.title = txn.title;

The following rules apply to the use of the DEFAULT function as part of the search condition within an outer join ON clause:

The DEFAULT function takes a single argument that identifies a relation column by name. The function evaluates to a value equal to the current default value for the column. For cases where the default value of the column is specified as a current built‑in system function, the DEFAULT function evaluates to the current value of system variables at the time the request is executed.

The resulting data type of the DEFAULT function is the data type of the constant or built-in function specified as the default unless the default is NULL. If the default is NULL, the resulting date type of the DEFAULT function is the same as the data type of the column or expression for which the default is being requested.

The DEFAULT function has two forms. It can be specified as DEFAULT or DEFAULT (column_name). When no column name is specified, the system derives the column based on context. If the column context cannot be derived, an error is returned to the requestor.

You can specify a DEFAULT function with a column name argument within a predicate. The system evaluates the DEFAULT function to the default value of the column specified as its argument. Once the system has evaluated the DEFAULT function, it treats it like a constant in the predicate.

You can specify a DEFAULT function without a column name argument within a predicate only if there is one column specification and one DEFAULT function as the terms on each side of the comparison operator within the expression.

Following existing comparison rules, a condition with a DEFAULT function used with comparison operators other than IS NULL or IS NOT NULL is unknown if the DEFAULT function evaluates to null.

A condition other than IS NULL or IS NOT NULL with a DEFAULT function compared with a null evaluates to unknown.

IF a DEFAULT function is used with …	THEN the comparison is …
IS NULL	TRUE if the default is null Else it is FALSE
IS NOT NULL	FALSE if the default is null Else it is TRUE

For more information about the DEFAULT function, see SQL Functions, Operators, Expressions, and Predicates.

The examples assume the following table definitions:

     CREATE TABLE table15 (

       col1 INTEGER ,

       col2 INTEGER NOT NULL,

       col3 INTEGER NOT NULL DEFAULT NULL,

       col4 INTEGER CHECK (col4 > 10) DEFAULT 9 );

     CREATE TABLE table16 (

       col1 INTEGER,

       col2 INTEGER DEFAULT 10,

       col3 INTEGER DEFAULT 20,

       col4 CHARACTER(60)  );

You can specify a DEFAULT function as a component of the search condition for an outer join. The following examples demonstrate proper use of the DEFAULT function as part of an outer join search condition.

In the following example, the DEFAULT function evaluates to the default value of table16.col2, which is 10.

     SELECT *

     FROM table16

     FULL OUTER JOIN table15 ON table15.col1 < DEFAULT(table16.col2);

Therefore, the previous example is equivalent to the following example:

     SELECT *

     FROM table16

     FULL OUTER JOIN table15 ON table15.col1 < 10;