Properties of Relations and Their Logical Manipulation | Teradata Vantage - Properties of Relations and Their Logical Manipulation - Advanced SQL Engine - Teradata Database

Database Design

Product
Advanced SQL Engine
Teradata Database
Release Number
17.10
Published
July 2021
Language
English (United States)
Last Update
2021-07-27
dita:mapPath
kko1591750222108.ditamap
dita:ditavalPath
kko1591750222108.ditaval
dita:id
B035-1094
lifecycle
previous
Product Category
Teradata Vantage™

Definitions

Term Definition
Alternate key Any candidate key not selected to be the primary key for a relation.
Attribute A property of a relation that describes its primary key. Each attribute has a unique name, is drawn from a domain, and can be constrained in various ways.

Because its values are all drawn from the same domain, the data for any attribute is homogeneous by definition.

Attribute is the term used in set theory and logical design. Column is the equivalent term used in physical design and database management.

Body The body of a relation is the composite value set assigned to its tuple variables.

Each SQL relation must have a body (see Types of Missing Values about "Table_Dee" and "Table_Dum" for why this might not be an optimal situation).

Candidate key An attribute set that uniquely identifies a tuple.
A candidate key has the following minimal properties:
  • The value of the key uniquely identifies the tuple in which it appears.
  • Attributes defining the candidate key cannot be redundant. If an attribute is removed from the candidate key, then its uniqueness must be destroyed else it is not a properly defined candidate key. Compare with Superkey.

In the completed physical design of a relational database, candidate keys are easy to identify because they are always constrained as UNIQUE NOT NULL.

Composite key A key defined on more than one attribute. Compare with Simple key.
Domain The set of all possible values that can be specified for a given attribute.

The physical representation of a domain is a data type, and the ideal representation of a domain is a distinct user-defined data type (see Teradata Vantage™ - SQL Data Definition Language Syntax and Examples, B035-1144 and Teradata Vantage™ - SQL External Routine Programming, B035-1147 for more information about UDTs).

Field The intersection of a tuple and an attribute.

Columns in a table are often referred to as fields, but strictly speaking, that is incorrect.

Foreign key An attribute set based on an identical attribute set that acts as a candidate key (typically the primary key) for a different relation.

Foreign keys reflect relationships between tables and are often used as join columns.

Heading Each attribute of a relation must have a heading. Each such attribute has two required parts: a name and a domain, or data type. It is common practice not to call out the domain of an attribute unless it is germane to the problem at hand, but that does not render the typing of each attribute in a relation any less necessary.
Instance A tuple drawn from the complete set of tuples for a relation. The term is sometimes used to describe any selected set of tuples from a relation.
Intelligent key An overloaded simple key that encodes more than one fact (a combination of identification as well as characterization facts.), which is a violation of 1NF. The principal implementation problem with intelligent keys is that if any of the components of the key change, then all applications that access the key are affected. This is why you should always select unchanging attributes for your keys.

The classic example of an intelligent key is the International Standard Book Number (ISBN) used by publishers to identify individual books. Each ISBN is composed of a group identifier, a publisher identifier, a title identifier, and a check digit.

Key An attribute set that uniquely identifies each tuple in a relation. Implicitly synonymous with primary key, though it applies equally well to any candidate or foreign key. See Primary key.
Natural key The representation of a real world tuple identifier in a relational database. For example, a common identifier of employees in a corporation is a unique employee number. An employee is assigned an employee number whether that information is stored within the database or not.

Natural keys are sometimes confused with intelligent keys (see Intelligent key), but they are very different concepts.

Order independence The logical ordering of tuples and attributes has no meaning. In other words:
  • Tuples have no essential up-down order.
  • Attributes have no essential right-left order.

When attributes are manifested physically as columns, their order does have significance for SQL.

Primary key The primary key for a relation is an attribute set that uniquely identifies each tuple in the relation. By the entity integrity rule, primary keys cannot contain nulls. See Rules for Primary Keys for an explanation of the entity integrity rule.

Every relation must have one and only one primary key. More current thinking argues that every relation must have a candidate key, but that it need not necessarily be declared as the primary key for the relation.

Note that you cannot use XML, BLOB or CLOB columns to define a physical key or other database constraint (see Designing for Database Integrity), nor can you use XML, BLOB, or CLOB columns to define the physical primary key for a global temporary trace table. See CREATE GLOBAL TEMPORARY TRACE TABLE in Teradata Vantage™ - SQL Data Definition Language Detailed Topics, B035-1184

The primary key is often used as the primary index for a table when a relation is manifested physically (see Indexes and Maps and Primary Index, Primary AMP Index, and NoPI Objects).

Relation A two-dimensional representation of data in tabular form. Note that database relations are n-dimensional, not 2-dimensional as is commonly asserted.

Relation is the term used in set theory and logical design. Table is the analogous term used in physical design and database management.

Relational schema A set of relations in a logical relational model. In the physical model, a relational schema is manifested as a database.
Repeating group A collection of logically related attributes that occur more than once in a tuple.
Simple key A key defined on a single attribute. Compare with Composite key.
Superkey Any set of attributes that uniquely identifies a tuple, whether redundantly or not. The allowance of redundant attributes within a super key distinguishes it from a simple candidate key (see Candidate key).
Surrogate key An artificial simple key used to identify individual entities when there is no natural key or when the situation demands a noncomposite key, but no natural noncomposite key exists.

Surrogate keys do not identify individual entities in a meaningful way: they are simply an arbitrary method to distinguish among them. You should only resort to surrogate keys if there is no other way to uniquely identify the rows of a table.

Surrogate keys are typically arbitrary system-generated sequential integers. See CREATE TABLE in Teradata Vantage™ - SQL Data Definition Language Detailed Topics, B035-1184 for information about how to generate surrogate keys in Teradata.

Tuple A unique instance of a relation consisting of, at minimum, a primary key and zero or more attributes that describe the primary key.

Tuple is the term used in set theory and logical design. Row is the equivalent term used in physical design and database management. Nonrelational systems use the term record in the same way relational systems use row.

Uniqueness Duplicate rows are not permitted.

When relations are manifested physically as tables in SQL databases, duplicate rows are permitted for multiset tables only. In fact, the ANSI/ISO SQL standard is bag (multiset)-oriented rather than set-oriented by definition. This is at odds with one of the fundamental properties of the relational model.

The term duplicate row is used here as a row whose columnar values match every columnar value of one or more other rows in a relation.

This is not a recommended practice: you should always either define your tables as set tables to avoid the many problems that multiset tables present or define them as multiset tables, but specify at least one of the columns to be UNIQUE NOT NULL.

Logical Operations on Relations

Relations are formally decomposed and constructed using logical relational operators and the relational algebra. The relational algebra is a set of procedural constructs for manipulating relations, while the relational calculus is a set of nonprocedural constructs for performing the same operations, SQL is a combination of both the algebra and the calculus.

The following table lists set theory operators.

Set Theory Operators

Logical Operator Description
DIFFERENCE X - Y

The set of all attributes contained in relation X but not in relation Y.

INTERSECTION X Y

The set of all attributes contained in both relation X and relation Y.

PRODUCT X Y

The set of all multiples of all attributes contained in relations X and Y.

UNION X Y

The set of all attributes contained in either relation X or relation Y or both.

Special Relational Database Operators

Logical Operator Description
DIVIDE The division of relation R of degree m + n by relation S of degree n produces a quotient relation of degree m.
JOIN The join of relation R on attribute X with relation S on attribute Y is the set of all tuples t such that the concatenation of a tuple r, belonging to R, and a tuple s, belonging to S, and the predicate r.R equality_operator s.S evaluates to TRUE.

In SQL, this is expressed in an ON clause in a DML join statement.

PROJECT The projection of relation R on attributes X, Y, …, n is the set of all tuples (x, y, …, n) such that a tuple n appears in R with X value x, Y value y, and so on.

Less formally, a projection on relation R is any subset of the attributes of R.

In SQL, this is expressed as a column list in a DML statement.

RESTRICT/SELECT The restriction of relation R is the set of all tuples t such that the comparison t.X operator t.Y evaluates to TRUE.

Less formally, a restriction (or selection) on relation R is any subset of the tuples of R satisfying the condition X equality_operator Y.

In SQL, this is expressed in a WHERE clause in a DML statement.