Usage Notes - Teradata Database

SQL Data Manipulation Language

Product
Teradata Database
Release Number
16.10
Published
June 2017
Language
English (United States)
Last Update
2018-04-25
dita:mapPath
psg1480972718197.ditamap
dita:ditavalPath
changebar_rev_16_10_exclude_audience_ie.ditaval
dita:id
B035-1146
lifecycle
previous
Product Category
Teradata® Database

Time Series, Dense Representations, and Sparse Representations of Temporal Data

A time series is an ordered sequence of measurements of a variable that are arranged according to the time of their occurrence. Time series are typically measured at some constant frequency and their data points are generally, but not necessarily, spaced at uniform time intervals.

The characteristic properties of a time series include the following.
  • The data points are not independent of one another.
  • The dispersion of data points varies as a function of time.
  • The data frequently indicates trends.
  • The data tends to be cyclic.
Typical business applications for time series analysis include the following.
  • Budgetary analysis
  • Economic forecasting
  • Inventory analysis
  • Process control
  • Quality control
  • Sales forecasting
  • Stock market analysis
  • Workload projections
  • Yield projections

The EXPAND ON clause enables several different forms of time series expansion on a PERIOD column value of an input row by producing a set of value-equivalent rows, one for each granule in the specified time period. The number of granules is defined by the anchor name you specify for the clause.

You can expand sparse PERIOD representations of relational data into a dense representation of the same data. Data converted to a dense form can be more easily manipulated by complex analyses such as moving average calculations without having to write complex SQL requests to respond to business questions made against sparse relational data.

The available forms of time series expansion for the EXPAND ON clause are the following.
  • Interval expansion, where rows are expanded by user-specified intervals.
  • Anchor point expansion, where rows are expanded by user-specified anchored points.
  • Anchor PERIOD expansion, where rows are expanded by user-specified anchored periods.

Rules and Restrictions for the EXPAND ON Clause

The rules and restrictions are:

  • There are three different ways to expand rows:
    • By user-specified intervals such as 1 DAY, 3 MONTH, and so on.

      You must specify an interval literal in the EXPAND ON clause to execute this form of expansion. This type of expansion is called an interval expansion.

      An interval expansion might be useful for answering queries such as “compute the moving window average of inventory cost by week during the year 2010."

    • By user-specified anchored points in a time line.

      You must specify an anchored interval without specifying a PERIOD keyword to execute this form of expansion. Such expansion is called an anchor point expansion.

      An anchor point expansion produces a specific date or time point in the expanded rows, where the date or time points are the anchor points present in the input period being expanded.

      For example, BY ANCHOR MONTH_BEGIN returns multiple expanded rows, one for each beginning value of a month present in the input period. This form of expansion might be useful for answering queries such as “get the month end average inventory cost during the last quarter of the year 2010.”

    • By user-specified anchored time durations in a time line.

      You must specify an anchored interval with the PERIOD keyword to execute this form of expansion.

      Such an expansion is called as anchor period expansion.

      An anchor PERIOD expansion produces rows whose period beginning bounds are always aligned to specific DateTime values derived from the anchor names.

      For example, BY ANCHOR PERIOD MONTH_BEGIN, BY ANCHOR PERIOD MONDAY, and so on.

      The PERIOD value of the row along with fixed durations enables weighted computations such as “compute the weekly weighted average of inventory cost.”

  • You can use the EXPAND ON clause in query expressions, except for:
    • Subqueries in search conditions. See “Specifying Subqueries in Search Conditions”.

      For example, the EXPAND ON clause in the following statement (which assumes the existence of a column named pd1 in employee) is not valid:

         SELECT *
         FROM employee
         WHERE salary IN (SELECT salary
                         FROM salary_table
                          WHERE salary > 10000
                          EXPAND ON pd1);
    • A WITH clause is specified in the query expression. See “WITH Clause”.
    • An updatable cursor SELECT statement. See SQL Stored Procedures and Embedded SQL .
    • A CREATE JOIN INDEX statement. See “CREATE JOIN INDEX” in SQL Data Definition Language .
    • The query expression is a SELECT AND CONSUME statement. See “SELECT AND CONSUME”.
  • The expand expression can be a column or column alias from the select list or it can be a number that specifies the positional sequence of the column or expression in the select list that is to be expanded.

    If the specified column sequential position is not valid, for example, if it is greater than the projected number of columns, Teradata Database returns an error.

    The specified column must have a data type of PERIOD.

    You must specify a table reference in a SELECT statement with an EXPAND ON clause.

  • You can specify an EXPAND ON clause in any of the following places:
    • Within a derived table.

      This includes cases where the derived table is contained within a subquery.

      See Example: Expansion Over a UNION Operator.

    • As part of the individual query expression of a SELECT statement that specifies set operators such as UNION, INTERSECT, or MINUS/EXCEPT. See SQL Functions, Operators, Expressions, and Predicates .

      See Example: Expansion Over a UNION Operator.

    • As part of the SELECT specification in the following SQL DDL statements. See CREATE RECURSIVE VIEW/REPLACE RECURSIVE VIEW in the seed and the recursive statements of the view definition, CREATE TABLE … AS, and CREATE VIEW/REPLACE VIEW in SQL Data Definition Language .

      Note that a view that contains an EXPAND ON clause in its definition is not updatable.

    • As part of the SELECT specification in an INSERT … SELECT statement. See INSERT/INSERT … SELECT.
  • You cannot specify an EXPAND ON clause in the following situations:
    • Anywhere in a SELECT AND CONSUME statement (see “SELECT AND CONSUME”).
    • Anywhere in an updatable cursor (see SQL Stored Procedures and Embedded SQL ).
    • Anywhere in a subquery used as a search condition.
    • Anywhere in a SELECT statement that specifies a TOP n operator (see “TOP n”).

      You can work around this restriction by specifying the TOP n operator within a derived table and then specifying the EXPAND ON clause in the outer query.

    • Anywhere in a SELECT statement that specifies a SAMPLE clause. See SAMPLE Clause.
    • Anywhere in a SELECT statement that specifies a WITH clause (see “WITH Clause”).
    • Anywhere in a table function.
    • As part of the SELECT specification in a CREATE JOIN INDEX statement (see “CREATE JOIN INDEX” in SQL Data Definition Language ).
  • All of the operations in a query expression except ORDER BY are performed before any rows are expanded.

    The ORDER BY operation is then performed on the expanded rows if you specify an expanded column in the ORDER BY clause.

  • If you do not specify a BY expansion_interval clause, the expanding interval defaults to the granularity of the element type of the PERIOD value for the column.

    The expansion defaults to the rules for interval literals described in Rules and Restrictions for Interval Expansion.

  • To limit the number of rows being expanded, specify a PERIOD expression in the FOR clause, where the PERIOD expression represents the period of interest.

    The data type of the PERIOD expression must be compatible with the PERIOD data type of expanded_column_name.

    If the specified period is not null and does not overlap with the row, then the row is not eligible for expansion.

  • Teradata Database expands rows based on the interval you specify.

    The interval can either be an interval constant, or an anchored interval derived from the anchor name such as any day of a week, the MONTH_BEGIN keyword, or the MONTH_END keyword.

  • An EXPAND ON clause specification produces one value-equivalent row for each time granule derived from the specified expansion_interval. Rows are said to be value-equivalent if the values in all columns except for their expanded columns are identical.
  • If you specify a FOR clause, the expansion period is the overlapping period of the qualified row and the PERIOD constant you specify in the FOR clause. Otherwise, the expanding period is the PERIOD value of the selected row.
  • If the expanding PERIOD expression specifies a column from another table that is not specified in the FROM clause, then Teradata Database joins the tables before the expansion.

    You must ensure that the appropriate join condition is specified in such cases. If no other join conditions are specified, then the system performs a Product Join on the referenced tables.

    See Example: Join Before Expansion.

  • When you do not specify an expanded column in the select list, but do specify the DISTINCT operator, the EXPAND operation is nullified.

    See Example: Nullified EXPAND Operation.

  • If the expanding PERIOD is null, then the expanded row has a null expansion.

    See Example: Null Expansion Period Producing a Null Expanded Value.

Rules and Restrictions for Interval Expansion

The rules and restrictions for the use of Interval Expansions are as follows:
  • An expansion commences from the beginning value of the expansion period and ends at its ending value with an incremental value set to the specified interval literal for each row to be expanded.

    In this process, the last expanded row might not always cover the expansion interval duration. If this occurs, the result is a partial period, which is a period having a duration that is less than the expansion interval.

  • If one or more rows in the expanded result has an expansion period duration that is less than the specified interval, and the interval being expanded also has a higher granularity than the element type of the period being expanded, Teradata Database returns a warning to the requestor. See Example: Creating a Time Series Using Expansion By an Interval Constant Value.
  • Unlike anchor period expansions, Teradata Database does not use a system-defined business calendar set for an interval literal in an EXPAND ON clause.

Anchor Period and Anchor Point Expansion

The rules and restriction for the use of Anchor PERIOD and Anchor Point Expansions are:

  • If the data type of the expansion expression is PERIOD(DATE), and you specify a time literal in the anchored interval, the request returns an error to the requestor because you cannot specify a time option when the data type of the expansion expression is PERIOD(DATE).
  • You cannot specify an anchored interval if the data type of the expansion expression is either PERIOD(TIME) or PERIOD(TIME WITH TIME ZONE); otherwise, the request aborts and Teradata Database returns an error to the requestor because the PERIOD(TIME) and PERIOD(TIME WITH TIME ZONE) data types are not valid for an anchored EXPAND ON clause.
  • If the element type of the expansion expression is a Timestamp data type, Teradata Database uses the time literal value that you specify to define the timestamp value of the anchor during expansion.

    If you do not specify a time literal, the time literal value defaults to '00:00:00.000000+00:00' for an anchor name of ANCHOR_MILLISECOND, ANCHOR_SECOND, ANCHOR_MINUTE, ANCHOR_HOUR, WEEK_BEGIN, MONTH_BEGIN, QUARTER_BEGIN, or YEAR_BEGIN and to '23:59:59.999999+00:00' for any other anchor name.

    The precision of the default value is set to the precision of the expansion expression. The default value includes the time zone value of +00:00 if the expansion expression specifies a time zone; otherwise, the value is the session time zone and does not specify a time zone.

    The anchor, for example MONTH_BEGIN, is computed based on the session time zone. Thus, for two sessions that are at different time zones, the output will probably be different. For an example of this, see Example: Same Expansion in Two Different Sessions in Different Time Zones.

  • If you specify the anchor as a DAY, such as anchor_name is DAY, the expansion interval is INTERVAL '1' DAY for each expanded row.
  • If a table is to be expanded by the first calendar day of every month, you must specify MONTH_BEGIN in the anchored interval clause.

    Each expanded value for the row in the result has the first day of the corresponding month as the BEGIN bound and the expansion interval defaults to INTERVAL '1' MONTH. The BEGIN bound of the expanded value for each result row is 'YYYY-MM-01'.

    See the first SELECT request in Example: EXPAND ON MONTH_BEGIN and MONTH_END.

  • If a table is to be expanded by the last calendar day of every month, you must specify MONTH_END in the anchored interval clause.

    Each expanded value for the result row has the last day of the corresponding month as its BEGIN bound, and the expansion interval defaults to INTERVAL '1' MONTH.

    The BEGIN bound of the expanded value for each result row is 'YYYY-MM-DD' where DD is one of 28, 29, 30, or 31, depending on the month and the year.

    See the second SELECT request in Example: EXPAND ON MONTH_BEGIN and MONTH_END.

  • If a table is to be expanded by a particular day of every week, you must specify the desired week day in the anchored interval clause. The begin bound of each expanded value for the result row corresponds to that day of the week, and the expanding interval defaults to INTERVAL '7' DAY.

    See Example: Expansion on an Anchor Point Using WEEK_DAY.

Rules and Restrictions for Anchor Point Expansion of the EXPAND ON Clause

The following cases are possible with Anchor Point Expansion when the expansion interval is longer than the granularity of the element type of the expand expression:
  • There is no single anchor point in the expansion period.

    For example, if the expansion is by MONDAY and the expansion period is PERIOD(DATE '2007-08-14', DATE '2007-08-17'), then the expansion period starts on Tuesday and ends on Friday in the same week.

    In this case, the input row does not produce any expanded rows in the expanded result.

  • The beginning bound of the expansion period is not aligned to an anchor point and its duration is longer than the expansion interval.

    In this case, the expanded row set does not have a row corresponding to the beginning bound of the expansion period.

    For example, if the expansion is by MONDAY and the expansion period is PERIOD(DATE '2007-08-15', DATE '2007-08-25'), the expanded result contains only one point, which is DATE '2007-08-20'.

  • The beginning bound of the expansion period is aligned with an anchor point.

    In such a case, the expanded rows contain a row with the beginning bound of the expansion period.

    For example, if the expansion is by MONDAY and the expansion period is PERIOD(DATE '2007-08-20', DATE '2007-08-25'), then the last expanded row contains DATE '2007-08-20'.

The expanded rows correspond to all anchor points (the BEGIN point of an anchor period) that are within the expansion period. This also means that each row expands to as many anchor points that exist in the expansion period.

Anchor Period Expansion of the EXPAND ON Clause

The rules and restrictions for the use of Anchor Period Expansion of EXPAND ON are:

  • The expanded rows correspond to all anchor periods that overlap the expansion period. The key difference between anchor period expansion and anchor point expansion is that for anchor point expansion, the anchor point, which is the beginning bound of an anchor period, must occur within the expansion period, while for anchor period expansions, the anchor period must overlap the expansion period.
  • The following three cases are the only valid applications of anchor period expansions.
    • The expansion period has a duration that is less than that of the expansion interval.

      For example, if the expansion is by MONDAY and the expansion period is PERIOD(DATE '2011-08-14', DATE '2011-08-17'), then the expansion period starts on Tuesday, ends on Friday of the same week, and the expanded row is a singleton with a resulting anchor period of PERIOD(DATE '2011-08-13', DATE'2011-08-20').

      In this case, there is only one expanded row for the input row.

    • The expansion period is not aligned with an anchor period, but the duration of the expansion period is greater than the expansion interval.

      Such a row produces more than one expanded row in its result.

      For example, if the expansion is by MONDAY and the expansion period is PERIOD(DATE '2011-08-15', DATE '2011-08-25').

      In this case, the expanded result has two anchor periods, PERIOD(DATE '2011-08-13', DATE '2011-08-20') and PERIOD(DATE '2011-08-20', DATE '2011-08-27').

    • The expanding period is aligned with an anchor period and its duration is an exact multiple of the expansion interval.

      For example, if the expansion is by MONDAY and the expansion period is PERIOD(DATE '2011-08-20', DATE '2011-08-27'), then the expanded row contains PERIOD(DATE '2011-08-20', DATE '2011-08-27').

  • Teradata Database uses a system-defined business calendar set in the session for the anchored interval in an EXPAND ON clause.

    In an anchor expansion, Teradata Database derives the BEGIN bound of the first expanded row from the business calendar set in the current session. The BEGIN bound of the first expanded row is derived from the YearBeginDate, WeekStart, and CalendarPeriod columns from the BusinessCalendarPattern table.

    For example, a row has period from 1-Jan-2008 to 31-May-2008, YearBeginDate is 15-Mar-2008, and CalendarPeriod is from 1-Jan-08 to 30-Oct-08. Expanding by anchor point MONTH_BEGIN results in 4 rows, where the first row is from 15-Jan-08 to 15-Feb-08 and second row is from 15-Feb-08 to 15-Mar-08 and 3rd row is from 15-Mar-08 to 15-Apr08 and 4th rows is from 15-Apr-08 to 15-May-08. The MONTH_BEGIN value is derived from the YearBeginDate value, which is January 15 and, because the Calendar begins in January, the first month starts on January 15.

  • This table lists the EXPAND ON clause anchor names.
             Anchor Name            Expanding Interval     Default Time Literal               EXPAND Result
ANCHOR_MILLISECOND INTERVAL ‘1’ MILLISECOND 00:00:00.000000+00:00 Each expanded value for the row in the result adjusts the anchor to the nearest millisecond boundary.
ANCHOR_SECOND INTERVAL ‘1’ SECOND 00:00:00.000000+00:00 Each expanded value for the row in the result adjusts the anchor to the nearest second boundary.
ANCHOR_MINUTE INTERVAL ‘1’ MINUTE 00:00:00.000000+00:00 Each expanded value for the row in the result adjusts the anchor to the nearest minute boundary.
ANCHOR_HOUR INTERVAL ‘1’ HOUR 00:00:00.000000+00:00 Each expanded value for the row in the result adjusts the anchor to the nearest hour boundary.
WEEK_BEGIN INTERVAL ‘7’ DAY 00:00:00.000000+00:00 Each expanded value for the row in the result has the first day of the corresponding week as its BEGIN bound.
WEEK_END 23:59:59.999999+00:00 Each expanded value for the row in the result has the last day of the corresponding week as its BEGIN bound.
MONTH_BEGIN INTERVAL ‘1’ MONTH 00:00:00.000000+00:00 Each expanded value for the row in the result has the first day of the corresponding month as its BEGIN bound.
MONTH_END 23:59:59.999999+00:00 Each expanded value for the row in the result has the last day of the corresponding month as its BEGIN bound.
QUARTER_BEGIN INTERVAL ‘3’ MONTH 00:00:00.000000+00:00 Each expanded value for the row in the result has the first day of the corresponding quarter as its BEGIN bound.
QUARTER_END 23:59:59.999999+00:00 Each expanded value for the row in the result has the last day of the corresponding quarter as its BEGIN bound.
YEAR_BEGIN INTERVAL ‘1’ YEAR 00:00:00.000000+00:00 Each expanded value for the row in the result has the first day of the corresponding year as its BEGIN bound.
YEAR_END 23:59:59.999999+00:00 Each expanded value for the row in the result has the last day of the corresponding year as its BEGIN bound.
  • The WEEK_BEGIN anchor name corresponds to an anchor expansion by WeekStart.

    For example, if the business calendar for the session is ISO and you specify expansion by WEEK_BEGIN, Teradata Database translates the WEEK_BEGIN expansion to EXPAND ON ….BY ANCHOR MONDAY because the week begins on Monday.

  • The following anchors may give different results for the ISO calendar than they give for the Teradata or COMPATIBLE calendars.
    • MONTH_BEGIN
    • MONTH_END
    • QUARTER_BEGIN
    • QUARTER_END
    • YEAR_BEGIN
    • YEAR_END

      For information on how the ISO calendar computes results, see the section named “About ISO Computation” in SQL Functions, Operators, Expressions, and Predicates.

  • Expand does not consider exceptions even for a first date.

    For example, if you want to execute an expansion by business days, you must perform a normal expansion and then apply business functions on all of the expanded rows using a derived table to get the desired expansion by business days.

Expand Interval Literal Expressions

The following table explains the meanings of the expand interval syntax variables for the EXPAND ON clause.

Expand Interval Description
interval_literal To perform an interval expansion, specify any valid Interval literal value (see SQL Data Types and Literals for a comprehensive list of valid Interval literals).
ANCHOR anchor_name To perform an anchor point expansion, specify an anchor name, but do not specify PERIOD.
ANCHOR PERIOD anchor_name To perform an anchor period expansion, you must specify both PERIOD and an anchor name.

Anchor Name Literal Expressions

The following table lists the ANCHOR anchor_name variables for the EXPAND ON clause.

Anchor Name Description
ANCHOR_MILLISECOND Produces multiple expanded rows, one for each millisecond in the input period.
ANCHOR_SECOND Produces multiple expanded rows, one for each second in the input period.
ANCHOR_MINUTE Produces multiple expanded rows, one for each minute in the input period.
ANCHOR_HOUR Produces multiple expanded rows, one for each hour in the input period.
DAY The expansion interval is INTERVAL '1' DAY for each expanded row.
WEEK_BEGIN Produces multiple expanded rows, one for each beginning value of a week present in the input period.
WEEK_END Produces multiple expanded rows, one for each ending value of a week present in the input period.
MONTH_BEGIN Produces multiple expanded rows, one for each beginning value of a month present in the input period.

An example query that uses such expansion is “Get the month end average inventory cost during the last quarter of the year 2010."

MONTH_END Produces multiple expanded rows, one for each ending value of a month present in the input period.
QUARTER_BEGIN Produces multiple expanded rows, one for each beginning value of a quarter present in the input period.
QUARTER_END Produces multiple expanded rows, one for each ending value of a quarter present in the input period.
YEAR_BEGIN Produces multiple expanded rows, one for each beginning value of a year present in the input period.
YEAR_END Produces multiple expanded rows, one for each ending value of a year present in the input period.
MONDAY An anchor period expansion produces rows whose period beginning bounds are always aligned to specific DateTime values derived from the anchor names.

For this specification, the period beginning bound is the first Monday in the specified period.

TUESDAY The period beginning bound is the first Tuesday in the specified period.
WEDNESDAY The period beginning bound is the first Wednesday in the specified period.
THURSDAY The period beginning bound is the first Thursday in the specified period.
FRIDAY The period beginning bound is the first Friday in the specified period.
SATURDAY The period beginning bound is the first Saturday in the specified period.
SUNDAY The period beginning bound is the first Sunday in the specified period.