Arguments - Aster Analytics

Teradata Aster Analytics Foundation User Guide

Product
Aster Analytics
Release Number
6.21
Published
November 2016
Language
English (United States)
Last Update
2018-04-14
dita:mapPath
kiu1466024880662.ditamap
dita:ditavalPath
AA-notempfilter_pdf_output.ditaval
dita:id
B700-1021
lifecycle
previous
Product Category
Software
Argument Category Description
TimeColumn Required Specifies the name of the input_table column that contains the time points of the time series whose missing values are to be calculated.
TimeInterval Optional Specifies the length of time, in seconds, between calculated values. This value must be either INTEGER or DOUBLE PRECISION.
Specify exactly one of time_table or TimeInterval.
ValueColumns Required Specifies the names of input_table columns to interpolate to the output table.
TimeDataType Optional Specifies the data type of the output column that corresponds to the input table column that TimeColumn specifies (time_column).

If you omit this argument, then the function infers the data type of time_column from the input table and uses the inferred data type for the corresponding output table column.

If you specify this argument, then the function can transform the input data to the specified output data type only if both the input column data type and the specified output column data type are in this list:
  • INTEGER
  • BIGINT
  • SMALLINT
  • DOUBLE PRECISION
  • DECIMAL(n,n)
  • DECIMAL
  • NUMERIC
  • NUMERIC(n,n)
ValueDataType Optional Specifies the data types of the output columns that correspond to the input table columns that ValueColumns specifies.

If you omit this argument, then the function infers the data type of each time_column from the input table and uses the inferred data type for the corresponding output table column.

If you specify ValueDataType, then it must be the same size as ValueColumns. That is, if ValueColumns specifies n columns, then ValueDataType must specify n data types. For i in [1, n], value_column_i has value_type_i. However, value_type_i can be empty; for example:

ValueColumns (c1, c2, c3) ValueDataType (INTEGER, ,VARCHAR)

If you specify this argument, then the function can transform the input data to the specified output data type only if both the input column data type and the specified output column data type are in this list:
  • INTEGER
  • BIGINT
  • SMALLINT
  • DOUBLE PRECISION
  • DECIMAL(n,n)
  • DECIMAL
  • NUMERIC
  • NUMERIC(n,n)
InterpolationType Optional Specifies interpolation types for the columns that ValueColumns specifies.

If you specify InterpolationType, then it must be the same size as ValueColumns. That is, if ValueColumns specifies n columns, then InterpolationType must specify n interpolation types. For i in [1, n], value_column_i has interpolation_type_i. However, interpolation_type_i can be empty; for example:

ValueColumns (c1, c2, c3) InterpolationType ('linear', ,'constant]')

An empty interpolation_type has the default value.

The possible values of interpolation_type are as follows:

In interpolation_type syntax, brackets do not indicate optional elements—you must include them.
  • 'linear' (default): The value for each missing data point is determined using linear interpolation between the two nearest points.
  • 'constant': The only interpolation type supported if value_column has data type CHARACTER, CHARACTER(n), CHARACTER VARYING, CHARACTER VARYING(n), or VARCHAR. The value for each missing data point is set to the nearest value.
  • 'spline[(type(cubic))]]': The value for each missing data point is determined by fitting a cubic spline to the nearest three points.

    'median[(window( n ))]': n must be greater than or equal to 2. The default value of n is 5. The value for each missing data point is set to the median value of the nearest four points.

  • 'loess[(weights({constant | tricube}), degree ({0 |1 |2}), span( m ))]':
    • weights: The default value is constant.
    • degree: The default value is 1.
    • m is either an integer greater than 1 (which specifies the number of neighboring points) or a real number between (λ+1)/n and 1 (λ is the degree of the local polynomial and n is the number of data points). The default value of m is 5.

The value for each missing data point is a low-degree polynomial based on a set of nearest neighbors. The fitting can be weighted so that points closer to the missing data have more influence than points farther away.

Your choice of TimeInterval or time_table affects interpolation:
  • If you specify TimeInterval, then the function calculates the value for the time point only if the value is missing; otherwise, the function copies the original value.
  • If you specify time_table, then the function always calculates the value of the time point.
Specify only one of InterpolationType or AggregationType. If you omit both arguments, the function uses InterpolationType with its default value, 'linear'.
AggregationType Optional Specifies the aggregation types of the columns that ValueColumns specifies.

If you specify AggregationType, then it must be the same size as ValueColumns. That is, if ValueColumns specifies n columns, then AggregationType must specify n aggregation types. For i in [1, n], value_column_i has aggregation_type_i. However, aggregation_type_i can be empty; for example:

ValueColumns (c1, c2, c3)AggregationType (min, ,max)

An empty aggregation_type has the default value.

The syntax of aggregation_type is:

{ min | max | mean | mode | sum } [(window(n))]

In aggregation_type syntax, brackets do not indicate optional elements—you must include them.

The function calculates the aggregate value as the minimum, maximum, mean, mode, or sum within a sliding window of length n. n must be greater than or equal to 2. The default value of n is 5. The default aggregation method is min.

The Interpolator function can calculate the aggregates of values of these data types:
  • INTEGER
  • BIGINT
  • SMALLINT
  • DOUBLE PRECISION
  • DECIMAL(n,n)
  • DECIMAL
  • NUMERIC
  • NUMERIC(n,n)

Your choice of TimeInterval or time_table does not affect aggregation. The function always calculates the aggregated value.

Specify only one of AggregationType or InterpolationType. If you omit both arguments, the function uses InterpolationType with its default value, 'linear'.
StartTime Optional Specifies the start time for the time series. The default value is the start time of the time series in input_table.
EndTime Optional Specifies the end time for the time series. The default value is the end time of the time series in input_table.
ValuesBeforeFirst Optional Specifies the values to use if start_time is before the start time of the time series in input_table. Each of these values must have the same data type as its corresponding value_column. Values of data type VARCHAR are case-insensitive.

If ValueColumns specifies n columns, then ValuesBeforeFirst must specify n values. For i in [1, n], value_column_i has the value before_first_value_i. However, before_first_value_i can be empty; for example:

ValueColumns (c1, c2, c3) ValuesBeforeFirst (1, ,'abc')

If before_first_value_i is empty, then value_column_i has the value NULL. If you do not specify Values_Before_First, then value_column_i has the value NULL for i in [1, n].

ValuesAfterLast Optional Specifies the values to use if end_time is after the end time of the time series in input_table. Each of these values must have the same data type as its corresponding value_column. Values of data type VARCHAR are case-insensitive.

If ValueColumns specifies n columns, then ValuesAfterLast must specify n values. For i in [1, n], value_column_i has the value after_last_value_i. However, after_last_value_i can be empty; for example:

ValueColumns (c1, c2, c3) ValuesAfterLast (1, ,'abc')

If after_last_value_i is empty, then value_column_i has the value NULL. If you do not specify ValuesAfterLast, then value_column_i has the value NULL for i in [1, n].

DuplicateRowsCount Optional Specifies the number of rows to duplicate across split boundaries if you use the function SeriesSplitter (Example 2: Using SeriesSplitter with Interpolator shows how to use Interpolator with SeriesSplitter).

If you specify this argument but do not use SeriesSplitter, or do not conform to the conditions that apply for the value for each interpolation or aggregation type, then the function either issues an error message or produces incorrect results.

If you specify only value1, then the function duplicates value1 rows from the previous partition and value1 rows from the next partition. If you specify both value1 and value2, then the function duplicates value1 rows from the previous partition and value2 rows from the next partition. Each argument value must be nonnegative INTEGER.

Both value1 and value2 must exceed the number of data points that the function needs for every specified interpolation or aggregation method. The interpolation methods and the number of data points that the function needs for them are:
  • 'linear' and 'constant': 1
  • 'spline': 2
  • 'median [(window(n))]': n/2
  • 'loess [(weights ({constant | tricube}), degree ({0 |1 |2}), span(m))]':

    If m > 1: m-1

    If m < 1: (m * n)-1, where n is total number of data rows, found in column n of the count_row_number table

Accumulate Optional Specifies the names of input_table columns (other than those specified by TimeColumn and ValueColumns) to copy to the output table. By default, the function copies to the output table only the columns specified by TimeColumn and ValueColumns.