Interpolator Arguments - Aster Analytics

Teradata Aster® Analytics Foundation User GuideUpdate 2

Product
Aster Analytics
Release Number
7.00.02
Published
September 2017
Language
English (United States)
Last Update
2018-04-17
dita:mapPath
uce1497542673292.ditamap
dita:ditavalPath
AA-notempfilter_pdf_output.ditaval
dita:id
B700-1022
lifecycle
previous
Product Category
Software
TimeColumn
Specifies the name of the input_table column that contains the time points of the time series whose missing values are to be calculated.
TimeInterval
[Optional] Specify exactly one of time_table or TimeInterval.

Specifies the length of time, in seconds, between calculated values. This value must be either INTEGER or DOUBLE PRECISION.

ValueColumns
Specifies the names of input_table columns to interpolate to the output table.
TimeDataType
[Optional] Specifies the data type of the output column that corresponds to the input table column that TimeColumn specifies (time_column).

If you omit this argument, the function infers the data type of time_column from the input table and uses the inferred data type for the corresponding output table column.

If you specify this argument, the function can transform the input data to the specified output data type only if both the input column data type and the specified output column data type are in this list:
  • INTEGER
  • BIGINT
  • SMALLINT
  • DOUBLE PRECISION
  • DECIMAL(n,n)
  • DECIMAL
  • NUMERIC
  • NUMERIC(n,n)
ValueDataType
[Optional] Specifies the data types of the output columns that correspond to the input table columns that ValueColumns specifies.

If you omit this argument, the function infers the data type of each time_column from the input table and uses the inferred data type for the corresponding output table column.

If you specify ValueDataType, it must be the same size as ValueColumns. That is, if ValueColumns specifies n columns, ValueDataType must specify n data types. For i in [1, n], value_column_i has value_type_i. However, value_type_i can be empty; for example:

ValueColumns (c1, c2, c3) ValueDataType (INTEGER, ,VARCHAR)
If you specify this argument, the function can transform the input data to the specified output data type only if both the input column data type and the specified output column data type are in this list:
  • INTEGER
  • BIGINT
  • SMALLINT
  • DOUBLE PRECISION
  • DECIMAL(n,n)
  • DECIMAL
  • NUMERIC
  • NUMERIC(n,n)
InterpolationType
[Optional] Specifies interpolation types for the columns that ValueColumns specifies.

If you specify InterpolationType, it must be the same size as ValueColumns. That is, if ValueColumns specifies n columns, InterpolationType must specify n interpolation types. For i in [1, n], value_column_i has interpolation_type_i. However, interpolation_type_i can be empty; for example:

ValueColumns (c1, c2, c3) InterpolationType ('linear', ,'constant]')

An empty interpolation_type has the default value.

The possible values of interpolation_type are as follows:

In interpolation_type syntax, brackets do not indicate optional elements—you must include them.
  • 'linear' (Default):

    The value for each missing data point is determined using linear interpolation between the two nearest points.

  • 'constant':

    The only interpolation type supported if value_column has data type CHARACTER, CHARACTER(n), CHARACTER VARYING, CHARACTER VARYING(n), or VARCHAR. The value for each missing data point is set to the nearest value.

  • 'spline[(type(cubic))]]':

    The value for each missing data point is determined by fitting a cubic spline to the nearest three points.

  • 'median[(window( n ))]':

    n must be greater than or equal to 2. Default: n is 5. The value for each missing data point is set to the median value of the nearest four points.

  • 'loess[(weights({constant|tricube}), degree ({0|1|2}), span( m ))]':
    • weights: Default: constant.
    • degree: Default: 1.
    • m is either an integer greater than 1 (which specifies the number of neighboring points) or a real number between (λ+1)/n and 1 (λ is the degree of the local polynomial and n is the number of data points). Default: m is 5.

The value for each missing data point is a low-degree polynomial based on a set of nearest neighbors. The fitting can be weighted so that points closer to the missing data have more influence than points farther away.

Your choice of TimeInterval or time_table affects interpolation:
  • If you specify TimeInterval, the function calculates the value for the time point only if the value is missing; otherwise, the function copies the original value.
  • If you specify time_table, the function always calculates the value of the time point.
Specify only one of InterpolationType or AggregationType. If you omit both arguments, the function uses InterpolationType with its default value, 'linear'.
AggregationType
[Optional] Specifies the aggregation types of the columns that ValueColumns specifies.

If you specify AggregationType, it must be the same size as ValueColumns. That is, if ValueColumns specifies n columns, AggregationType must specify n aggregation types. For i in [1, n], value_column_i has aggregation_type_i. However, aggregation_type_i can be empty; for example:

ValueColumns (c1, c2, c3)AggregationType (min, ,max)

An empty aggregation_type has the default value.

The syntax of aggregation_type is:

{ min | max | mean | mode | sum } [(window(n))]
In aggregation_type syntax, brackets do not indicate optional elements—you must include them.

The function calculates the aggregate value as the minimum, maximum, mean, mode, or sum within a sliding window of length n. n must be greater than or equal to 2. Default: n is 5 and the default aggregation method is min.

The Interpolator function can calculate the aggregates of values of these data types:
  • INTEGER
  • BIGINT
  • SMALLINT
  • DOUBLE PRECISION
  • DECIMAL(n,n)
  • DECIMAL
  • NUMERIC
  • NUMERIC(n,n)

Your choice of TimeInterval or time_table does not affect aggregation. The function always calculates the aggregated value.

Specify only one of AggregationType or InterpolationType. If you omit both arguments, the function uses InterpolationType with its default value, 'linear'.
StartTime
[Optional] Specifies the start time for the time series. Default: Start time of the time series in input_table.
EndTime
[Optional] Specifies the end time for the time series. Default: End time of the time series in input_table.
ValuesBeforeFirst
[Optional] Specifies the values to use if start_time is before the start time of the time series in input_table. Each of these values must have the same data type as its corresponding value_column. Values of data type VARCHAR are case-insensitive.

If ValueColumns specifies n columns, ValuesBeforeFirst must specify n values. For i in [1, n], value_column_i has the value before_first_value_i. However, before_first_value_i can be empty; for example:

ValueColumns (c1, c2, c3) ValuesBeforeFirst (1, ,'abc')

If before_first_value_i is empty, value_column_i has the value NULL.

Default: value_column_i has the value NULL for i in [1, n].

ValuesAfterLast
[Optional] Specifies the values to use if end_time is after the end time of the time series in input_table. Each of these values must have the same data type as its corresponding value_column. Values of data type VARCHAR are case-insensitive.

If ValueColumns specifies n columns, ValuesAfterLast must specify n values. For i in [1, n], value_column_i has the value after_last_value_i. However, after_last_value_i can be empty; for example:

ValueColumns (c1, c2, c3) ValuesAfterLast (1, ,'abc')

If after_last_value_i is empty, value_column_i has the value NULL.

Default: value_column_i has the value NULL for i in [1, n].

DuplicateRowsCount
[Optional] Specifies the number of rows to duplicate across split boundaries if you use the function SeriesSplitter (Example 2: Using SeriesSplitter with Interpolator shows how to use Interpolator with SeriesSplitter).

If you specify this argument but do not use SeriesSplitter, or do not conform to the conditions that apply for the value for each interpolation or aggregation type, the function either issues an error message or produces incorrect results.

If you specify only value1, the function duplicates value1 rows from the previous partition and value1 rows from the next partition. If you specify both value1 and value2, the function duplicates value1 rows from the previous partition and value2 rows from the next partition. Each argument value must be nonnegative INTEGER.

Both value1 and value2 must exceed the number of data points that the function needs for every specified interpolation or aggregation method. The interpolation methods and the number of data points that the function needs for them are:
  • 'linear' and 'constant': 1
  • 'spline': 2
  • 'median [(window(n))]': n/2
  • 'loess [(weights ({constant|tricube}), degree ({0|1|2}), span(m))]':

    If m > 1: m-1

    If m < 1: (m * n)-1, where n is total number of data rows, found in column n of the count_row_number table

Accumulate
[Optional] Specifies the names of input_table columns (other than those specified by TimeColumn and ValueColumns) to copy to the output table. Default: Only the columns specified by TimeColumn and ValueColumns.