Null Replacement - Teradata Warehouse Miner

In-Database Analytic Functions User Guide

Product
Teradata Warehouse Miner
Release Number
5.4.2
Published
October 2016
Language
English (United States)
Last Update
2018-05-04
dita:mapPath
zyl1473786378775.ditamap
dita:ditavalPath
ft:empty
dita:id
B035-2306
lifecycle
previous
Product Category
Teradata® Warehouse Miner

Purpose

NULL value replacement is offered as a transformation function. A literal value, the mean, median, mode or an imputed value joined from another table can be used as the replacement value. The median value can be requested with or without averaging of two middle values when there is an even number of values.

Literal value replacement is supported for numeric, character and date data types. Mean value replacement is supported for columns of numeric type or date type. Median without averaging, mode and imputed value replacement are valid for any supported type. Median with averaging is however supported only for numeric and date type.

Syntax

call twm. td_analyze('vartran','database=twm_source;tablename=twm_customer;keycolumns=cust_id;General Parameters;nullreplacement={nullstyle (imputed,twm_customer_analysis),columns (income)};');

Required Parameters

columns
Controls the name of the output (transformed) column and its data type. The columns parameter is required by all transformations except Derive. A separate transformation is performed for each column in the list. If a column name is followed by a forward slash and a name, the name after the slash becomes the name of the transformed column in the resultant output table. Otherwise the column name is used as the output column name.
For the Derive transformation, the outputname parameter controls the naming of the transformed output column.
database
The database containing the input table.
keycolumns
When null replacement is requested, either via a Null Replacement transformation or in combination with a Bin Code, Derive, Design Code, Recode, Rescale, Sigmoid or Z Score transformation, the keycolumns parameter must be specified. The column or columns listed must form a unique key into the input and output table of the transformation.
nullreplacement
The parameter that identifies the type of transformation being performed.
nullstyle
Data types supported by various nullstyle parameters are:
Data Type Description Example
literal,value numeric, character, and date nullstyle (literal,value)
mean numeric and date nullstyle (mean)
median numeric and date nullstyle (median)
medianwithoutaveraging any supported data type nullstyle (medianwithoutaveraging)
mode any supported date type nullstyle (mode)
imputed,table any supported data type nullstyle (imputed,tablename)

If date values are entered, the keyword DATE must precede the date value, which should not be enclosed in single quotes.

tablename
The name of the table to be transformed.
vartran
This parameter is required to run a variable transformation. The vartran parameter is always enclosed in single quotes.

General Parameters (separated by a semi-colon)

datatype
For all transformation types, the datatype parameter is used to cast the column to a desired database data type provided it is compatible with the transformed data.
The allowed output types include:
  • byteint
  • char
  • date
  • decimal
  • float
  • integer
  • smallint
  • time
  • timestamp
  • varchar
  • bigint
  • number
fallback
When set to true, this parameter requests a mirrored copy of the output table in the Teradata Database when outputstyle=table.
gensqlonly
When set to true, the SQL for the requested transformations is returned as a result set but not executed. When this parameter is not specified or is set to false, the SQL is executed but not returned.
indexcolumns
When set to true, requests the output table contain the index columns when outputstyle=table.
indexunique
When set to true, requests the output table contain a unique primary index when outputstyle=table.
lockingclause

Requests the generated SQL contain the given locking clause in the appropriate location depending on the output style.

An example of a locking clause when the output style defaults to select is:

LOCKING mydb.mytable FOR ACCESS;

multiset
When set to true, requests an output table that may contain duplicate rows when outputstyle=table.
noindex
When set to true, requests the output table contain no index columns when outputstyle=table.
outputdatabase
The database that will contain the resulting output table when outputstyle=table or view.
outputstyle
The allowed output styles are:
  • select
  • table
  • view
If outputstyle is not specified, the function generates a SELECT statement and does not create a table or view.
outputstyle={select||table||view}
outputtablename
The name of the output table when outputstyle=table or view.
whereclause
Requests the generated SQL contain the given WHERE clause in appropriate places in the generated SQL. This is independent of the output style requested.

Examples

The following examples demonstrate the Null Replacement transformation. These examples assume that the td_analyze function has been installed in a database named twm.

The first example operates on numeric data.

call twm.td_analyze('vartran','database=twm_source;tablename=twm_customer;keycolumns=cust_id;nullreplacement={nullstyle(literal,0),columns(age,income/inc)}{nullstyle(mean),columns(age/age1)}{nullstyle(median),columns(age/age2)}{nullstyle(medianwithoutaveraging),columns(age/age3)}{nullstyle(mode),columns(age/age4)}{nullstyle(imputed,twm_customer_analysis),columns(income)};');
This second example operates on date and character type data.
call twm.td_analyze('vartran','database=twm_source;tablename=twm_credit_acct;keycolumns=cust_id;nullreplacement={nullstyle(literal,DATE 1995-12-23),columns(acct_end_date/date1)}{nullstyle(literal,U),columns(account_active/char1)}{nullstyle(mean),columns(acct_end_date/date2)}{nullstyle(median),columns(acct_end_date/date2A)}{nullstyle(medianwithoutaveraging),columns(acct_end_date/date3)}{nullstyle(mode),columns(acct_end_date/date4)}{nullstyle(imputed,twm_checking_acct),columns(acct_end_date/date5)}{nullstyle(medianwithoutaveraging),columns(account_active/char2)}{nullstyle(mode),columns(account_active/char3)}{nullstyle(imputed,twm_checking_acct),columns(account_active/char4)};');