ConvertToCategorical Example - Teradata Vantage

Machine Learning Engine Analytic Function Reference

Product
Teradata Vantage
Release Number
8.00
1.0
Published
May 2019
Language
English (United States)
Last Update
2019-11-22
dita:mapPath
blj1506016597986.ditamap
dita:ditavalPath
blj1506016597986.ditaval
dita:id
B700-4003
lifecycle
previous
Product Category
Teradata Vantageā„¢

Input

The input table, categorize_input, contains information about houses. All columns have numeric data types. The first three columns represent numeric values, but the other columns represent codes or categories. The following two tables are the input table itself and its schema description, respectively.

categorize_input
sn price lotsize driveway recroom fullbase gashw airco prefarea homestyle
1 27000 1700 1 0 0 0 0 0 1
2 37900 3185 1 0 0 0 1 0 1
3 42000 4960 1 0 0 0 0 0 1
4 67000 5170 1 0 0 0 1 0 2
5 68000 9166 1 0 1 0 1 0 2
6 132000 3500 1 0 0 1 0 0 3
7 43000 5076 0 0 0 0 0 0 1
8 93000 3760 1 0 0 1 0 0 2
9 44500 3850 1 0 0 0 0 0 1
10 43000 3750 1 0 0 0 0 0 1
... ... ... ... ... ... ... ... ... ...

To check the input data variable types, use this query: \d categorize_input

The query returns:

Table "public"."categorize_input"
Column    | Type    | Modifiers 
-------------------------------
sn        | integer | 
price     | real    | 
lotsize   | real    | 
driveway  | integer | 
recroom   | integer | 
fullbase  | integer | 
gashw     | integer | 
airco     | integer | 
prefarea  | integer | 
homestyle | integer |

Table Type:
fact

Distribution Key:
sn

Compression Level:
none

Storage Type:
row

Persistence:
permanent

SQL Call

CREATE MULTISET TABLE categorize_output AS (
  SELECT * FROM ConvertToCategorical (
    ON categorize_input PARTITION BY ANY
    USING
    Columns ('[4:9]')
  ) AS dt
) WITH DATA;

Output

This query returns the output table:

SELECT * FROM categorize_output ORDER BY 1;

The output table looks like the input table, but its schema is different. To check the output data variable types, use this query: \d categorize_output

The query returns:

Table "public"."categorize_output"
Column    | Type              | Modifiers 
-----------------------------------------
sn        | integer           | 
price     | real              | 
lotsize   | real              | 
driveway  | integer           | 
recroom   | character varying | 
fullbase  | character varying | 
gashw     | character varying | 
airco     | character varying | 
prefarea  | character varying | 
homestyle | character varying |

Table Type:
fact

Distribution Key:
sn

Compression Level:
none

Storage Type:
row

Persistence:
permanent