ConvertToCategorical Example | Teradata Vantage - ConvertToCategorical Example - Teradata Vantage

Machine Learning Engine Analytic Function Reference

Product
Teradata Vantage
Release Number
9.02
9.01
2.0
1.3
Published
February 2022
Language
English (United States)
Last Update
2022-02-10
dita:mapPath
rnn1580259159235.ditamap
dita:ditavalPath
ybt1582220416951.ditaval
dita:id
B700-4003
lifecycle
previous
Product Category
Teradata Vantageā„¢

Input

The input table, categorize_input, contains information about houses. All columns have numeric data types. The first three columns represent numeric values, but the other columns represent codes or categories. The following two tables are the input table itself and its schema description, respectively.

categorize_input
sn price lotsize driveway recroom fullbase gashw airco prefarea homestyle
1 27000 1700 1 0 0 0 0 0 1
2 37900 3185 1 0 0 0 1 0 1
3 42000 4960 1 0 0 0 0 0 1
4 67000 5170 1 0 0 0 1 0 2
5 68000 9166 1 0 1 0 1 0 2
6 132000 3500 1 0 0 1 0 0 3
7 43000 5076 0 0 0 0 0 0 1
8 93000 3760 1 0 0 1 0 0 2
9 44500 3850 1 0 0 0 0 0 1
10 43000 3750 1 0 0 0 0 0 1
... ... ... ... ... ... ... ... ... ...

To check the input data variable types, use this query: \d categorize_input

SHOW TABLE categorize_input;

The query returns:

CREATE MULTISET TABLE ALICE.categorize_input ,FALLBACK ,
     NO BEFORE JOURNAL,
     NO AFTER JOURNAL,
     CHECKSUM = DEFAULT,
     DEFAULT MERGEBLOCKRATIO,
     MAP = TD_MAP1
     (
      sn INTEGER,
      price FLOAT,
      lotsize FLOAT,
      driveway INTEGER,
      recroom INTEGER,
      fullbase INTEGER,
      gashw INTEGER,
      airco INTEGER,
      prefarea INTEGER,
      homestyle INTEGER)
PRIMARY INDEX ( sn );

SQL Call

CREATE MULTISET TABLE categorize_output AS (
    SELECT * FROM ConvertToCategorical(
    ON  categorize_input PARTITION BY ANY
    USING
    TargetColumns('[4:9]')
  ) AS dt
) WITH DATA ;

Output

This query returns the output table:

SELECT * FROM categorize_output ORDER BY 1;

The output table looks like the input table, but its schema is different. To check the output data variable types, use this query: \d categorize_output

SHOW TABLE categorize_output;

The query returns:

CREATE MULTISET TABLE ALICE.categorize_output ,FALLBACK ,
     NO BEFORE JOURNAL,
     NO AFTER JOURNAL,
     CHECKSUM = DEFAULT,
     DEFAULT MERGEBLOCKRATIO,
     MAP = TD_MAP1
     (
      sn INTEGER,
      price FLOAT,
      lotsize FLOAT,
      driveway INTEGER,
      recroom VARCHAR(32000) CHARACTER SET UNICODE CASESPECIFIC,
      fullbase VARCHAR(32000) CHARACTER SET UNICODE CASESPECIFIC,
      gashw VARCHAR(32000) CHARACTER SET UNICODE CASESPECIFIC,
      airco VARCHAR(32000) CHARACTER SET UNICODE CASESPECIFIC,
      prefarea VARCHAR(32000) CHARACTER SET UNICODE CASESPECIFIC,
      homestyle VARCHAR(32000) CHARACTER SET UNICODE CASESPECIFIC)
PRIMARY INDEX ( sn );

Download a zip file of all examples and a SQL script file that creates their input tables.