Input
The input table, categorize_input, contains information about houses. All columns have numeric data types. The first three columns represent numeric values, but the other columns represent codes or categories. The following two tables are the input table itself and its schema description, respectively.
sn | price | lotsize | driveway | recroom | fullbase | gashw | airco | prefarea | homestyle |
---|---|---|---|---|---|---|---|---|---|
1 | 27000 | 1700 | 1 | 0 | 0 | 0 | 0 | 0 | 1 |
2 | 37900 | 3185 | 1 | 0 | 0 | 0 | 1 | 0 | 1 |
3 | 42000 | 4960 | 1 | 0 | 0 | 0 | 0 | 0 | 1 |
4 | 67000 | 5170 | 1 | 0 | 0 | 0 | 1 | 0 | 2 |
5 | 68000 | 9166 | 1 | 0 | 1 | 0 | 1 | 0 | 2 |
6 | 132000 | 3500 | 1 | 0 | 0 | 1 | 0 | 0 | 3 |
7 | 43000 | 5076 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
8 | 93000 | 3760 | 1 | 0 | 0 | 1 | 0 | 0 | 2 |
9 | 44500 | 3850 | 1 | 0 | 0 | 0 | 0 | 0 | 1 |
10 | 43000 | 3750 | 1 | 0 | 0 | 0 | 0 | 0 | 1 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
To check the input data variable types, use this query: \d categorize_input
SHOW TABLE categorize_input;
The query returns:
CREATE MULTISET TABLE ALICE.categorize_input ,FALLBACK , NO BEFORE JOURNAL, NO AFTER JOURNAL, CHECKSUM = DEFAULT, DEFAULT MERGEBLOCKRATIO, MAP = TD_MAP1 ( sn INTEGER, price FLOAT, lotsize FLOAT, driveway INTEGER, recroom INTEGER, fullbase INTEGER, gashw INTEGER, airco INTEGER, prefarea INTEGER, homestyle INTEGER) PRIMARY INDEX ( sn );
SQL Call
CREATE MULTISET TABLE categorize_output AS ( SELECT * FROM ConvertToCategorical( ON categorize_input PARTITION BY ANY USING TargetColumns('[4:9]') ) AS dt ) WITH DATA ;
Output
This query returns the output table:
SELECT * FROM categorize_output ORDER BY 1;
The output table looks like the input table, but its schema is different. To check the output data variable types, use this query: \d categorize_output
SHOW TABLE categorize_output;
The query returns:
CREATE MULTISET TABLE ALICE.categorize_output ,FALLBACK , NO BEFORE JOURNAL, NO AFTER JOURNAL, CHECKSUM = DEFAULT, DEFAULT MERGEBLOCKRATIO, MAP = TD_MAP1 ( sn INTEGER, price FLOAT, lotsize FLOAT, driveway INTEGER, recroom VARCHAR(32000) CHARACTER SET UNICODE CASESPECIFIC, fullbase VARCHAR(32000) CHARACTER SET UNICODE CASESPECIFIC, gashw VARCHAR(32000) CHARACTER SET UNICODE CASESPECIFIC, airco VARCHAR(32000) CHARACTER SET UNICODE CASESPECIFIC, prefarea VARCHAR(32000) CHARACTER SET UNICODE CASESPECIFIC, homestyle VARCHAR(32000) CHARACTER SET UNICODE CASESPECIFIC) PRIMARY INDEX ( sn );
Download a zip file of all examples and a SQL script file that creates their input tables.