Split Input into Training and Testing Data Sets - Aster Analytics

Teradata Aster Analytics Foundation User Guide

Product: Aster Analytics
Release Number: 6.21
Published: November 2016
Language: English (United States)
Last Update: 2018-04-14

This code divides the 699 data rows of the breast_cancer_data table into a training data set (489 rows, about 70%) and a testing data set (210 rows, about 30%). The training data set is input for the NeuralNet function.
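The row counts for a 70/30 split can be derived from the size of the input table. The following is a minimal sketch in plain SQL, not part of the original example and not verified on Aster Database. It assumes the input table is breast_cancer_data, as in the statements in this section; the column aliases total_rows, train_rows, and test_rows are illustrative names only.

```sql
-- Sketch: derive the sizes of a 70/30 split from the input row count.
-- total_rows, train_rows, and test_rows are illustrative aliases.
SELECT COUNT(*)                             AS total_rows,
       ROUND(COUNT(*) * 0.7)                AS train_rows,
       COUNT(*) - ROUND(COUNT(*) * 0.7)     AS test_rows
FROM breast_cancer_data;
```

For an input of 699 rows, 0.7 × 699 = 489.3, which rounds to 489 training rows and leaves 210 testing rows. These are the LIMIT values used in the CREATE TABLE statements in this section.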

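-- Remove any earlier copy of the training table.
DROP TABLE IF EXISTS breast_cancer_train;

-- Take the 489 rows with the lowest samplecode values as the training data set.
CREATE TABLE breast_cancer_train DISTRIBUTE BY hash(samplecode) AS
SELECT * FROM breast_cancer_data ORDER BY samplecode ASC LIMIT 489;

-- Display the training data set.
SELECT * FROM breast_cancer_train ORDER BY samplecode;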
Alternatively, you can do the preceding task with the Sample or RandomSample function.
NeuralNet Example Train Table breast_cancer_train (Columns 1-5)
samplecode  clumpthickness  uniformityofcellsize  uniformityofcellshape  marginaladhesion
61634       5               4                     3                      1
63375       9               1                     2                      6
76389       10              4                     7                      2
95719       6               10                    10                     10
128059      1               1                     1                      1
142932      7               6                     10                     5
144888      8               10                    10                     8
145447      8               4                     4                      1
160296      5               8                     8                      10
167528      4               1                     1                      1
169356      3               1                     1                      1
183913      1               2                     2                      1
...         ...             ...                   ...                    ...
NeuralNet Example Train Table breast_cancer_train (Columns 6-11)
singleepithelialcell  barenuclei  blandchromatin  normalnucleoli  mitoses  class
2                                 2               3               1        2
4                     10          7               7               2        4
2                     8           6               1               1        4
8                     10          7               10              7        4
2                     5           5               1               1        2
3                     10          9               10              2        4
5                     10          7               8               1        4
2                     9           3               3               1        4
5                     10          8               10              3        4
2                     1           3               6               1        2
2                                 3               1               1        2
2                     1           1               1               1        2
...                   ...         ...             ...             ...      ...
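-- Remove any earlier copy of the testing table.
DROP TABLE IF EXISTS breast_cancer_test;

-- Take the 210 rows with the highest samplecode values as the testing data set.
CREATE TABLE breast_cancer_test DISTRIBUTE BY hash(samplecode) AS
SELECT * FROM breast_cancer_data ORDER BY samplecode DESC LIMIT 210;

-- Display the testing data set.
SELECT * FROM breast_cancer_test ORDER BY samplecode;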
NeuralNet Example Test Table breast_cancer_test (Columns 1-5)
samplecode  clumpthickness  uniformityofcellsize  uniformityofcellshape  marginaladhesion
1222936     8               7                     8                      7
1223003     5               3                     3                      1
1223282     1               1                     1                      1
1223306     3               1                     1                      1
1223426     1               1                     1                      1
1223543     1               2                     1                      3
1223793     6               10                    7                      7
1223967     6               1                     3                      1
...         ...             ...                   ...                    ...
NeuralNet Example Test Table breast_cancer_test (Columns 6-11)
singleepithelialcell  barenuclei  blandchromatin  normalnucleoli  mitoses  class
5                     5           5               10              2        4
2                     1           2               1               1        2
2                     1           2               1               1        2
2                     4           1               1               1        2
2                     1           3               1               1        2
2                     1           1               2               1        2
6                     4           8               10              2        4
2                     1           3               1               1        2
...                   ...         ...             ...             ...      ...
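After both tables are created, a few queries can confirm that the split behaves as intended. The following is a sketch in plain SQL, not part of the original example and not verified on Aster Database. It assumes that samplecode uniquely identifies a row in breast_cancer_data; the aliases total_rows, train_rows, test_rows, tr, and te are illustrative names only.

```sql
-- Row counts: the two splits together should account for every input row.
SELECT COUNT(*) AS total_rows FROM breast_cancer_data;
SELECT COUNT(*) AS train_rows FROM breast_cancer_train;
SELECT COUNT(*) AS test_rows  FROM breast_cancer_test;

-- samplecode values that appear in both splits.
-- Assumption: samplecode is unique and the two LIMIT values sum to the
-- input row count; in that case this query returns no rows.
SELECT DISTINCT tr.samplecode
FROM breast_cancer_train tr
INNER JOIN breast_cancer_test te
  ON tr.samplecode = te.samplecode;
```

If samplecode values repeat, rows that tie at the boundary between the two LIMIT clauses could land in both tables or in neither, and the last query would list the affected samplecode values.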