Example: Cal Housing Data Set
Starting Data (cal_housing_ex_raw)
Only part of the dataset is shown in this example.
id MedInc HouseAge AveRooms AveBedrms Population AveOccup Latitude Longitude MedHouseVal ------- ----- ------ ------- ----- ------- ------- ------- ------ 18760 3.8323 16 5.99782 1.07625 1414 3.08061 40.6 -122.25 1.283 6044 2.1141 27 3.8552 1.0724 1024 4.63348 34.05 -117.74 1.109 14870 1.858 23 3.9012 1.07711 1025 2.46988 32.64 -117.11 0.675 244 2.3906 44 4.86557 1.16393 2269 3.71967 37.78 -122.22 1.117 14365 2.1635 43 4.53333 0.995238 392 1.86667 32.72 -117.23 2.442 12342 2.5885 28 6.26791 1.37239 3470 2.58955 33.84 -116.53 1.59 17768 2.7562 29 4.52964 1.03995 3572 4.60309 37.35 -121.85 1.601 6558 6.827 36 7.02143 1.03571 1897 2.71 34.2 -118.11 3.594 9454 1.2281 25 5.50398 1.15385 991 2.62865 39.77 -123.23 0.603 5328 2.7679 23 3.03868 1.06446 2031 1.63658 34.04 -118.45 2.775 17157 9.7796 20 6.67808 0.917808 324 2.21918 37.43 -122.21 5.00001 18099 5.7528 27 6.43736 1.02733 1259 2.86788 37.32 -122.04 4.314 2313 2.4861 15 5.46792 1.04528 649 2.44906 36.94 -119.7 0.863 11670 4.5 28 5.10212 1.04359 2112 2.63014 33.84 -118.01 2.021 3593 6.6537 32 6.33092 0.995169 1285 3.10386 34.24 -118.48 2.676
Standardize Data Using TD_ScaleFit with Housing Data
SELECT * FROM TD_ScaleFit( ON cal_housing_ex_raw AS InputTable OUT VOLATILE TABLE OutputTable(scaleFitOut_cal_ex) USING TargetColumns('medinc', 'houseage', 'averooms', 'avebedrms', 'population', 'aveoccup', 'latitude', 'longitude') ScaleMethod('STD') ) AS dt2;
Output from TD_ScaleFit
TD_STATTYPE_SCLFIT MedInc HouseAge AveRooms AveBedrms Population AveOccup Latitude Longitude ------- ----- ------ ------- ----- ------ ------- ----- ------ -- min 1.0472 7 2.24752 0.742574 47 1.63658 32.64 -123.35 max 10.7721 52 8.89305 2.56522 4145 5.45536 40.99 -116.01 sum 261.401 2134 349.412 74.0563 91566 200.592 2459.39 -8256.05 count 69 69 69 69 69 69 69 69 null 0 0 0 0 0 0 0 0 avg 3.78842 30.9275 5.06394 1.07328 1327.04 2.90714 35.6433 -119.653 variance 5.40242 142.803 1.55253 0.0458083 861633 0.618171 5.17425 4.73921 std 2.30741 11.8631 1.23694 0.212472 921.491 0.780521 2.25815 2.16114 ustd 2.32431 11.95 1.246 0.214029 928.242 0.786239 2.2747 2.17697 multiplier 1 1 1 1 1 1 1 1 intercept 0 0 0 0 0 0 0 0 location 3.78842 30.9275 5.06394 1.07328 1327.04 2.90714 35.6433 -119.653 scale 2.30741 11.8631 1.23694 0.212472 921.491 0.780521 2.25815 2.16114 globalscale_false nan nan nan nan nan nan nan nan ScaleMethodNumberMapping: [0:mean,1:sum,2:ustd, 3:std,4:range,5:midrange, 6:maxabs,7:rescale] 3 3 3 3 3 3 3 3 missvalue_KEEP nan nan nan nan nan nan nan nan
Standardize Data Using TD_ScaleTransform with Housing Data
CREATE MULTISET TABLE cal_housing_ex_scaled AS ( SELECT * from TD_ScaleTransform( ON cal_housing_ex_raw AS InputTable ON scaleFitOut_cal_ex AS FitTable DIMENSION USING Accumulate('id', 'MedHouseVal') ) AS dt1 ) with data;
Output Using TD_ScaleTransform
id MedHouseVal MedInc HouseAge AveRooms AveBedrms Population AveOccup Latitude Longitude ------- ----- ------ ------- ----- ------ ------- ----- ------------------ --------- ----------- 244 1.117 -0.605796 1.10194 -0.160367 0.426668 1.02221 1.04102 0.946201 -1.18785 670 1.922 -0.00308521 0.427582 -0.129699 -0.530672 -0.761856 -0.21256 0.906345 -1.16008 686 1.578 -0.152084 -0.0781865 -0.625426 -0.513581 -0.685892 -0.533101 0.906345 -1.14157 1754 1.651 -0.0263147 0.596172 0.454207 -0.0272726 0.0683203 -0.0827654 1.01705 -1.23412 2018 0.578 -1.18801 -1.34261 0.0194523 0.182239 0.0607239 1.00044 0.476791 -0.0680666 2313 0.863 -0.564408 -1.34261 0.3266 -0.131765 -0.735812 -0.586889 0.574215 -0.0217947 2833 0.6 -1.05561 -0.0781865 -2.27692 -1.55646 -1.25671 -1.58083 -0.112186 0.292854 3593 2.676 1.24178 0.0904031 1.02428 -0.367626 -0.0456255 0.252048 -0.621452 0.542722 3687 1.741 -0.383816 -1.67979 -0.849458 0.108 0.718354 1.0835 -0.630308 0.593621 4761 1.516 -0.304592 1.52341 -1.08879 -0.83614 -0.715193 -0.842998 -0.710019 0.602876 5202 1 -0.307539 1.10194 -0.379623 -0.570271 -0.141123 0.595366 -0.754303 0.635266 5233 0.955 -0.895906 0.680467 -0.387272 -0.202806 -0.12593 2.13021 -0.754303 0.653775 5300 3.5 -0.9558 -1.00543 -1.54881 -0.130818 2.63047 -0.601956 -0.696734 0.556604 5328 2.775 -0.442279 -0.66825 -1.63731 -0.0414887 0.763932 -1.62783 -0.710019 0.556604 5611 1.587 -0.712366 -0.415366 -1.27572 0.0129604 0.860515 0.76487 -0.820729 0.639893
Example: Using TD_SVM SQL Using Regression Model
CREATE VOLATILE TABLE svm_model_cal_housing AS (
SELECT * FROM TD_SVM (
ON cal_housing_ex_scaled AS InputTable
USING
InputColumns('medinc', 'houseage', 'averooms', 'avebedrms', 'population', 'aveoccup', 'latitude', 'longitude')
ResponseColumn('MedHouseVal')
ModelType('Regression')
BatchSize(10)
MaxIterNum(300)
) AS dt
) WITH DATA
ON COMMIT PRESERVE ROWS;
Output from TD_SVM
attribute predictor estimate value ------- ----- ------ ------- -------------- -17 Kernel LINEAR -16 Sparse FALSE -13 LocalSGD Iterations 0 -12 Nesterov FALSE -11 Momentum 0 -10 Learning Rate (Final) 0.329 -9 Learning Rate (Initial) 0.05 -8 Number of Iterations 134 CONVERGED -7 Alpha 0.15 Elasticnet -6 Regularization 0.02 ENABLED -5 BIC -35.12 -4 AIC -55.23 -3 Number of Observations 69 -2 MSE 0.346 -1 Loss Function EPSILON_INSENSITIVE 0 (Intercept) 2.076 1 MedInc 0.992 2 HouseAge 0.131 3 AveRooms -0.086 4 AveBedrms 0.076 5 Population 0.181 6 AveOccup -0.246 7 Latitude -0.716 8 Longitude -0.55
Example: TD_SVM Classification Credit Data Set
Starting Data (credit_ex_merged)
id A1 A2 A7 A10 A13 A14 A0_b A0_a A3_y A3_u A4_p A4_g A5_k A5_cc A5_d A5_c A5_aa A5_m ------- ----- ------ ------- ----- ------- ------- ------- ----- ----- 61 0.218228 2.17724 0.142986 1.25309 0.238997 -0.130455 1 0 0 1 0 1 0 0 1 0 0 0 297 -0.453325 -0.0247979 -0.404925 -0.77231 0.886307 -0.225685 0 1 0 1 0 1 0 0 0 0 0 0 631 -1.13175 -0.77234 -0.199458 -0.547266 -0.761392 -0.225817 0 1 0 1 0 1 0 0 0 1 0 0 122 -0.657541 1.20223 0.00600835 0.802998 -0.643699 -0.120063 1 0 0 1 0 1 0 0 0 0 0 0 202 0.98698 -0.0719913 1.10183 2.82839 -0.54366 -0.0673184 1 0 0 1 0 1 0 0 0 1 0 0 602 -0.976627 0.935114 -0.462456 -0.77231 -0.761392 -0.180437 1 0 0 1 0 1 0 0 0 0 1 0 543 -0.73903 -0.77234 -0.667923 -0.547266 3.71093 -0.218714 1 0 1 0 1 0 0 0 0 0 0 0 160 1.11756 1.01345 -0.199458 -0.77231 -1.06151 -0.11638 0 1 0 1 0 1 0 0 0 0 0 0 646 0.684585 -0.732698 -0.518617 -0.77231 0.18015 -0.22608 1 0 1 0 1 0 0 0 1 0 0 0 121 1.78911 -0.402344 0.00600835 1.92822 1.18054 0.103676 1 0 0 1 0 1 0 0 0 0 0 1 127 1.76457 0.635908 1.10183 0.577954 0.974576 6.4949 0 1 0 1 0 1 0 0 0 0 0 0 425 -1.1563 -0.874278 -0.678881 -0.77231 -0.726084 -0.226474 0 1 1 0 1 0 0 0 0 0 0 0 314 0.528478 -0.402344 0.142986 -0.77231 -0.0552354 -0.226474 1 0 1 0 1 0 0 0 0 0 0 0 466 -0.518125 -0.709101 0.588164 -0.322222 1.82785 0.0365933 1 0 0 1 0 1 0 0 0 0 0 0 588 -0.665395 -0.84313 -0.0624806 -0.77231 -1.23216 -0.226474 1 0 0 1 0 1 0 0 0 1 0 0 15 0.70913 -0.0408437 -0.610392 1.47813 0.650921 -0.226474 1 0 1 0 1 0 1 0 0 0 0 0 632 0.0866665 -0.779891 -0.655595 -0.77231 0.250766 -0.226474 1 0 0 1 0 1 0 0 1 0 0 0 464 -1.04241 -0.20602 -0.678881 -0.77231 -0.172928 -0.226343 1 0 1 0 1 0 0 0 0 0 0 0 618 -0.576051 -0.717596 -0.565189 -0.547266 -0.0552354 -0.225027 0 1 0 1 0 1 0 0 0 0 0 0 442 -0.18333 -0.307958 0.0854555 -0.547266 0.415536 -0.225159 0 1 0 1 0 1 0 0 0 0 0 1 344 -0.625141 -0.402344 -0.655595 -0.77231 -0.643699 0.32702 1 0 0 1 0 1 0 0 0 0 0 0 441 -0.632996 -0.527879 -0.678881 -0.547266 -0.0552354 -0.219503 0 1 0 1 0 1 0 0 0 0 0 0 164 0.79062 -0.661908 -0.267947 -0.77231 1.30412 -0.226474 1 0 1 0 1 0 0 0 1 0 0 0 440 0.700294 -0.819533 -0.678881 1.47813 -0.0552354 -0.224107 0 1 0 1 0 1 0 0 0 0 0 0 241 1.06062 3.18435 4.80023 1.70317 -1.23216 -0.0686337 1 0 0 1 0 1 0 0 0 0 0 0 517 -0.576051 -0.787442 0.0402528 0.577954 -0.00815825 -0.180832 1 0 1 0 1 0 0 0 0 0 0 0 490 -1.3124 -0.732698 -0.199458 0.35291 0.83923 -0.135716 1 0 0 1 0 1 0 0 0 1 0 0 358 -0.0763129 1.86293 -0.644636 -0.77231 -1.23216 -0.188856 1 0 0 1 0 1 0 0 1 0 0 0 96 0.569714 2.24048 0.416942 -0.77231 -0.761392 -0.226474 1 0 1 0 1 0 0 0 0 1 0 0 368 -0.674231 -0.591118 -0.530945 -0.77231 -0.526006 -0.217661 1 0 1 0 1 0 0 0 0 0 1 0 47 0.242773 -0.0323489 0.975811 -0.0971781 0.59796 -0.187014 1 0 0 1 0 1 0 1 0 0 0 0 134 -0.248129 1.67416 0.690897 -0.322222 -1.23216 0.431195 0 1 0 1 0 1 0 0 0 0 0 0 125 3.76058 0.163975 2.88254 -0.547266 -1.23216 -0.226474 1 0 0 1 0 1 0 0 0 0 0 0 561 0.0709576 0.352749 0.177231 1.02804 0.709768 -0.0686337 1 0 0 1 0 1 0 1 0 0 0 0 552 1.13425 -0.866727 -0.667923 -0.547266 2.06323 -0.226474 1 0 0 1 0 1 0 0 0 0 0 0 128 0.667894 0.0695887 0.00600835 0.577954 -1.23216 -0.178201 1 0 0 1 0 1 0 0 0 1 0 0 590 -0.772412 -0.725147 -0.4967 -0.77231 -0.290621 -0.226474 1 0 1 0 1 0 0 1 0 0 0 0 201 0.561859 -0.402344 0.55392 0.802998 -0.349467 -0.0594264 1 0 0 1 0 1 1 0 0 0 0 0 117 -0.46118 1.48539 -0.267947 1.92822 -0.526006 -0.151895 0 1 0 1 0 1 0 0 0 0 1 0 239 -0.959936 -0.834635 -0.599434 1.70317 -0.761392 -0.213452 0 1 0 1 0 1 0 0 0 0 0 0 279 -0.395399 -0.77234 -0.633678 -0.547266 0.368459 -0.168073 1 0 0 1 0 1 0 0 0 0 0 0 645 -1.21422 -0.253214 -0.587106 -0.77231 -0.408314 -0.226211 1 0 0 1 0 1 0 0 0 0 0 0 267 -1.29571 -0.716652 -0.655595 -0.77231 -0.0552354 -0.226474 1 0 0 1 0 1 0 0 0 0 0 1 124 -0.100858 -0.213571 -0.450128 -0.77231 0.415536 -0.226474 0 1 0 1 0 1 0 0 0 0 0 0
TD_SVM SQL Call
SELECT * FROM TD_SVM (
ON credit_ex_merged AS InputTable
USING
InputColumns('[1:39]')
ResponseColumn('Outcome')
ModelType('Classification')
BatchSize(10)
MaxIterNum(300)
) AS dt;
TD_SVM Output
attribute predictor estimate value ------- ----- ------ ------- --------- --------- -17 Kernel LINEAR -16 Sparse FALSE -13 LocalSGD Iterations 0 -12 Nesterov FALSE -11 Momentum 0 -10 Learning Rate (Final) 0.30711962 -9 Learning Rate (Initial) 0.05 -8 Number of Iterations 145 CONVERGED -7 Alpha 0.15 Elasticnet -6 Regularization 0.02 ENABLED -5 BIC 58.2305273 -4 AIC -13.137058 -3 Number of Observations 44 -2 MSE 0.12042221 -1 Loss Function HINGE 0 (Intercept) 0.14367754 1 A1 0.55515426 2 A2 -0.0133669 3 A7 0.66937255 4 A10 0.55171149 5 A13 0.67972232 6 A14 0.42510543 7 A0_b 0 8 A0_a 0 9 A3_y -0.2740879 10 A3_u 0.30813429 11 A4_p -0.2740879 12 A4_g 0.30813429 13 A5_k 0.17839376 14 A5_cc 0 15 A5_d 0 16 A5_c 0.39148682 17 A5_aa 0.09399927 18 A5_m -0.6514652 19 A5_q 0.6736081 20 A5_w -0.5230071 21 A5_e 0.07129889 22 A5_ff -0.5529271 23 A5_j -0.2334109 24 A5_x 0.11910849 25 A5_i 0 26 A6_v -0.5137543 27 A6_h 0 28 A6_bb 0.06200932 29 A6_z 0 30 A6_ff -0.5650936 31 A6_j 0.90342215 32 A8_t 0.6748785 33 A8_f -0.6408321 34 A9_t -0.1446379 35 A9_f 0.18255114 36 A11_t 0.26702837 37 A11_f -0.232246 38 A12_g -0.3717838 39 A12_s 0.40738628