Input
The InputTable has movie ratings from 50 users on approximately 2900 movies, with an average of about 150 ratings for each user. The 10 possible ratings range from 0.5 to 5 in steps of 0.5. A higher number indicates a better rating.
userid | itemid | rating |
---|---|---|
1 | 1 | 5 |
1 | 2 | 3 |
1 | 10 | 3 |
1 | 32 | 4 |
1 | 34 | 4 |
1 | 47 | 3 |
1 | 50 | 4 |
1 | 62 | 4 |
1 | 150 | 4 |
1 | 153 | 3 |
1 | 160 | 3 |
1 | 161 | 4 |
1 | 165 | 4 |
1 | 185 | 3 |
... | ... | ... |
SQL Call
KNNRecommender uses the input data to create three model tables: the weights model ('ml_weights'), the bias model table ('ml_bias') and the optional nearest items or neighbors table ('ml_itemngbrs').
DROP TABLE ml_weights; DROP TABLE ml_bias; DROP TABLE ml_itemngbrs;
SELECT * FROM KNNRecommender ( ON ml_ratings AS InputTable OUT TABLE WeightModelTable (ml_weights) OUT TABLE BiasModelTable (ml_bias) OUT TABLE NearestItemsTable (ml_itemngbrs) USING UserIDColumn ('userid') ItemIDColumn ('itemid') RatingColumn ('rating') K (15) MaxIterNum (20) StopThreshold (0.0002) LearningRate (0.001) ) AS dt;
Output
The rmse value is output for each of the 20 iterations. The null iteration or first row of the table shows the rmse of the default initialized model.
The ml_weights table is not shown because it is in compressed binary format.
iternum rmse ------- ----------------- NULL 0.201160929445398 0 0.200122952666973 1 0.199094556858257 2 0.198024742312269 3 0.196912337420418 4 0.195756186274468 5 0.19455515462393 6 0.193308136349661 7 0.19201406046623 8 0.190671898661581 9 0.189280673377762 10 0.187839466430794 11 0.186347428161151 12 0.184803787098705 13 0.183207860117312 14 0.181559063044462 15 0.179856921680543 16 0.178101083170261 17 0.176291327655719 18 0.174427580126542 19 0.172509922367382
SELECT * FROM ml_itemngbrs ORDER BY 1;
The sij value (similarity between itemi and itemj) in ml_itemngbrs is the default Pearson correlation coefficient.
itemi itemj sij ----- ----- ---------------------- 1 329 0.948683298050514 1 39 0.4472135954999582 1 648 0.7071067811865475 1 225 0.7071067811865475 1 597 0.7071067811865475 1 296 0.7071067811865475 1 300 0.8660254037844385 1 21 0.7071067811865475 1 480 0.7071067811865475 1 265 0.7071067811865475 1 11 0.7071067811865475 1 288 0.9999999999999998 1 165 0.9999999999999998 1 380 0.4472135954999582 1 515 0.7071067811865475 10 161 0.8944271909999159 10 356 0.0 10 160 0.948683298050514 10 380 0.0 10 592 0.0 10 357 0.9999999999999998 10 434 0.0 ... ... ...
SELECT * FROM ml_bias ORDER BY 1, 2;
label id value ----- ----- ------------------ G NULL 3.840092699884125 I 1 4.0 I 2 3.0 I 5 3.0 I 6 4.0 I 10 3.5 I 11 3.5 I 14 4.0 I 16 5.0 I 17 5.0 I 19 3.0 I 21 4.0 I 32 4.5 I 34 4.5 I 36 3.0 I 39 3.3333333333333335 I 44 2.0 I 45 4.0 I 47 3.3333333333333335 I 50 4.5 ... ... ...
Download a zip file of all examples and a SQL script file that creates their input tables from the attachment in the left sidebar.