1.0 - 8.00 - KNNRecommender Example - Teradata Vantage

Teradata® Vantage Machine Learning Engine Analytic Function Reference

Product
Teradata Vantage
Release Number
1.0
8.00
Release Date
May 2019
Content Type
Programming Reference
Publication ID
B700-4003-098K
Language
English (United States)

Input

The RatingTable has movie ratings from 50 users on approximately 2900 movies, with an average of about 150 ratings for each user. The 10 possible ratings range from 0.5 to 5 in steps of 0.5. A higher number indicates a better rating.

RatingTable: ml_ratings
userid itemid rating
1 1 5
1 2 3
1 10 3
1 32 4
1 34 4
1 47 3
1 50 4
1 62 4
1 150 4
1 153 3
1 160 3
1 161 4
1 165 4
1 185 3
... ... ...

SQL Call

KNNRecommender uses the input data to create three model tables: the weights model ('ml_weights'), the bias model table ('ml_bias') and the optional nearest items or neighbors table ('ml_itemngbrs').

DROP TABLE ml_weights;
DROP TABLE ml_bias;
DROP TABLE ml_itemngbrs;

SELECT * FROM KNNRecommender (
  ON ml_ratings AS RatingTable
  OUT TABLE WeightModelTable (ml_weights)
  OUT TABLE BiasModelTable (ml_bias)
  OUT TABLE NearestItemsTable (ml_itemngbrs)
  USING
  UserIDColumn ('userid')
  ItemIDColumn ('itemid')
  RatingColumn ('rating')
  K (15)
  MaxIterNum (20)
  StopThreshold (0.0002)
  LearningRate (0.001)
) AS dt;

Output

The rmse value is output for each of the 20 iterations. The null iteration or first row of the table shows the rmse of the default initialized model.

The ml_weights table is not shown because it is in compressed binary format.
Output Table
iternum rmse
  0.4825
0 0.4803
1 0.4780
2 0.4757
3 0.4734
4 0.4710
5 0.4686
6 0.4661
7 0.4636
8 0.4611
9 0.4585
10 0.4560
11 0.4534
12 0.4508
13 0.4482
14 0.4455
15 0.4429
16 0.4403
17 0.4376
18 0.4350

This query returns the following table:

SELECT * FROM ml_itemngbrs ORDER BY 1;

The sij value (similarity between itemi and itemj) in ml_itemngbrs is the default Pearson correlation coefficient.

ml_itemngbrs
itemi itemj sij
1 145 1
1 1235 1
1 2867 1
1 1086 1
1 2067 1
1 7143 1
1 97304 1
1 1882 0.999607184150001
1 2723 0.999351517785325
1 1081 0.998812351124897
1 2301 0.998812351124897
1 26614 0.998812351124897
1 7458 0.998330805669821
1 3624 0.997986152770463
1 1094 0.99714268802795
2 160 1
2 63082 1
2 76251 1
2 4874 1
2 6934 1
2 8636 1
2 33004 1
2 112 0.999314833766767
2 1246 0.999282588328635
2 1259 0.998868137724437
2 349 0.998268396969243
2 223 0.998124077136759
2 8644 0.997785157856609
2 4370 0.997458699830735
2 35836 0.997054485501582
... ... ...

This query returns the following table:

SELECT * FROM ml_bias ORDER BY 1, 2;
ml_bias
label id value
G   3.53538298436258
I 1 3.78125
I 2 3
I 3 2
I 5 3.16666666666667
I 6 3.65
I 7 3
I 9 3
I 10 3.65
... ... ...