PSALSA Example 1: User Similarity in Social Network without Edge Weight - Teradata Vantage

Machine Learning Engine Analytic Function Reference

Product: Teradata Vantage
Release Number: 8.00, 1.0
Published: May 2019
Language: English (United States)
Last Update: 2019-11-22
Product Category: Teradata Vantage™

This example uses the arguments MaxHubNum and MaxAuthorityNum to limit the output to at most two hub users and two authority users for each source user.

Input

vertices: users_vertex

userid  username
1       John
2       Carla
3       Simon
4       Celine
5       Winston
6       Diana

edges: users_edges

followers  leaders  likes
Carla      Celine   7
Carla      Diana    12
Celine     Diana    4
John       Carla    10
John       Celine   5
John       Diana    6
John       Simon    2
Simon      Diana    1
Winston    Diana    10

The likes column is not used as an edge weight in this example.
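
The input tables could be created and populated with statements along the following lines. This is a minimal sketch: the column data types are assumptions, since the example shows only the table contents.

```sql
-- Vertex table: one row per user (data types are assumed, not from the source)
CREATE TABLE users_vertex (
  userid   INTEGER,
  username VARCHAR(20)
);

INSERT INTO users_vertex VALUES (1, 'John');
INSERT INTO users_vertex VALUES (2, 'Carla');
INSERT INTO users_vertex VALUES (3, 'Simon');
INSERT INTO users_vertex VALUES (4, 'Celine');
INSERT INTO users_vertex VALUES (5, 'Winston');
INSERT INTO users_vertex VALUES (6, 'Diana');

-- Edge table: each row means "followers follows leaders"
-- (likes is present in the data but is not used by this example)
CREATE TABLE users_edges (
  followers VARCHAR(20),
  leaders   VARCHAR(20),
  likes     INTEGER
);

INSERT INTO users_edges VALUES ('Carla',   'Celine', 7);
INSERT INTO users_edges VALUES ('Carla',   'Diana',  12);
INSERT INTO users_edges VALUES ('Celine',  'Diana',  4);
INSERT INTO users_edges VALUES ('John',    'Carla',  10);
INSERT INTO users_edges VALUES ('John',    'Celine', 5);
INSERT INTO users_edges VALUES ('John',    'Diana',  6);
INSERT INTO users_edges VALUES ('John',    'Simon',  2);
INSERT INTO users_edges VALUES ('Simon',   'Diana',  1);
INSERT INTO users_edges VALUES ('Winston', 'Diana',  10);
```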

SQL Call

SELECT * FROM PSALSA(
  ON users_vertex AS vertices PARTITION BY username
  ON users_edges AS edges PARTITION BY followers
  USING
  SourceKey ('followers')
  TargetKey ('leaders')
  MaxHubNum (2)
  MaxAuthorityNum (2)
  TeleportProb (0.15)
  RandomWalkLength (1000)
) AS dt ORDER BY followers;

Output

The output shows that the users John and Simon are similar to Carla. John is the more similar of the two, as he has the higher hub_score (0.354 versus 0.146). Because the scores are estimated from random walks, the output varies with every run.

followers  hub_followers  hub_score          authority_leaders  authority_score
Carla      John           0.354
Carla      Simon          0.146
Carla                                        Simon              0.0898203592814371
Carla                                        Carla              0.0778443113772455
Celine     John           0.314
Celine     Carla          0.19
Celine                                       Celine             0.148
Celine                                       Simon              0.084
John       Carla          0.190291262135922
John       Simon          0.116504854368932
Simon      John           0.318
Simon      Carla          0.18
Simon                                        Celine             0.148
Simon                                        Simon              0.082
Winston    John           0.316
Winston    Carla          0.19
Winston                                      Celine             0.146
Winston                                      Simon              0.092
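
Each output row carries either a hub pair or an authority pair, so it can help to read the two kinds of rows separately. The query below is a sketch under two assumptions that are not stated in the example: that the function output has been saved to a hypothetical table named psalsa_output, and that the blank cells shown above are NULL values.

```sql
-- Hub rows only: the users most similar to each follower, best match first.
-- psalsa_output is a hypothetical table holding the PSALSA result.
SELECT followers, hub_followers, hub_score
FROM psalsa_output
WHERE hub_followers IS NOT NULL
ORDER BY followers, hub_score DESC;
```

Replacing the hub columns with authority_leaders and authority_score (and filtering on authority_leaders IS NOT NULL) would list the authority rows in the same way.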