1.1 - 8.10 - PSALSA - Teradata Vantage

Teradata Vantage™ - Machine Learning Engine Analytic Function Reference

Product: Teradata Vantage
Release Number: 1.1, 8.10
Release Date: October 2019
Content Type: Programming Reference
Publication ID: B700-4003-079K
Language: English (United States)

You can use PSALSA, the personalized version of the SALSA algorithm, to evaluate the similarity between vertices on each side of the bipartite graph. While the SALSA algorithm creates a single global score for each vertex, the PSALSA algorithm creates, for each hub vertex v_i, a set of hub scores, h_{v_j}(i), and a set of authority scores, a_{x_k}(i).

A higher hub score indicates that vertex v_j shares more connections with (or is closer to) v_i. A higher authority score indicates that vertex a_k is more important in building the closeness relationship with v_i.

The updating rule for the hub and authority scores personalizing on v_j is as follows:


[Formula image: formulas for updating the hub and authority scores personalizing on v_j with the PSALSA algorithm (Machine Learning Engine function PSALSA)]

This rule allows a random jump back to the seed vertex, with probability ε, at forward steps.

Personalized SALSA can be used in recommendation applications. The users are the hub vertices, the products to be recommended are the authority vertices, and an edge connects a user vertex to a product vertex if there is a purchase record for that pair.
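As an illustration of the graph described above, the following minimal sketch builds the two adjacency maps of a user-product bipartite graph from purchase records. The function name `build_bipartite_graph`, the `(user, product)` tuple layout, and the sample data are hypothetical and are not the input format of the PSALSA function. The sketch also assumes that repeated purchases of the same product by the same user collapse into a single edge.

```python
from collections import defaultdict


def build_bipartite_graph(purchases):
    """Return (user -> products, product -> users) adjacency maps.

    `purchases` is an iterable of (user, product) pairs (hypothetical layout).
    Duplicate pairs are assumed to produce a single edge.
    """
    user_to_products = defaultdict(set)   # hub side: users
    product_to_users = defaultdict(set)   # authority side: products
    for user, product in purchases:
        user_to_products[user].add(product)
        product_to_users[product].add(user)
    return user_to_products, product_to_users


# Made-up purchase records for illustration only.
purchases = [
    ("alice", "book"),
    ("alice", "lamp"),
    ("bob", "lamp"),
    ("bob", "lamp"),  # repeat purchase, still one edge
]
users, products = build_bipartite_graph(purchases)
```

Here `users` plays the role of the hub side and `products` the authority side; keeping both directions makes the forward (user to product) and backward (product to user) steps of a walk equally cheap to sample.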

Using the power iteration method to get the scores personalized for each hub vertex is more expensive with PSALSA than with SALSA, because the power iteration must run once for each hub vertex.

The PSALSA function implements the personalized SALSA algorithm described in this paper:

Bahmani, Bahman, Abdur Chowdhury, and Ashish Goel. "Fast incremental and personalized PageRank." Proceedings of the VLDB Endowment 4.3 (2010): 173-184.

The paper solves this problem through Monte Carlo simulation: from each hub vertex h_i, the algorithm starts a random walk of L steps (L is an input parameter) and tracks the path of the walk. After the walk stops, the algorithm computes the score of a hub vertex h_j with respect to h_i, and of an authority vertex a_k with respect to h_i, as follows:

score_h_ij = (2 · vh_ij) / L

score_a_ik = (2 · va_ik) / L

where vh_ij and va_ik are the visit counts, that is, the number of times that h_j is visited as a hub vertex and a_k is visited as an authority vertex on the random walk path that starts from the seed vertex h_i.
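The Monte Carlo scoring described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation of the PSALSA function: the name `psalsa_monte_carlo` and its parameters are hypothetical, L is assumed to be even, each forward step is assumed to reset to the seed with probability ε before moving, and a hub is counted as visited each time a forward step departs from it. Under these assumptions a walk of L steps makes L/2 hub visits and L/2 authority visits, so each set of scores sums to 1.

```python
import random
from collections import Counter


def psalsa_monte_carlo(hub_to_auth, seed, walk_length, epsilon=0.15, rng=None):
    """Estimate personalized hub and authority scores for one seed hub.

    hub_to_auth: dict mapping each hub to a non-empty list of authorities.
    walk_length: L, the number of steps (assumed positive and even).
    epsilon: probability of jumping back to the seed at a forward step.
    """
    if walk_length <= 0 or walk_length % 2:
        raise ValueError("walk_length must be a positive even number")
    rng = rng or random.Random()

    # Reverse adjacency: authority -> hubs that link to it.
    auth_to_hub = {}
    for hub, auths in hub_to_auth.items():
        for auth in auths:
            auth_to_hub.setdefault(auth, []).append(hub)

    hub_visits, auth_visits = Counter(), Counter()
    hub = seed
    for _ in range(walk_length // 2):
        # Forward step: optionally jump back to the seed, then follow an edge.
        if rng.random() < epsilon:
            hub = seed
        hub_visits[hub] += 1
        auth = rng.choice(hub_to_auth[hub])
        auth_visits[auth] += 1
        # Backward step: return to a random hub linked to this authority.
        hub = rng.choice(auth_to_hub[auth])

    # score = (2 * visits) / L, as in the formulas above.
    scale = 2 / walk_length
    hub_scores = {h: scale * n for h, n in hub_visits.items()}
    auth_scores = {a: scale * n for a, n in auth_visits.items()}
    return hub_scores, auth_scores


# Made-up graph: users u1..u3 as hubs, products p1..p3 as authorities.
graph = {"u1": ["p1", "p2"], "u2": ["p2", "p3"], "u3": ["p3"]}
hub_scores, auth_scores = psalsa_monte_carlo(
    graph, seed="u1", walk_length=1000, epsilon=0.2, rng=random.Random(0)
)
```

Running the walk once per hub vertex yields a full set of personalized scores. Passing a seeded `random.Random` makes a run reproducible; a longer walk reduces the variance of the estimates at the cost of more steps.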