pSALSA - Aster Analytics

Teradata Aster Analytics Foundation User Guide

Product: Aster Analytics
Release Number: 6.21
Published: November 2016
Language: English (United States)
Last Update: 2018-04-14
dita:id: B700-1021
Product Category: Software

You can use the personalized version of the SALSA algorithm, pSALSA, to evaluate the similarity between vertices on each side of a bipartite graph. While the SALSA algorithm generates a single global score for each vertex, the pSALSA algorithm generates, for each hub vertex v_i, a set of hub scores, h_{v_j}(i), and a set of authority scores, a_{x_k}(i).

A higher hub score indicates that vertex v_j shares more connections with (or is closer to) v_i. A higher authority score indicates that vertex x_k is more important in building the closeness relationship with v_i.

The updating rule for the hub and authority scores personalized on v_j is as follows:



This allows random jumps, with a probability of ε, to the seed vertex at forward steps.
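For orientation, the paper cited later in this topic (Bahmani, Chowdhury, and Goel) writes the personalized SALSA equations in roughly the following form. This is a sketch based on that paper, not a reproduction of this guide's formula. The symbols u (the seed hub vertex), δ (the Kronecker delta), E (the edge set), indeg, and outdeg are notation assumed here.

```latex
\begin{aligned}
h_v &= \epsilon\,\delta_{u,v} + (1-\epsilon) \sum_{x \,:\, (v,x) \in E} \frac{a_x}{\operatorname{indeg}(x)} \\
a_x &= \sum_{v \,:\, (v,x) \in E} \frac{h_v}{\operatorname{outdeg}(v)}
\end{aligned}
```

The term ε δ_{u,v} is what sends probability mass back to the seed vertex; with ε = 0 the equations reduce to a non-personalized walk.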

Personalized SALSA can be used in recommendation applications in which the users are the hub vertices and the products to be recommended are the authority vertices. An edge connects a user vertex and a product vertex if there is a purchase record for that pair.
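The user/product graph described above can be sketched in a few lines of Python. This is only an illustration of the data model, not how the pSALSA function ingests data; the function name build_bipartite_graph and the purchase records are hypothetical.

```python
from collections import defaultdict


def build_bipartite_graph(purchases):
    """Build adjacency maps from (user, product) purchase records.

    Users are hub vertices and products are authority vertices.
    Duplicate records collapse into a single edge.
    """
    hub_to_auth = defaultdict(set)
    auth_to_hub = defaultdict(set)
    for user, product in purchases:
        hub_to_auth[user].add(product)
        auth_to_hub[product].add(user)
    # Sort neighbor lists so the result is deterministic.
    return (
        {u: sorted(ps) for u, ps in hub_to_auth.items()},
        {p: sorted(us) for p, us in auth_to_hub.items()},
    )


# Hypothetical purchase records, invented for illustration.
purchases = [
    ("alice", "book"), ("alice", "lamp"),
    ("bob", "book"),
    ("carol", "lamp"), ("carol", "desk"),
    ("alice", "book"),  # repeated purchase, still one edge
]
hub_to_auth, auth_to_hub = build_bipartite_graph(purchases)
```

Here alice and bob share the authority vertex book, so a walk personalized on alice can reach bob in two steps.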

Using the power iteration method to compute the scores personalized on each hub vertex is more expensive with pSALSA than with SALSA, because the power iteration must run once for each hub vertex.
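To make the cost concrete, the following minimal sketch runs a power iteration for a single seed and then repeats it for every hub vertex. It assumes the update rule from the cited paper (reset to the seed with probability eps at forward steps). The function name, the parameters eps and iters, and the sample graph are hypothetical and are not arguments of the pSALSA function.

```python
def personalized_salsa_power(hub_to_auth, seed, eps=0.2, iters=50):
    """Power iteration for hub/authority scores personalized on one seed.

    Assumes every hub vertex in hub_to_auth has at least one edge.
    """
    # Derive the reverse adjacency (authority -> hubs).
    auth_to_hub = {}
    for v, xs in hub_to_auth.items():
        for x in xs:
            auth_to_hub.setdefault(x, []).append(v)

    hub = {v: (1.0 if v == seed else 0.0) for v in hub_to_auth}
    auth = {x: 0.0 for x in auth_to_hub}
    for _ in range(iters):
        # Forward half-step: each hub splits its score over its out-edges.
        auth = {
            x: sum(hub[v] / len(hub_to_auth[v]) for v in vs)
            for x, vs in auth_to_hub.items()
        }
        # Backward half-step, mixed with a reset to the seed.
        hub = {
            v: (eps if v == seed else 0.0)
            + (1.0 - eps) * sum(auth[x] / len(auth_to_hub[x]) for x in xs)
            for v, xs in hub_to_auth.items()
        }
    return hub, auth


# Hypothetical sample graph: users -> purchased products.
graph = {"alice": ["book", "lamp"], "bob": ["book"], "carol": ["lamp", "desk"]}

hub_scores, auth_scores = personalized_salsa_power(graph, "alice")

# One full power iteration per hub vertex: this loop is the extra cost.
all_scores = {seed: personalized_salsa_power(graph, seed) for seed in graph}
```

With n hub vertices, the loop at the end performs n separate power iterations, whereas the non-personalized algorithm needs only one.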

The pSALSA function implements the personalized SALSA algorithm described in the following paper:

Bahmani, Bahman, Abdur Chowdhury, and Ashish Goel. "Fast incremental and personalized PageRank." Proceedings of the VLDB Endowment 4.3 (2010): 173-184.

This paper solves the problem through Monte Carlo simulation: from each hub vertex h_i, the algorithm starts a random walk of L steps (L is an input parameter) and tracks the path of the random walk. After the walk stops, the algorithm computes the score of a hub vertex h_j with respect to h_i, and of an authority vertex a_k with respect to h_i, as follows:

score_h_ij = (2 * vh_ij) / L

score_a_ik = (2 * va_ik) / L

where vh_ij and va_ik are the numbers of times that the random walk starting from the seed vertex h_i visits h_j as a hub vertex and a_k as an authority vertex, respectively.
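The Monte Carlo scoring above can be sketched as follows. This is a minimal illustration under stated assumptions, not the pSALSA function's implementation: it assumes that one step is one edge traversal, that the walk alternates forward (hub to authority) and backward (authority to hub) steps, and that a reset to the seed happens with probability eps before a forward step. Under those assumptions, L/2 of the L steps land on each side, which is one plausible reading of the factor 2 in the formulas. The function name, the parameters, and the sample graph are hypothetical.

```python
import random
from collections import Counter


def psalsa_monte_carlo(hub_to_auth, seed, walk_length=20000, eps=0.2, rng=None):
    """Estimate scores personalized on `seed` from one random walk.

    walk_length plays the role of L and must be a positive even number.
    """
    if walk_length <= 0 or walk_length % 2:
        raise ValueError("walk_length must be a positive even number")
    rng = rng or random.Random(0)

    # Derive the reverse adjacency (authority -> hubs).
    auth_to_hub = {}
    for v, xs in hub_to_auth.items():
        for x in xs:
            auth_to_hub.setdefault(x, []).append(v)

    hub_visits, auth_visits = Counter(), Counter()
    current = seed
    for _ in range(walk_length // 2):
        # Random jump back to the seed before the forward step.
        if rng.random() < eps:
            current = seed
        hub_visits[current] += 1
        # Forward step: hub -> authority.
        authority = rng.choice(hub_to_auth[current])
        auth_visits[authority] += 1
        # Backward step: authority -> hub.
        current = rng.choice(auth_to_hub[authority])

    # score = 2 * visits / L, as in the formulas above.
    hub_scores = {v: 2 * c / walk_length for v, c in hub_visits.items()}
    auth_scores = {x: 2 * c / walk_length for x, c in auth_visits.items()}
    return hub_scores, auth_scores


# Hypothetical sample graph: users -> purchased products.
graph = {"alice": ["book", "lamp"], "bob": ["book"], "carol": ["lamp", "desk"]}
mc_hub, mc_auth = psalsa_monte_carlo(graph, "alice")
```

Because the scores come from visit counts on a single walk, they are estimates: a longer walk reduces the variance at the cost of more steps per seed vertex.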