1.1 - 8.10 - Closeness Background - Teradata Vantage

Teradata Vantage™ - Machine Learning Engine Analytic Function Reference

Product: Teradata Vantage
Release Date: October 2019
Content Type: Programming Reference
Language: English (United States)

Closeness and k-degree scores are fundamental distance-based centrality metrics used in network structure analysis. Both measure the time needed to spread information from a source vertex to a set of target vertices.

The closeness score is classically defined for each vertex v as either the inverse sum or the inverse average of the shortest distances from v to all other reachable vertices u. The classical definition does not apply to disconnected graphs; alternative definitions of closeness have been proposed for them.
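The two classical variants can be illustrated with a minimal sketch. It assumes an unweighted, undirected graph stored as an adjacency dictionary, so that distance is a hop count found by breadth-first search. The names `shortest_distances` and `classical_closeness` and the sample graph are hypothetical and are not part of the Closeness function.

```python
from collections import deque


def shortest_distances(graph, source):
    """Breadth-first search: hop counts from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for u in graph.get(v, ()):
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return dist


def classical_closeness(graph, source):
    """Return (inverse sum, inverse average) of the distances from source
    to all other reachable vertices."""
    dist = shortest_distances(graph, source)
    others = [d for u, d in dist.items() if u != source]
    if not others:
        return 0.0, 0.0  # assumption: a vertex that reaches nothing scores 0
    total = sum(others)
    return 1.0 / total, len(others) / total


# Hypothetical path graph: a - b - c - d
path = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
inv_sum_a, inv_avg_a = classical_closeness(path, "a")  # distances 1, 2, 3
inv_sum_b, inv_avg_b = classical_closeness(path, "b")  # distances 1, 1, 2
```

In this example vertex b scores higher than vertex a under both variants, because b sits nearer the middle of the path.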

The Closeness function applies the classical definition of closeness to connected graphs and an alternative definition to disconnected graphs. The alternative definition adds 0 to the sum for each unreachable target vertex. This is consistent with the classical definition, because the distance to an unreachable vertex is infinite, so its inverse is effectively 0.
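One way to read the alternative definition is as a sum of inverse distances, in which an unreachable target contributes nothing. The sketch below follows that reading; it is an assumption and may not match the exact formula that the function uses. The graph is again assumed to be unweighted and stored as an adjacency dictionary, and the names `bfs_distances` and `alternative_closeness` are hypothetical.

```python
from collections import deque


def bfs_distances(graph, source):
    """Hop counts from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for u in graph.get(v, ()):
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return dist


def alternative_closeness(graph, source, targets):
    """Sum of inverse distances from source to each target (assumed reading)."""
    dist = bfs_distances(graph, source)
    score = 0.0
    for t in targets:
        if t == source:
            continue
        if t in dist:
            score += 1.0 / dist[t]
        # An unreachable target has infinite distance; its inverse is
        # treated as 0, so it adds nothing to the sum.
    return score


# Hypothetical disconnected graph: component a - b - c and component x - y
g = {"a": ["b"], "b": ["a", "c"], "c": ["b"], "x": ["y"], "y": ["x"]}
score_a = alternative_closeness(g, "a", list(g))  # 1/1 + 1/2 + 0 + 0
score_x = alternative_closeness(g, "x", list(g))  # 1/1 + 0 + 0 + 0
```

Under this reading the score stays finite and comparable across components, which is why unreachable vertices can simply be skipped.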

The k-degree score is defined for vertex v as the number of vertices whose distance from v is less than or equal to k.
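A minimal sketch of the k-degree count follows, assuming an unweighted graph stored as an adjacency dictionary. Whether v itself is counted is not stated above; this sketch excludes it, which is an assumption. The name `k_degree` is hypothetical.

```python
from collections import deque


def k_degree(graph, source, k):
    """Count the vertices within k hops of source.
    The source itself is excluded (an assumption)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        if dist[v] == k:
            continue  # do not expand beyond k hops
        for u in graph.get(v, ()):
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return len(dist) - 1


# Hypothetical path graph: a - b - c - d
path = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
within_1_of_a = k_degree(path, "a", 1)
within_2_of_a = k_degree(path, "a", 2)
within_1_of_b = k_degree(path, "b", 1)
```

Stopping the search at depth k avoids exploring the rest of the graph when k is small.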

The Closeness function uses a hybrid distributed all-pairs shortest path (APSP) algorithm to calculate the shortest distance from each specified source vertex to each specified target vertex. It then aggregates these shortest distances into closeness and k-degree scores for each source vertex. The APSP algorithm runs single-node shortest path (SNSP) executions in parallel, in groups of P vertices at a time; limiting the group size enables a trade-off between time and memory usage. The algorithm completes when all N/P groups have completed, where N is the number of vertices in the graph. (For more information, see AllPairsShortestPath (ML Engine).)
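The grouping idea can be sketched on a single machine. This is not the distributed algorithm itself: the searches within a group run sequentially here, the graph is assumed to be unweighted, and the group count is rounded up when N is not a multiple of P. The names `batched_scores`, `inv_sum`, and `k_degree` are hypothetical.

```python
from collections import deque
from math import ceil


def bfs_distances(graph, source):
    """Hop counts from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for u in graph.get(v, ()):
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return dist


def batched_scores(graph, group_size, k):
    """Process source vertices in groups of group_size (P) and aggregate
    each source's distances into scores before starting the next group."""
    vertices = sorted(graph)
    n_groups = ceil(len(vertices) / group_size)
    scores = {}
    for g in range(n_groups):
        group = vertices[g * group_size:(g + 1) * group_size]
        # A distributed engine would run these searches in parallel.
        group_dist = {s: bfs_distances(graph, s) for s in group}
        for s, dist in group_dist.items():
            others = [d for u, d in dist.items() if u != s]
            total = sum(others)
            scores[s] = {
                "inv_sum": 1.0 / total if total else 0.0,
                "k_degree": sum(1 for d in others if d <= k),
            }
        # group_dist is replaced on the next iteration, so at most
        # group_size distance maps are held in memory at once.
    return scores, n_groups


# Hypothetical path graph: a - b - c - d, processed with P = 3
path = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
scores, n_groups = batched_scores(path, group_size=3, k=1)
```

A larger group size holds more distance maps in memory at once but needs fewer rounds; a smaller one does the opposite.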