7.00.02 - Background - Aster Analytics

Teradata Aster® Analytics Foundation User GuideUpdate 2

Product
Aster Analytics
Release Number
7.00.02
Published
September 2017
Content Type
Programming Reference
User Guide
Publication ID
B700-1022-700K
Language
English (United States)
Last Update
2018-04-17

The goal of graph sampling is to identify a subgraph that preserves graph properties as well as possible. If the subgraph is a good representation of the graph, substituting the subgraph for the graph in analytic functions significantly decreases runtime without significantly decreasing accuracy.

Random walk sampling is a graph-sampling technique that randomly selects a starting vertex and then either explores a neighboring vertex or returns ("flies back") to the starting vertex. If the sampling process reaches a sink vertex (an isolated component or a loop), it randomly selects another vertex and continues until it reaches the desired sample size (the desired number of vertices).

The resulting subset graph includes the edges between sampled vertices and their nearest neighbors (edges that exist in the original graph), even if the sampling process did not explore those edges. Including those edges makes the subset graph more representative of the original graph.

For more information about sampling from large graphs, see: http://cs.stanford.edu/people/jure/pubs/sampling-kdd06.pdf