SimpleSampleSearch.java - Aster Execution Engine

Teradata Aster® Developer Guide

Product
Aster Execution Engine
Release Number
7.00.02
Published
July 2017
Language
English (United States)
Last Update
2018-04-13
dita:mapPath
xnl1494366523182.ditamap
dita:ditavalPath
Generic_no_ie_no_tempfilter.ditaval
dita:id
ffu1489104705746
lifecycle
previous
Product Category
Software

The SimpleSampleSearch class implements a search algorithm that tries to find a path between a given start vertex and end vertex in a graph; if a path is found, this prints the distance (in kilometers) of that path. The algorithm stops when it finds the shortest path in terms of the number of vertices, not necessarily the shortest in number of kilometers.

For example, if you could get from SFO to New York by either:

SFO -> Denver -> Chicago -> New York (around 5000 KM)

or

SFO -> Paris -> New York (around 15,000 KM)

This algorithm routes you through Paris because it is fewer hops (edges).

Each TransitEdge has a distance (in kilometers) that represents the distance from the source city to the destination city.

Here is the basic algorithm:

In iteration 0, the start vertex sends a message to each of its downstream neighbors (that is, to each vertex that the start vertex has directed edges going to). No other vertex in the graph sends any messages during that zeroeth iteration.

In subsequent iterations, each vertex that receives a message sends a message to each of its "downstream" neighbors, unless the vertex that received the message is the vertex that you are looking for, of course.

Once a vertex has sent all its messages, it issues a local halt and processes no more messages. (If a vertex continued to process messages, an infinite loop might result due to potential cycles in the graph.)

In the example below, because a vertex issues a local halt during the iteration in which it first receives a messages, if there are longer paths to that vertex, those paths are ignored. ("longer paths," in this context, means more edges, not more kilometers.) In other words, you keep iterating until you find the vertex that you are searching for. You finish that iteration, so there could be a tie because there could be multiple paths to the desired vertex that have the same number of edges in the path. If there is a tie, then you report length (in KM) of the path that has the fewest kilometers. However, if there is a path that has still fewer kilometers, but more edges, you will not detect it.

The algorithm stops under the following conditions:

  • The desired starting vertex and desired ending vertex are the same, in which case the function calls globalHalt() immediately and does not do any iterations after the zeroeth.
  • The program finds the desired ending vertex, in which case we call globalHalt().
  • There are no more messages to send—essentially, each possible path from the start to a leaf vertex has been traversed. (A leaf is a vertex that has no outgoing edges.)

In this example, we assume the following graph function invocation:

SELECT * FROM simplesamplesearch(
  ON vertices AS vertices_alias PARTITION BY vertex_ID
  ON edges AS edges_alias PARTITION BY src_ID
  start_vertex(1) end_vertex(9)
 )
 ORDER BY 1
;

The input is partitioned on the vertex key of the vertices table; the initialization method of the graph function is invoked once per vertex.

The start_vertex and end_vertex arguments in the SQL statement provide the vertex keys of the vertex at which the search starts and the target vertex.

The function returns a table with a single row if there is a path from the start vertex to the end vertex in the input graph. That output row has a single Distance attribute (column) that is set to the shortest distance (in kilometers) among the paths that are found.

This example does not search all possible paths, so this distance is not necessarily the shortest possible path.

If the program does not find the desired end_vertex, then it emits 0 rows.