Dimensional Input Example - Aster Analytics

Teradata AsterĀ® Analytics Foundation User GuideUpdate 2

Product
Aster Analytics
Release Number
7.00.02
Published
September 2017
Language
English (United States)
Last Update
2018-04-17
dita:mapPath
uce1497542673292.ditamap
dita:ditavalPath
AA-notempfilter_pdf_output.ditaval
dita:id
B700-1022
lifecycle
previous
Product Category
Software

This example creates a SQL-MapReduce function, closest_store, for a retailer of mobile phone accessories. The function takes data from purchases made by mobile phone and finds the closest retail store at the time of purchase. This type of result cannot be easily computed using only cogroup capabilities, because the data sets must be related using a proximity join as opposed to an equijoin.

The function inputs are:

  • phone_purchases, which contains, for each mobile phone accessory purchase made from the phone, the normalized spatial coordinates of the phone at the time of the purchase
  • stores, which contains the normalized spatial coordinates of each retail store
Dimensional Example Tables

The function call is:

SELECT pid, sid FROM closest_store (
  ON phone_purchases PARTITION BY ANY,
  ON stores DIMENSION
);

The following figure shows how SQL-MapReduce runs the preceding function call.

How Dimensional Inputs Work in SQL-MapReduce

The closest_store function receives the result of a cogroup operation on the phone_purchases data set and the dimensional input data set stores. Conceptually, the operation has these steps:

  1. Group each input data set as specified by its PARTITION BY clause:
    • Group the phone_purchases input as it is stored in the database (as specified by the PARTITION BY ANY clause).
    • Group the stores input into a single group (as specified by the DIMENSION clause).
  2. Combine the groups, using what is essentially a Cartesian join.

    The result of this cogroup operation is a nested relation. Conceptually, each tuple of the nested relation contains an arbitrary group of phone purchases concatenated with the single group of retail stores.

    The middle box in the preceding figure shows the result of the cogroup operation.

  3. Invoke the closest_store function for each cogroup tuple.

    At each invocation, the function receives a cursor over an arbitrary group of purchases and a cursor over the entire set of stores.

  4. Perform a proximity join, using the normalized spatial coordinates to find the closest store at the time of each purchase.

    The bottom box in the preceding figure shows the function output.