Types of SQL-MapReduce Inputs

Teradata Aster Analytics Foundation User Guide

Product: Aster Analytics
Release Number: 6.21
Published: November 2016
Language: English (United States)
Last Update: 2018-04-14

dita:id: B700-1021
Product Category: Software

Semantically, a SQL-MapReduce input is either partitioned or dimensional.

Partitioned Input

A partitioned input is divided among the vworkers as specified by its PARTITION BY clause, which takes one of the following forms:

PARTITION BY ANY
The input is partitioned among the vworkers arbitrarily, with no partitioning key. PARTITION BY ANY preserves any existing partitioning of the data for that input. A function can have at most one PARTITION BY ANY input.
PARTITION BY p_attribute_set
The input is sorted and partitioned on the columns specified by p_attribute_set.
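
The difference between the two forms can be modeled in a few lines of Python. This is a minimal conceptual sketch, not Aster code: the vworkers are plain dictionaries, the helper names partition_by_any and partition_by are hypothetical, and the hash-modulo routing is an assumption made only for illustration (the text above does not say how rows are assigned to vworkers).

```python
def partition_by_any(placement):
    """PARTITION BY ANY: every row stays on the vworker that already holds it."""
    return {worker: list(rows) for worker, rows in placement.items()}


def partition_by(placement, key_cols, n_vworkers):
    """PARTITION BY key_cols: rows with equal key values end up together, sorted on the key."""
    result = {worker: [] for worker in range(n_vworkers)}
    for rows in placement.values():
        for row in rows:
            key = tuple(row[col] for col in key_cols)
            # Assumption: hash-modulo routing, chosen only for illustration.
            result[hash(key) % n_vworkers].append(row)
    for rows in result.values():
        rows.sort(key=lambda row: tuple(row[col] for col in key_cols))
    return result


# Hypothetical starting layout: two vworkers, with rows for user 1 on both.
placement = {
    0: [{"user": 2, "page": "b"}, {"user": 1, "page": "a"}],
    1: [{"user": 1, "page": "c"}, {"user": 3, "page": "d"}],
}

unchanged = partition_by_any(placement)                    # layout is preserved
by_user = partition_by(placement, ["user"], n_vworkers=2)  # each user on one vworker
```

In this model, partition_by_any returns the layout as it found it, while partition_by guarantees that all rows for a given user sit on a single vworker in key order.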

A function can have multiple PARTITION BY p_attribute_set inputs. All of these clauses must specify the same number of attributes, and corresponding attributes must be equijoin-compatible: either of the same data type or of data types that can be implicitly cast to match. This casting is partition-safe, meaning that it does not cause data to be redistributed among the vworkers.
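
The rules for multiple partitioned inputs can be illustrated with a small Python model. It is a sketch under stated assumptions rather than Aster code: the cogroup helper, the table names, and the column names are all hypothetical. Python's treatment of 7 and 7.0 as equal keys stands in for an implicit cast between equijoin-compatible types.

```python
from collections import defaultdict


def cogroup(inputs):
    """Group several partitioned inputs by key.

    inputs maps an alias to (rows, key_cols).
    Returns {key: {alias: [rows with that key]}}.
    """
    widths = {len(key_cols) for _, key_cols in inputs.values()}
    if len(widths) != 1:
        raise ValueError("all PARTITION BY clauses must list the same number of attributes")
    groups = defaultdict(lambda: {alias: [] for alias in inputs})
    for alias, (rows, key_cols) in inputs.items():
        for row in rows:
            # Corresponding attributes are matched by position, not by name.
            key = tuple(row[col] for col in key_cols)
            groups[key][alias].append(row)
    return dict(groups)


# Hypothetical inputs: an integer key on one side, a float key on the other.
orders = [{"cust_id": 7, "amount": 10}, {"cust_id": 8, "amount": 5}]
customers = [{"id": 7.0, "name": "Ann"}]

grouped = cogroup({
    "orders": (orders, ["cust_id"]),
    "customers": (customers, ["id"]),
})
```

Here the key 7 collects one row from each input, and the key 8 collects an order with no matching customer row. Passing clauses with different attribute counts raises an error, mirroring the same-number-of-attributes rule.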

Dimensional Input

A dimensional input, identified by the keyword DIMENSION, is copied in its entirety to every vworker. Dimensional inputs must be present on each vworker because, like function arguments, they provide information that the function needs wherever it runs. The most common dimensional inputs are lookup tables and trained models.
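
A short Python model shows why a dimensional input has to be available on every vworker. This is a conceptual sketch, not Aster code: run_on_vworkers, label_rows, and the sample tables are hypothetical names introduced for illustration.

```python
def run_on_vworkers(partitions, dimension_rows, func):
    """Call func once per vworker with that vworker's rows and a full copy of the dimension input."""
    return {worker: func(rows, list(dimension_rows)) for worker, rows in partitions.items()}


def label_rows(rows, lookup_rows):
    """Example function: attach a name from the lookup table to each row."""
    names = {entry["code"]: entry["name"] for entry in lookup_rows}
    return [{**row, "name": names[row["code"]]} for row in rows]


# Hypothetical partitioned input spread over two vworkers.
partitions = {
    0: [{"id": 1, "code": "A"}],
    1: [{"id": 2, "code": "B"}, {"id": 3, "code": "A"}],
}
# Hypothetical lookup table used as the dimensional input.
lookup = [{"code": "A", "name": "alpha"}, {"code": "B", "name": "beta"}]

labeled = run_on_vworkers(partitions, lookup, label_rows)
```

Code "A" appears on both vworkers, so each one needs the complete lookup table to resolve it. Splitting the lookup table between the vworkers would leave some rows without a match.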