| |
- hashamp(arg_expression)
- DESCRIPTION:
Function finds the primary AMP corresponding to the hash bucket number specified
in the arg_expression and returns the AMP ID. If no hash bucket number is specified,
function returns one less than the maximum number of AMPs in the system.
PARAMETERS:
arg_expression:
Optional Argument.
Specifies an expression that evaluates to a valid hash bucket number.
Supported Types/Values:
* ColumnExpression, i.e., SQLAlchemy Column -
Format for the argument: '<dataframe>.<dataframe_column>.expression'.
* A Literal Value.
* SQLAlchemy TextClause
Specifies a TextClause object that is generated by passing a string
to 'text()' function from SQLAlchemy.
Following are the supported String Formats that can be passed to
text() function:
* Contiguous Map:
Format: MAP = Contiguousmap_name
For example,
text("MAP = " + Contiguousmap_name)
* ColumnExpression with Contiguous Map:
Format: ColumnExpression/literal value MAP = Contiguousmap_name
For example,
text(str(ColumnExpression) + " MAP = " + Contiguousmap_name)
* ColumnExpression with Sparse Map:
Format: ColumnExpression MAP = sparsemap_name COLOCATE USING = colocation_name
For example,
text(str(ColumnExpression) + " MAP = " + sparsemap_name + " COLOCATE USING = " + colocation_name)
where,
MAP - Specifies an object that specifies which AMPs store the rows of a table.
Contiguousmap_name - Specifies the name of the contiguous map, the map that
includes all AMPs within a specified range.
sparsemap_name - Specifies the name of the sparse map, the map that includes a
subset of AMPs from a contiguous map.
COLOCATE USING - Specifies a clause that forces tables that use the same sparse
map to be stored on the same subset of AMPs.
COLOCATE USING is required with a sparse map. It cannot be used
with a contiguous map.
colocation_name - Specifies the colocation name, usually databasename_tablename.
NOTES:
1. Function accepts positional arguments only.
2. If no argument is passed to hashamp(), function returns an INTEGER that is one less than the
maximum number of AMPs in the system.
3. If ColumnExpression is not specified:
a. For a contiguous map: The function returns an INTEGER representing the highest AMP number
in the specified or default contiguous map. For a contiguous map
starting at AMP zero, adding one to the result gives the total number
of AMPs in the contiguous map.
b. For a sparse map: Expression must be specified for a sparse map.
4. If ColumnExpression is specified, the function returns the ID of the primary AMP corresponding
to the hash bucket number specified in ColumnExpression, based on the specified or default map.
EXAMPLES:
# Load the data to run the example.
>>> load_example_data("dataframe", "admissions_train")
>>>
# Create a DataFrame on 'admissions_train' table.
>>> admissions_train = DataFrame("admissions_train")
>>> admissions_train
masters gpa stats programming admitted
id
22 yes 3.46 Novice Beginner 0
36 no 3.00 Advanced Novice 0
15 yes 4.00 Advanced Advanced 1
38 yes 2.65 Advanced Beginner 1
5 no 3.44 Novice Novice 0
17 no 3.83 Advanced Advanced 1
34 yes 3.85 Advanced Beginner 0
13 no 4.00 Advanced Novice 1
26 yes 3.57 Advanced Advanced 1
19 yes 1.98 Advanced Advanced 0
>>>
# Import func from sqlalchemy to execute hashamp function.
>>> from sqlalchemy import func
# Example 1: Executing hashamp() function without argument to return primary AMP number.
# Create a sqlalchemy Function object.
>>> hashamp_func_ = func.hashamp()
>>>
# Pass the Function object as input to DataFrame.assign().
>>> df = admissions_train.assign(hashamp_=hashamp_func_)
>>> print(df)
masters gpa stats programming admitted hashamp_
id
13 no 4.00 Advanced Novice 1 3
26 yes 3.57 Advanced Advanced 1 3
5 no 3.44 Novice Novice 0 3
19 yes 1.98 Advanced Advanced 0 3
15 yes 4.00 Advanced Advanced 1 3
40 yes 3.95 Novice Beginner 0 3
7 yes 2.33 Novice Novice 1 3
22 yes 3.46 Novice Beginner 0 3
36 no 3.00 Advanced Novice 0 3
38 yes 2.65 Advanced Beginner 1 3
>>>
# Example 2: Executing hashamp() function to return primary AMP number corresponding
# to values in "id" column.
# Create a sqlalchemy Function object.
>>> hashamp_func_ = func.hashamp(admissions_train.id.expression)
>>>
# Pass the Function object as input to DataFrame.assign().
>>> df = admissions_train.assign(hashamp_id_=hashamp_func_)
>>> print(df)
masters gpa stats programming admitted hashamp_id_
id
5 no 3.44 Novice Novice 0 0
7 yes 2.33 Novice Novice 1 2
22 yes 3.46 Novice Beginner 0 1
19 yes 1.98 Advanced Advanced 0 2
15 yes 4.00 Advanced Advanced 1 3
17 no 3.83 Advanced Advanced 1 0
34 yes 3.85 Advanced Beginner 0 1
13 no 4.00 Advanced Novice 1 2
36 no 3.00 Advanced Novice 0 3
40 yes 3.95 Novice Beginner 0 1
>>>
# Example 3: Executing hashamp() function to return primary AMP number using
# contiguous map.
# To use contiguous map, we must import 'text' from sqlalchemy.
>>> from sqlalchemy import text
# Create a sqlalchemy Function object.
>>> hashamp_func_ = func.hashamp(text("MAP = TD_Map1"))
>>>
# Pass the Function object as input to DataFrame.assign().
>>> df = admissions_train.assign(hashamp_contiguous_map_=hashamp_func_)
>>> print(df)
masters gpa stats programming admitted hashamp_contiguous_map_
id
13 no 4.00 Advanced Novice 1 3
26 yes 3.57 Advanced Advanced 1 3
5 no 3.44 Novice Novice 0 3
19 yes 1.98 Advanced Advanced 0 3
15 yes 4.00 Advanced Advanced 1 3
40 yes 3.95 Novice Beginner 0 3
7 yes 2.33 Novice Novice 1 3
22 yes 3.46 Novice Beginner 0 3
36 no 3.00 Advanced Novice 0 3
38 yes 2.65 Advanced Beginner 1 3
>>>
# Example 4: Executing hashamp() function to return primary AMP number using
# contiguous map and using values from "id" column.
# To use contiguous map, we must import 'text' from sqlalchemy.
>>> from sqlalchemy import text
# Create a sqlalchemy Function object.
>>> hashamp_func_ = func.hashamp(text(str(admissions_train.id.expression) + " MAP = TD_Map1"))
>>>
# Pass the Function object as input to DataFrame.assign().
>>> df = admissions_train.assign(hashamp_id_col_contiguous_map_=hashamp_func_)
>>> print(df)
masters gpa stats programming admitted hashamp_id_col_contiguous_map_
id
13 no 4.00 Advanced Novice 1 2
26 yes 3.57 Advanced Advanced 1 3
5 no 3.44 Novice Novice 0 0
19 yes 1.98 Advanced Advanced 0 2
15 yes 4.00 Advanced Advanced 1 3
40 yes 3.95 Novice Beginner 0 1
7 yes 2.33 Novice Novice 1 2
22 yes 3.46 Novice Beginner 0 1
36 no 3.00 Advanced Novice 0 3
38 yes 2.65 Advanced Beginner 1 3
>>>
|