cube() Function - DataFrame Manipulation - Teradata Package for Python

Teradata® Package for Python User Guide

Deployment: VantageCloud, VantageCore
Edition: VMware, Enterprise, IntelliFlex
Product: Teradata Package for Python
Release Number: 20.00
Published: March 2025
Last Edition: 2025-11-06
Product Category: Teradata Vantage

Use the cube() function to create a multidimensional cube for a teradataml DataFrame using the specified columns. Running an aggregate on the cube then produces aggregations across every combination of those dimensions.
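To make "every combination of dimensions" concrete, the following is a minimal sketch in plain Python (no Vantage connection needed) that lists the grouping sets a cube over the given columns covers. The helper name cube_grouping_sets is hypothetical and is not part of teradataml; it only illustrates the general CUBE idea that n columns yield 2^n grouping sets, including the empty set that corresponds to the grand total.

```python
from itertools import combinations

def cube_grouping_sets(columns):
    """Return every subset of `columns`, largest first.

    Hypothetical helper for illustration only; it is not a teradataml API.
    """
    sets = []
    for size in range(len(columns), -1, -1):
        sets.extend(combinations(columns, size))
    return sets

grouping_sets = cube_grouping_sets(["masters", "stats"])
for gs in grouping_sets:
    # The empty tuple stands for the grand-total row.
    print(gs if gs else "() -> grand total")
```

In a cube result, a column that is not part of the current grouping set shows up as None (or NaN for numeric columns) in that row.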

Required Parameter

columns
Specifies the names of input teradataml DataFrame columns.

Optional Parameter

include_grouping_columns
Specifies whether to include aggregations on the grouping columns. When set to True, the resultant DataFrame includes aggregations on the columns named in "columns". Otherwise, the resultant DataFrame does not include aggregations on those columns.

Default value: False
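As a mental model of this flag, the sketch below picks which columns end up aggregated by a numeric aggregate such as avg. It is an assumption-based illustration in plain Python, not teradataml's implementation: the helper aggregated_columns is hypothetical, and the list of numeric columns (id, gpa, admitted) is inferred from the example outputs on this page.

```python
def aggregated_columns(numeric_columns, grouping_columns, include_grouping_columns=False):
    """Return the columns a numeric aggregate would be applied to.

    Hypothetical helper for illustration only; it is not a teradataml API.
    """
    return [
        col for col in numeric_columns
        if include_grouping_columns or col not in grouping_columns
    ]

# Assumed numeric columns of admissions_train, inferred from the example outputs.
numeric = ["id", "gpa", "admitted"]
grouping = ["masters", "admitted"]

with_grouping = aggregated_columns(numeric, grouping, include_grouping_columns=True)
without_grouping = aggregated_columns(numeric, grouping, include_grouping_columns=False)
print(with_grouping)
print(without_grouping)
```

This agrees with Examples 2 and 3: avg_admitted appears only when include_grouping_columns is True, while the non-numeric grouping column masters is never averaged.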

Example Setup

The examples use the "admissions_train" dataset.

>>> from teradataml import *
>>> load_example_data("dataframe", "admissions_train")
>>> df = DataFrame("admissions_train")

Example 1: Analyze the data by grouping on the masters and stats dimensions

>>> df1 = df.cube(["masters", "stats"]).sum()
>>> df1
  masters     stats  sum_id  sum_gpa  sum_admitted
0      no  Beginner       8     3.60             1
1    None  Advanced     555    84.21            16
2    None  Beginner      21    18.31             3
3     yes  Beginner      13    14.71             2
4    None      None     820   141.67            26
5     yes  Advanced     366    49.26             7
6      no      None     343    63.96            16
7    None    Novice     244    39.15             7
8      no  Advanced     189    34.95             9
9     yes    Novice      98    13.74             1
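Rows with None in a grouping column are subtotals over that dimension, and the row with None in both columns is the grand total. The following plain-Python check, using only values copied from the output above, is a small sketch confirming that the subtotals add up. The helper names add and close are hypothetical, and combinations not shown in the output are not used.

```python
import math

# (masters, stats) -> (sum_id, sum_gpa, sum_admitted), copied from the Example 1 output.
rows = {
    ("no", "Beginner"): (8, 3.60, 1),
    ("yes", "Beginner"): (13, 14.71, 2),
    (None, "Beginner"): (21, 18.31, 3),
    ("no", "Advanced"): (189, 34.95, 9),
    ("yes", "Advanced"): (366, 49.26, 7),
    (None, "Advanced"): (555, 84.21, 16),
    (None, "Novice"): (244, 39.15, 7),
    (None, None): (820, 141.67, 26),
}

def add(*tuples):
    """Add aggregate tuples element by element."""
    return tuple(sum(values) for values in zip(*tuples))

def close(a, b):
    """Compare aggregate tuples, tolerating floating-point rounding."""
    return all(math.isclose(x, y, abs_tol=1e-6) for x, y in zip(a, b))

# Each per-stats subtotal equals the sum over the masters values.
beginner_ok = close(add(rows[("no", "Beginner")], rows[("yes", "Beginner")]), rows[(None, "Beginner")])
advanced_ok = close(add(rows[("no", "Advanced")], rows[("yes", "Advanced")]), rows[(None, "Advanced")])

# The grand total equals the sum of the per-stats subtotals.
total_ok = close(
    add(rows[(None, "Beginner")], rows[(None, "Advanced")], rows[(None, "Novice")]),
    rows[(None, None)],
)
print(beginner_ok, advanced_ok, total_ok)
```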

Example 2: Find the average of all valid columns by grouping the DataFrame with columns 'masters' and 'admitted'

Include grouping columns in aggregate function 'avg'.

>>> df1 = df.cube(["masters", "admitted"], include_grouping_columns=True).avg()
>>> df1
    masters admitted    avg_id  avg_gpa avg_admitted
0       yes      NaN 21.681818 3.532273     0.454545
1      None      1.0 18.846154 3.533462     1.000000
2       no       NaN 19.055556 3.553333     0.888889
3      yes       0.0 24.083333 3.613333     0.000000
4     None       NaN 20.500000 3.541750     0.650000
5     None       0.0 23.571429 3.557143     0.000000
6      yes       1.0 18.800000 3.435000     1.000000
7       no       1.0 18.875000 3.595000     1.000000
8       no       0.0 20.500000 3.220000     0.000000
>>>
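Unlike sums, averages in subtotal rows are weighted by group size, so they are not the plain mean of the finer-grained rows. The sketch below checks this against the output above. The group sizes (26 admitted, 14 not admitted) are an assumption inferred from this page: sum_admitted is 26 in Example 1 and the overall avg_admitted is 0.65, which implies 40 rows.

```python
import math

# avg_id and avg_gpa per value of admitted (masters is None), from the Example 2 output.
avg_admitted_1 = {"avg_id": 18.846154, "avg_gpa": 3.533462}
avg_admitted_0 = {"avg_id": 23.571429, "avg_gpa": 3.557143}
grand_total = {"avg_id": 20.5, "avg_gpa": 3.541750}

# Assumed group sizes, inferred from the outputs rather than read from the table.
n_admitted_1, n_admitted_0 = 26, 14

def weighted_avg(key):
    """Combine the two group averages, weighting each by its assumed row count."""
    total = avg_admitted_1[key] * n_admitted_1 + avg_admitted_0[key] * n_admitted_0
    return total / (n_admitted_1 + n_admitted_0)

id_matches = math.isclose(weighted_avg("avg_id"), grand_total["avg_id"], abs_tol=1e-5)
gpa_matches = math.isclose(weighted_avg("avg_gpa"), grand_total["avg_gpa"], abs_tol=1e-5)

# The unweighted mean of the two group averages does not reproduce the grand total.
simple_mean_id = (avg_admitted_1["avg_id"] + avg_admitted_0["avg_id"]) / 2
print(id_matches, gpa_matches, round(simple_mean_id, 4))
```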

Example 3: Find the average of all valid columns by grouping the DataFrame with columns 'masters' and 'admitted'

Do not include grouping columns in aggregate function 'avg'.

>>> df1 = df.cube(["masters", "admitted"], include_grouping_columns=False).avg()
>>> df1
   masters admitted    avg_id  avg_gpa
0       no      0.0 20.500000 3.220000
1     None      1.0 18.846154 3.533462
2       no      NaN 19.055556 3.553333
3      yes      0.0 24.083333 3.613333
4     None      NaN 20.500000 3.541750
5     None      0.0 23.571429 3.557143
6      yes      1.0 18.800000 3.435000
7      yes      NaN 21.681818 3.532273
8       no      1.0 18.875000 3.595000
>>>