Teradata® Python Package Function Reference
- Product: Teradata Python Package
- Release Number: 16.20
- Published: February 2020
- Language: English (United States)
- Last Update: 2020-07-17
- Lifecycle: previous
- Product Category: Teradata Vantage
teradataml.dataframe.dataframe.DataFrame.var = var(self)
DESCRIPTION:
Returns the column-wise unbiased (sample) variance of the DataFrame.
PARAMETERS:
None
RETURNS:
A teradataml DataFrame holding the result of var(): one var_<column> value for each column of a supported type.
RAISES:
1. TDMLDF_AGGREGATE_FAILED - If the var() operation fails to generate the column-wise variance of the dataframe.
Possible error message:
Unable to perform 'var()' on the dataframe.
2. TDMLDF_AGGREGATE_COMBINED_ERR - If the var() operation does not support any of the selected columns in the dataframe.
Possible error message:
No results. Below is/are the error message(s):
All selected columns [(col2 - PERIOD_TIME), (col3 - BLOB)] is/are unsupported for 'var' operation.
EXAMPLES:
# Import DataFrame and load the data needed to run the examples.
>>> from teradataml.dataframe.dataframe import DataFrame
>>> from teradataml.data.load_example_data import load_example_data
>>> load_example_data("dataframe", ["employee_info", "sales"])
# Example 1 - Apply var() to the 'employee_info' table, whose 'marks' and
#             'dob' columns contain only NULL values. Their variance
#             is reported as None in the resulting dataframe.
# Create a teradataml DataFrame.
>>> df1 = DataFrame("employee_info")
>>> print(df1)
            first_name marks   dob joined_date
employee_no
101              abcde  None  None    02/12/05
100               abcd  None  None        None
112               None  None  None    18/12/05
>>>
# Select only a subset of columns from the DataFrame.
>>> df3 = df1.select(["employee_no", "first_name", "dob", "marks"])
# Print the unbiased variance of each column (with supported data types).
>>> df3.var()
   var_employee_no var_dob var_marks
0        44.333333    None      None
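The var_employee_no value in Example 1 can be cross-checked locally. The following is a minimal sketch that uses Python's standard statistics module rather than teradataml; it assumes that "unbiased" refers to the sample variance with an n - 1 divisor, which is consistent with the output shown above.

```python
import statistics

# employee_no values from the 'employee_info' table in Example 1.
employee_no = [101, 100, 112]

# Sample (unbiased) variance: sum of squared deviations divided by n - 1.
sample_var = statistics.variance(employee_no)

# Population (biased) variance divides by n instead; shown for comparison.
population_var = statistics.pvariance(employee_no)

print(round(sample_var, 6))      # 44.333333
print(round(population_var, 6))  # 29.555556
```

Only the n - 1 form reproduces the 44.333333 reported by var().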
# Example 2 - Apply var() to the 'sales' table, which has columns of
#             different types (floats, integers, strings), some of
#             which contain NULL values. NULL values are ignored.
# Create a teradataml DataFrame.
>>> df1 = DataFrame("sales")
>>> print(df1)
              Feb   Jan   Mar   Apr    datetime
accounts
Blue Inc     90.0    50    95   101  04/01/2017
Orange Inc  210.0  None  None   250  04/01/2017
Red Inc     200.0   150   140  None  04/01/2017
Yellow Inc   90.0  None  None  None  04/01/2017
Jones LLC   200.0   150   140   180  04/01/2017
Alpha Co    210.0   200   215   250  04/01/2017
# Select only a subset of columns from the DataFrame.
>>> df3 = df1.select(["accounts","Feb","Jan","Mar","Apr"])
# Print the unbiased variance of each column (with supported data types).
>>> df3.var()
       var_Feb      var_Jan  var_Mar      var_Apr
0  3546.666667  3958.333333   2475.0  5036.916667
>>>
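The NULL handling in Example 2 can be illustrated with a small local sketch. This is not how teradataml computes the result (the aggregation runs in the database); var_ignoring_nulls is a hypothetical helper written for this illustration, and returning None when fewer than two values remain is an assumption chosen to mirror the None results in Example 1.

```python
import statistics

def var_ignoring_nulls(values):
    """Sample variance of the non-None entries of a column.

    Returns None when fewer than two values remain (an assumption of
    this sketch, not documented teradataml behavior).
    """
    present = [v for v in values if v is not None]
    if len(present) < 2:
        return None
    return float(statistics.variance(present))

# Columns of the 'sales' table, in the row order printed in Example 2.
sales = {
    "Feb": [90.0, 210.0, 200.0, 90.0, 200.0, 210.0],
    "Jan": [50, None, 150, None, 150, 200],
    "Mar": [95, None, 140, None, 140, 215],
    "Apr": [101, 250, None, None, 180, 250],
}

for name, column in sales.items():
    print(f"var_{name}", round(var_ignoring_nulls(column), 6))

# A column holding only NULL values yields None, as in Example 1.
print(var_ignoring_nulls([None, None, None]))
```

Rounded to six decimal places, the four results agree with the var_Feb, var_Jan, var_Mar and var_Apr values printed by df3.var().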