Teradata Package for Python Function Reference on VantageCloud Lake - case - Teradata Package for Python - Look here for syntax, methods and examples for the functions included in the Teradata Package for Python.
Teradata® Package for Python Function Reference on VantageCloud Lake
- Deployment
- VantageCloud
- Edition
- Lake
- Product
- Teradata Package for Python
- Release Number
- 20.00.00.03
- Published
- December 2024
- ft:locale
- en-US
- ft:lastEdition
- 2024-12-19
- dita:id
- TeradataPython_FxRef_Lake_2000
- Product Category
- Teradata Vantage
- teradataml.dataframe.sql_functions.case = case(whens, value=None, else_=None)
- Returns a ColumnExpression based on the CASE expression.
PARAMETERS:
whens:
Required Argument.
Specifies the criteria to be compared against. It accepts two different forms,
based on whether or not the value argument is used.
In the first form, it accepts a list of 2-tuples; each 2-tuple consists of (<sql expression>, <value>),
where the <sql expression> is a boolean expression and “value” is a resulting value.
For example:
case([
(df.first_name == 'wendy', 'W'),
(df.first_name == 'jack', 'J')
])
In the second form, it accepts a Python dictionary of comparison values mapped to a resulting value;
this form requires 'value' argument to be present, and values will be compared using the '==' operator.
For example:
case(
{"wendy": "W", "jack": "J"},
value=df.first_name
)
Types: List of 2-tuples or Dictionary of comparison value mapped to a resulting value.
value:
Optional Argument. Required when 'whens' is of dictionary type.
Specifies a SQL expression (ColumnExpression or literal) which will be used as a fixed “comparison point”
for candidate values within a dictionary passed to the 'whens' argument.
Types: ColumnExpression or SQL Expression (Python literal)
else_:
Optional Argument.
Specifies a SQL expression (ColumnExpression or literal) which will be the evaluated result of
the CASE construct if all expressions within 'whens' evaluate to False.
When omitted, will produce a result of NULL if none of the 'when' expressions evaluate to True.
Types: ColumnExpression or SQL Expression (Python literal)
RETURNS:
ColumnExpression
EXAMPLES:
>>> from teradataml.dataframe.sql_functions import case
>>> load_example_data("GLM", ["admissions_train"])
>>> df = DataFrame("admissions_train")
>>> print(df)
masters gpa stats programming admitted
id
5 no 3.44 Novice Novice 0
3 no 3.70 Novice Beginner 1
1 yes 3.95 Beginner Beginner 0
20 yes 3.90 Advanced Advanced 1
8 no 3.60 Beginner Advanced 1
25 no 3.96 Advanced Advanced 1
18 yes 3.81 Advanced Advanced 1
24 no 1.87 Advanced Novice 1
26 yes 3.57 Advanced Advanced 1
38 yes 2.65 Advanced Beginner 1
>>> print(df.shape)
(40, 6)
>>> # Example showing 'whens' passed a 2-tuple - assign rating based on GPA
>>> # gpa > 3.0 = 'good'
>>> # 2.0 < gpa <= 3.0 = 'average'
>>> # gpa <= 2.0 = 'bad'
>>> # Filtering all the 'good' scores only.
>>> good_df = df[case([(df.gpa > 3.0, 'good'),
(df.gpa > 2.0, 'average')],
else_='bad') == 'good']
>>> print(good_df)
masters gpa stats programming admitted
id
13 no 4.00 Advanced Novice 1
11 no 3.13 Advanced Advanced 1
9 no 3.82 Advanced Advanced 1
26 yes 3.57 Advanced Advanced 1
3 no 3.70 Novice Beginner 1
1 yes 3.95 Beginner Beginner 0
20 yes 3.90 Advanced Advanced 1
18 yes 3.81 Advanced Advanced 1
5 no 3.44 Novice Novice 0
32 yes 3.46 Advanced Beginner 0
>>> print(good_df.shape)
(35, 6)
>>> # Use DataFrame.assign() to create a new column with the rating
>>> whens_df = df.assign(rating = case([(df.gpa > 3.0, 'good'),
(df.gpa > 2.0, 'average')],
else_='bad'))
>>> print(whens_df)
masters gpa stats programming admitted rating
id
5 no 3.44 Novice Novice 0 good
3 no 3.70 Novice Beginner 1 good
1 yes 3.95 Beginner Beginner 0 good
20 yes 3.90 Advanced Advanced 1 good
8 no 3.60 Beginner Advanced 1 good
25 no 3.96 Advanced Advanced 1 good
18 yes 3.81 Advanced Advanced 1 good
24 no 1.87 Advanced Novice 1 bad
26 yes 3.57 Advanced Advanced 1 good
38 yes 2.65 Advanced Beginner 1 average
>>> print(whens_df.shape)
(40, 7)
>>> # Example not specifying 'else_'
>>> no_else = df.assign(rating = case([(df.gpa > 3.0, 'good')]))
>>> print(no_else)
masters gpa stats programming admitted rating
id
5 no 3.44 Novice Novice 0 good
3 no 3.70 Novice Beginner 1 good
1 yes 3.95 Beginner Beginner 0 good
20 yes 3.90 Advanced Advanced 1 good
8 no 3.60 Beginner Advanced 1 good
25 no 3.96 Advanced Advanced 1 good
18 yes 3.81 Advanced Advanced 1 good
24 no 1.87 Advanced Novice 1 None
26 yes 3.57 Advanced Advanced 1 good
38 yes 2.65 Advanced Beginner 1 None
>>> print(no_else.shape)
(40, 7)
>>> # Example showing 'whens' passed a dictionary along with 'value'
>>> whens_value_df = df.assign(admitted_text = case({ 1 : "admitted", 0 : "not admitted"},
value=df.admitted,
else_="don't know"))
>>> print(whens_value_df)
masters gpa stats programming admitted admitted_text
id
13 no 4.00 Advanced Novice 1 admitted
11 no 3.13 Advanced Advanced 1 admitted
9 no 3.82 Advanced Advanced 1 admitted
28 no 3.93 Advanced Advanced 1 admitted
33 no 3.55 Novice Novice 1 admitted
10 no 3.71 Advanced Advanced 1 admitted
16 no 3.70 Advanced Advanced 1 admitted
32 yes 3.46 Advanced Beginner 0 not admitted
34 yes 3.85 Advanced Beginner 0 not admitted
17 no 3.83 Advanced Advanced 1 admitted
>>> print(whens_value_df.shape)
(40, 7)
>>> # Example showing how you can decide on projecting a column based on the value of expression.
>>> # In this example, you end up projecting values from column 'average_rating' if 2.0 < gpa <= 3.0,
>>> # and the values from column 'good_rating' when gpa > 3.0, naming the column 'ga_rating'.
>>> from sqlalchemy.sql import literal_column
>>> whens_new_df = df.assign(good_rating = case([(df.gpa > 3.0, 'good')]))
>>> whens_new_df = whens_new_df.assign(avg_rating = case([((whens_new_df.gpa > 2.0) & (whens_new_df.gpa <= 3.0),
'average')]))
>>> literal_df = whens_new_df.assign(ga_rating = case([(whens_new_df.gpa > 3.0, literal_column('good_rating')),
(whens_new_df.gpa > 2.0, literal_column('avg_rating'))]))
>>> print(literal_df)
masters gpa stats programming admitted good_rating avg_rating ga_rating
id
5 no 3.44 Novice Novice 0 good None good
3 no 3.70 Novice Beginner 1 good None good
1 yes 3.95 Beginner Beginner 0 good None good
20 yes 3.90 Advanced Advanced 1 good None good
8 no 3.60 Beginner Advanced 1 good None good
25 no 3.96 Advanced Advanced 1 good None good
18 yes 3.81 Advanced Advanced 1 good None good
24 no 1.87 Advanced Novice 1 None None None
26 yes 3.57 Advanced Advanced 1 good None good
38 yes 2.65 Advanced Beginner 1 None average average