Teradata® Python Package Function Reference
- Product
- Teradata Python Package
- Release Number
- 16.20
- Published
- February 2020
- Language
- English (United States)
- Last Update
- 2020-07-17
- lifecycle
- previous
- Product Category
- Teradata Vantage
teradataml.dataframe.dataframe.DataFrame.assign = assign(self, drop_columns=False, **kwargs)
DESCRIPTION:
Assigns new columns to a teradataml DataFrame.
PARAMETERS:
drop_columns:
Optional Argument.
If True, drops all columns that are not specified in the assign call.
Default Value: False
Types: bool
kwargs:
Keyword-value pairs.
- Keywords are the new column names.
- Values can be column arithmetic expressions or int, float, or string literals.
RETURNS:
teradataml DataFrame
If drop_columns is False, a new DataFrame with the new columns in addition to
all the existing columns.
If drop_columns is True, a new DataFrame with only the columns in kwargs.
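The two return shapes can be modeled on column names alone. The following is a minimal sketch in plain Python, not teradataml code: `result_columns` is a hypothetical helper, it applies the alphabetical ordering stated in the NOTES, and it assumes the new names do not collide with existing columns, a case this section does not describe.

```python
def result_columns(existing, new, drop_columns=False):
    """Model which column names assign() would return (hypothetical helper).

    existing: column names already in the DataFrame.
    new: column names passed as keywords to assign().
    """
    # New columns are listed alphabetically, per the NOTES section.
    added = sorted(new)
    if drop_columns:
        # Only the assigned columns are returned.
        return added
    # Existing columns are kept, with the new ones appended at the end.
    return list(existing) + added


cols = ["masters", "gpa", "stats", "programming", "admitted"]
print(result_columns(cols, ["c4", "c3"]))
print(result_columns(cols, ["c1"], drop_columns=True))
```

The first call lists the existing columns followed by `c3` and `c4`; the second lists only `c1`.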
NOTES:
- The values in kwargs cannot be callable (functions).
- The original DataFrame is not modified.
- Since ``kwargs`` is a dictionary, the order of your
arguments may not be preserved. To make things predictable,
the columns are inserted in alphabetical order, at the end of
your DataFrame.
- Assigning multiple columns within the same ``assign`` call is
possible, but you cannot reference other columns created within
that same call.
- The maximum number of columns in a DataFrame is 2048.
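Because a column created in an ``assign`` call cannot be referenced within that same call, a dependent column needs a second call on the returned DataFrame. A minimal sketch, assuming `df` is a teradataml DataFrame with numeric `gpa` and `id` columns (as in admissions_train); `add_dependent_columns`, `total`, and `total_plus_one` are illustrative names, not part of the package:

```python
def add_dependent_columns(df):
    """Create a column, then a second column derived from it (illustrative)."""
    # First call: build `total` from columns that already exist on df.
    step1 = df.assign(total=df.gpa + df.id)
    # Second call: `total` now exists on step1, so it can be referenced.
    return step1.assign(total_plus_one=step1.total + 1)
```

Since the original DataFrame is not modified, `df` keeps its columns and each step returns a new DataFrame.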
RAISES:
1. ValueError - When a value in kwargs is callable.
2. ValueError - When a ColumnExpression combines columns from different DataFrames.
3. TeradataMlException - When there is an internal error in the DataFrame or an
argument has an invalid type.
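The first error condition can be illustrated with a small pre-check in plain Python. This is a sketch of the documented rule, not teradataml's own validation; `check_assign_values` is a hypothetical helper:

```python
def check_assign_values(**kwargs):
    """Raise ValueError if any value is callable (hypothetical pre-check)."""
    for name, value in kwargs.items():
        if callable(value):
            raise ValueError(f"Value for column '{name}' must not be callable.")
    return kwargs


check_assign_values(c1=1, c3="string")           # literals pass
try:
    check_assign_values(c1=lambda row: row + 1)  # functions are rejected
except ValueError as err:
    print(err)
```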
EXAMPLES:
>>> load_example_data("dataframe", "admissions_train")
>>> df = DataFrame("admissions_train")
>>> c1 = df.gpa
>>> c2 = df.id
>>>
>>> df.assign(new_column = c1 + c2).sort("id")
    masters   gpa     stats  programming  admitted  new_column
id
1       yes  3.95  Beginner     Beginner         0        4.95
2       yes  3.76  Beginner     Beginner         0        5.76
3        no  3.70    Novice     Beginner         1        6.70
4       yes  3.50  Beginner       Novice         1        7.50
5        no  3.44    Novice       Novice         0        8.44
6       yes  3.50  Beginner     Advanced         1        9.50
7       yes  2.33    Novice       Novice         1        9.33
8        no  3.60  Beginner     Advanced         1       11.60
9        no  3.82  Advanced     Advanced         1       12.82
10       no  3.71  Advanced     Advanced         1       13.71
>>> df.assign(new_column = c1 * c2).sort("id")
    masters   gpa     stats  programming  admitted  new_column
id
1       yes  3.95  Beginner     Beginner         0        3.95
2       yes  3.76  Beginner     Beginner         0        7.52
3        no  3.70    Novice     Beginner         1       11.10
4       yes  3.50  Beginner       Novice         1       14.00
5        no  3.44    Novice       Novice         0       17.20
6       yes  3.50  Beginner     Advanced         1       21.00
7       yes  2.33    Novice       Novice         1       16.31
8        no  3.60  Beginner     Advanced         1       28.80
9        no  3.82  Advanced     Advanced         1       34.38
10       no  3.71  Advanced     Advanced         1       37.10
>>> df.assign(new_column = c2 / c1).sort("id")
    masters   gpa     stats  programming  admitted  new_column
id
1       yes  3.95  Beginner     Beginner         0    0.253165
2       yes  3.76  Beginner     Beginner         0    0.531915
3        no  3.70    Novice     Beginner         1    0.810811
4       yes  3.50  Beginner       Novice         1    1.142857
5        no  3.44    Novice       Novice         0    1.453488
6       yes  3.50  Beginner     Advanced         1    1.714286
7       yes  2.33    Novice       Novice         1    3.004292
8        no  3.60  Beginner     Advanced         1    2.222222
9        no  3.82  Advanced     Advanced         1    2.356021
10       no  3.71  Advanced     Advanced         1    2.695418
>>> df.assign(new_column = c1 - c2).sort("id")
    masters   gpa     stats  programming  admitted  new_column
id
1       yes  3.95  Beginner     Beginner         0        2.95
2       yes  3.76  Beginner     Beginner         0        1.76
3        no  3.70    Novice     Beginner         1        0.70
4       yes  3.50  Beginner       Novice         1       -0.50
5        no  3.44    Novice       Novice         0       -1.56
6       yes  3.50  Beginner     Advanced         1       -2.50
7       yes  2.33    Novice       Novice         1       -4.67
8        no  3.60  Beginner     Advanced         1       -4.40
9        no  3.82  Advanced     Advanced         1       -5.18
10       no  3.71  Advanced     Advanced         1       -6.29
>>> df.assign(new_column = c2 % c1).sort("id")
    masters   gpa     stats  programming  admitted  new_column
id
1       yes  3.95  Beginner     Beginner         0        1.00
2       yes  3.76  Beginner     Beginner         0        2.00
3        no  3.70    Novice     Beginner         1        3.00
4       yes  3.50  Beginner       Novice         1        0.50
5        no  3.44    Novice       Novice         0        1.56
6       yes  3.50  Beginner     Advanced         1        2.50
7       yes  2.33    Novice       Novice         1        0.01
8        no  3.60  Beginner     Advanced         1        0.80
9        no  3.82  Advanced     Advanced         1        1.36
10       no  3.71  Advanced     Advanced         1        2.58
>>>
>>> df.assign(c1 = c2, c2 = c1).sort("id")
    masters   gpa     stats  programming  admitted  c1    c2
id
1       yes  3.95  Beginner     Beginner         0   1  3.95
2       yes  3.76  Beginner     Beginner         0   2  3.76
3        no  3.70    Novice     Beginner         1   3  3.70
4       yes  3.50  Beginner       Novice         1   4  3.50
5        no  3.44    Novice       Novice         0   5  3.44
6       yes  3.50  Beginner     Advanced         1   6  3.50
7       yes  2.33    Novice       Novice         1   7  2.33
8        no  3.60  Beginner     Advanced         1   8  3.60
9        no  3.82  Advanced     Advanced         1   9  3.82
10       no  3.71  Advanced     Advanced         1  10  3.71
>>> df.assign(c3 = c1 + 1, c4 = c2 + 1).sort("id")
    masters   gpa     stats  programming  admitted    c3  c4
id
1       yes  3.95  Beginner     Beginner         0  4.95   2
2       yes  3.76  Beginner     Beginner         0  4.76   3
3        no  3.70    Novice     Beginner         1  4.70   4
4       yes  3.50  Beginner       Novice         1  4.50   5
5        no  3.44    Novice       Novice         0  4.44   6
6       yes  3.50  Beginner     Advanced         1  4.50   7
7       yes  2.33    Novice       Novice         1  3.33   8
8        no  3.60  Beginner     Advanced         1  4.60   9
9        no  3.82  Advanced     Advanced         1  4.82  10
10       no  3.71  Advanced     Advanced         1  4.71  11
>>>
>>> df.assign(c1 = 1).sort("id")
    masters   gpa     stats  programming  admitted  c1
id
1       yes  3.95  Beginner     Beginner         0   1
2       yes  3.76  Beginner     Beginner         0   1
3        no  3.70    Novice     Beginner         1   1
4       yes  3.50  Beginner       Novice         1   1
5        no  3.44    Novice       Novice         0   1
6       yes  3.50  Beginner     Advanced         1   1
7       yes  2.33    Novice       Novice         1   1
8        no  3.60  Beginner     Advanced         1   1
9        no  3.82  Advanced     Advanced         1   1
10       no  3.71  Advanced     Advanced         1   1
>>> df.assign(c3 = 'string').sort("id")
    masters   gpa     stats  programming  admitted      c3
id
1       yes  3.95  Beginner     Beginner         0  string
2       yes  3.76  Beginner     Beginner         0  string
3        no  3.70    Novice     Beginner         1  string
4       yes  3.50  Beginner       Novice         1  string
5        no  3.44    Novice       Novice         0  string
6       yes  3.50  Beginner     Advanced         1  string
7       yes  2.33    Novice       Novice         1  string
8        no  3.60  Beginner     Advanced         1  string
9        no  3.82  Advanced     Advanced         1  string
10       no  3.71  Advanced     Advanced         1  string
>>>
>>> # the + operator is overridden for string columns
... df.assign(concatenated = "Completed? " + df.masters).sort("id")
    masters   gpa     stats  programming  admitted    concatenated
id
1       yes  3.95  Beginner     Beginner         0  Completed? yes
2       yes  3.76  Beginner     Beginner         0  Completed? yes
3        no  3.70    Novice     Beginner         1   Completed? no
4       yes  3.50  Beginner       Novice         1  Completed? yes
5        no  3.44    Novice       Novice         0   Completed? no
6       yes  3.50  Beginner     Advanced         1  Completed? yes
7       yes  2.33    Novice       Novice         1  Completed? yes
8        no  3.60  Beginner     Advanced         1   Completed? no
9        no  3.82  Advanced     Advanced         1   Completed? no
10       no  3.71  Advanced     Advanced         1   Completed? no
>>>
>>> # setting drop_columns to True returns only the assigned expressions
... df.assign(drop_columns = True, c1 = 1)
   c1
0   1
1   1
2   1
3   1
4   1
5   1
6   1
7   1
8   1
9   1
>>>
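The remainder example above (`c2 % c1`, that is, `id % gpa`) can be checked by hand in plain Python. This is only an arithmetic sketch: it assumes the database remainder agrees with Python's `%` for positive operands and that the displayed values are rounded to two decimal places.

```python
# (id, gpa) pairs copied from the admissions_train rows shown in the examples.
rows = [(1, 3.95), (2, 3.76), (3, 3.70), (4, 3.50), (5, 3.44),
        (6, 3.50), (7, 2.33), (8, 3.60), (9, 3.82), (10, 3.71)]

# Remainder of id divided by gpa, rounded to two places like the displayed output.
remainders = [round(row_id % gpa, 2) for row_id, gpa in rows]
print(remainders)
```

For example, row 7 gives 7 - 3 * 2.33 = 0.01, matching the new_column value shown for that row.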