assign() Method

Teradata® Python Package User Guide

Use the assign() method to assign new column expressions in a DataFrame. A new DataFrame is returned without modifying the existing DataFrame.

assign(self, drop_columns = False, **kwargs)

The expressions are given as key-value pairs, where each key is a column name and each value is a column expression. Values can include arithmetic expressions that involve supported Python literals (see Supported Types and Operators) and columns (ColumnExpression instances) from the DataFrame.

When the drop_columns parameter is True, the resulting DataFrame contains only the columns specified in assign(); all other columns are dropped. The default is False, so columns from the original DataFrame are retained.

Supported Types and Operators

Python int, float, Decimal, str, and None literals can be used in assign expressions. All arithmetic operators except floor division (//) and power (**) are supported.
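As a rough illustration of the literal types listed above, the following sketch checks a value before it is passed to assign(). The names SUPPORTED_LITERAL_TYPES and is_supported_literal are hypothetical helpers, not part of teradataml. The check compares exact types, so it makes no claim about subclasses such as bool, which this guide does not mention.

```python
from decimal import Decimal

# Literal types this guide lists as usable in assign() expressions.
# Hypothetical constant for illustration; not a teradataml name.
SUPPORTED_LITERAL_TYPES = (int, float, Decimal, str, type(None))


def is_supported_literal(value):
    """Return True if value is exactly one of the listed literal types.

    Hypothetical helper. It uses an exact-type comparison, so subclasses
    (for example bool, a subclass of int) are reported as False.
    """
    return type(value) in SUPPORTED_LITERAL_TYPES


if __name__ == "__main__":
    for candidate in (1, 2.5, Decimal("1.5"), "string", None, [1, 2]):
        print(repr(candidate), is_supported_literal(candidate))
```

A check like this only covers plain literals. Column expressions such as df.SepalLength are handled by teradataml itself and are outside the scope of the sketch.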

  • Values in kwargs cannot be callables at this time.
  • Because kwargs is a dictionary, the order of your arguments may not be preserved. To keep the result predictable, new columns are inserted in alphabetical order at the end of the DataFrame.
  • You can assign multiple columns within the same assign() call, but an expression cannot reference another column created in that same call.
  • If no kwargs are given, assign() returns self.
  • The maximum number of columns in a DataFrame is 2048.
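The column rules above can be sketched in plain Python. The function resulting_columns is a hypothetical helper, not part of teradataml, that predicts the column names of the returned DataFrame under the rules stated in this topic. It assumes the new column names do not collide with existing ones, and it uses Python's default string sort as a stand-in for "alphabetical order".

```python
def resulting_columns(existing, drop_columns=False, **kwargs):
    """Predict the column names assign() would produce.

    Hypothetical helper for illustration only. `existing` is the list of
    column names in the original DataFrame; each keyword argument stands
    for one assigned column (the values are ignored here).
    """
    if not kwargs:
        # No expressions given: the original DataFrame is returned as is.
        return list(existing)

    # New columns are placed in alphabetical order, not argument order.
    new_columns = sorted(kwargs)

    if drop_columns:
        # Only the columns named in the call are kept.
        return new_columns

    # Existing columns are retained, with new columns appended at the end.
    return list(existing) + new_columns


if __name__ == "__main__":
    base = ["SepalLength", "PetalLength"]
    print(resulting_columns(base, sum=0, diff=0, prod=0))
    print(resulting_columns(base, drop_columns=True, sum=0, diff=0, prod=0))
```

The sketch models column names only. It does not evaluate expressions or enforce the 2048-column limit.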

Examples Prerequisite

Assume the DataFrame module is imported using the command:

from teradataml import DataFrame

Also assume that a teradataml DataFrame "df" is created from a Teradata table "iris", using the command:

df = DataFrame('iris')

Create aliases for the columns to use in the assign() method:

s_len = df.SepalLength
p_len = df.PetalLength

Example: Add new column expressions to DataFrame

df.select(['SepalLength', 'PetalLength']).\
   assign(sum  = s_len + p_len,
          diff = s_len - p_len,
          prod = s_len * p_len,
          div = s_len / p_len,
          mod = s_len % p_len,
          num_constant = 1,
          str_constant = 'string')

   SepalLength  PetalLength  diff       div  mod num_constant   prod str_constant   sum
0          5.1          1.5   3.6  3.400000  0.6            1   7.65       string   6.6
1          5.7          4.1   1.6  1.390244  1.6            1  23.37       string   9.8
2          5.5          1.3   4.2  4.230769  0.3            1   7.15       string   6.8
3          6.2          4.3   1.9  1.441860  1.9            1  26.66       string  10.5
4          6.4          5.5   0.9  1.163636  0.9            1  35.20       string  11.9
5          6.4          4.3   2.1  1.488372  2.1            1  27.52       string  10.7
6          5.7          1.5   4.2  3.800000  1.2            1   8.55       string   7.2
7          5.2          3.9   1.3  1.333333  1.3            1  20.28       string   9.1
8          7.7          6.7   1.0  1.149254  1.0            1  51.59       string  14.4
9          5.1          1.4   3.7  3.642857  0.9            1   7.14       string   6.5
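The arithmetic in row 0 of the output above can be checked by hand. The sketch below repeats the five expressions on plain Python Decimal values, not on ColumnExpression instances, so it runs without a database connection. The variable names mirror the aliases used in the example, and both operands are positive.

```python
from decimal import Decimal

# Row 0 of the example output: SepalLength = 5.1, PetalLength = 1.5.
# These are plain Decimal values, not teradataml column expressions.
s_len = Decimal("5.1")
p_len = Decimal("1.5")

row = {
    "sum": s_len + p_len,
    "diff": s_len - p_len,
    "prod": s_len * p_len,
    "div": s_len / p_len,
    "mod": s_len % p_len,
}

if __name__ == "__main__":
    # Print in alphabetical order, the order assign() uses for new columns.
    for name in sorted(row):
        print(name, row[name])
```

Decimal is used instead of float so that the results are exact and can be compared directly with the values in the table.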

Example: Keep only the columns specified

df.assign(drop_columns = True,
          sum  = s_len + p_len,
          diff = s_len - p_len,
          prod = s_len * p_len,
          div = s_len / p_len,
          mod = s_len % 2,
          num_constant = 1,
          str_constant = 'string')

   diff       div  mod num_constant   prod str_constant   sum
0   3.6  3.400000  1.1            1   7.65       string   6.6
1   1.6  1.390244  1.7            1  23.37       string   9.8
2   4.2  4.230769  1.5            1   7.15       string   6.8
3   1.9  1.441860  0.2            1  26.66       string  10.5
4   0.9  1.163636  0.4            1  35.20       string  11.9
5   2.1  1.488372  0.4            1  27.52       string  10.7
6   4.2  3.800000  1.7            1   8.55       string   7.2
7   1.3  1.333333  1.2            1  20.28       string   9.1
8   1.0  1.149254  1.7            1  51.59       string  14.4
9   3.7  3.642857  1.1            1   7.14       string   6.5