Teradata Python Package Function Reference - assign

teradataml.dataframe.dataframe.DataFrame.assign = assign(self, drop_columns=False, **kwargs):
    DESCRIPTION:
        Assign new columns to a teradataml DataFrame.

    PARAMETERS:
        drop_columns:
            Optional Argument.
            If True, drop columns that are not specified in assign.
            Default Value: False
            Types: bool

        kwargs: keyword, value pairs
            - keywords are the column names.
            - values can be column arithmetic expressions and int/float/string literals.

    RETURNS:
        teradataml DataFrame
        If drop_columns is False, a new DataFrame with the new columns in addition to
        all the existing columns.
        If drop_columns is True, a new DataFrame with only the columns in kwargs.

    NOTES:
        - The values in kwargs cannot be callable (functions).
        - The original DataFrame is not modified.
        - Since ``kwargs`` is a dictionary, the order of your arguments may not be
          preserved. To make things predictable, the columns are inserted in
          alphabetical order, at the end of your DataFrame.
        - Assigning multiple columns within the same ``assign`` is possible, but you
          cannot reference other columns created within the same ``assign`` call.
        - The maximum number of columns in a DataFrame is 2048.

    RAISES:
        1. ValueError - When a value that is callable is given in kwargs.
        2. ValueError - When columns of different dataframes are given in ColumnExpression.
        3. TeradataMlException - When there is an internal error in DataFrame or
           invalid argument type.
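The rules in NOTES and RAISES can be modelled in a few lines of plain Python. This is only an illustrative sketch, not teradataml's implementation: `plan_assign_columns` is a hypothetical helper that works on lists of column names, and it assumes every keyword names a column that does not already exist.

```python
def plan_assign_columns(existing, drop_columns=False, **kwargs):
    """Hypothetical model of the column layout described in NOTES.

    existing     -- list of column names already in the DataFrame
    drop_columns -- if True, keep only the columns named in kwargs
    kwargs       -- new column name -> value (expression or literal)

    Assumption: every keyword is a new column name; re-assigning an
    existing column is outside the scope of this sketch.
    """
    # Callable values are rejected, mirroring the documented ValueError.
    for name, value in kwargs.items():
        if callable(value):
            raise ValueError(f"value for column '{name}' must not be callable")

    # New columns are placed in alphabetical order, not call order.
    new_columns = sorted(kwargs)

    if drop_columns:
        return new_columns
    # Otherwise the new columns go at the end of the existing ones.
    return list(existing) + new_columns


if __name__ == "__main__":
    columns = ["id", "masters", "gpa", "stats", "programming", "admitted"]
    print(plan_assign_columns(columns, c4=1, c3="string"))
    print(plan_assign_columns(columns, drop_columns=True, c1=1))
```

Passing `c4` before `c3` still yields `c3` ahead of `c4` in the result, which is the point of the alphabetical-order rule: the layout does not depend on argument order.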
    EXAMPLES:
        >>> load_example_data("dataframe", "admissions_train")
        >>> df = DataFrame("admissions_train")
        >>> c1 = df.gpa
        >>> c2 = df.id
        >>>
        >>> df.assign(new_column = c1 + c2).sort("id")
           masters   gpa     stats programming admitted  new_column
        id
        1      yes  3.95  Beginner    Beginner        0        4.95
        2      yes  3.76  Beginner    Beginner        0        5.76
        3       no  3.70    Novice    Beginner        1        6.70
        4      yes  3.50  Beginner      Novice        1        7.50
        5       no  3.44    Novice      Novice        0        8.44
        6      yes  3.50  Beginner    Advanced        1        9.50
        7      yes  2.33    Novice      Novice        1        9.33
        8       no  3.60  Beginner    Advanced        1       11.60
        9       no  3.82  Advanced    Advanced        1       12.82
        10      no  3.71  Advanced    Advanced        1       13.71
        >>> df.assign(new_column = c1 * c2).sort("id")
           masters   gpa     stats programming admitted  new_column
        id
        1      yes  3.95  Beginner    Beginner        0        3.95
        2      yes  3.76  Beginner    Beginner        0        7.52
        3       no  3.70    Novice    Beginner        1       11.10
        4      yes  3.50  Beginner      Novice        1       14.00
        5       no  3.44    Novice      Novice        0       17.20
        6      yes  3.50  Beginner    Advanced        1       21.00
        7      yes  2.33    Novice      Novice        1       16.31
        8       no  3.60  Beginner    Advanced        1       28.80
        9       no  3.82  Advanced    Advanced        1       34.38
        10      no  3.71  Advanced    Advanced        1       37.10
        >>> df.assign(new_column = c2 / c1).sort("id")
           masters   gpa     stats programming admitted  new_column
        id
        1      yes  3.95  Beginner    Beginner        0    0.253165
        2      yes  3.76  Beginner    Beginner        0    0.531915
        3       no  3.70    Novice    Beginner        1    0.810811
        4      yes  3.50  Beginner      Novice        1    1.142857
        5       no  3.44    Novice      Novice        0    1.453488
        6      yes  3.50  Beginner    Advanced        1    1.714286
        7      yes  2.33    Novice      Novice        1    3.004292
        8       no  3.60  Beginner    Advanced        1    2.222222
        9       no  3.82  Advanced    Advanced        1    2.356021
        10      no  3.71  Advanced    Advanced        1    2.695418
        >>> df.assign(new_column = c1 - c2).sort("id")
           masters   gpa     stats programming admitted  new_column
        id
        1      yes  3.95  Beginner    Beginner        0        2.95
        2      yes  3.76  Beginner    Beginner        0        1.76
        3       no  3.70    Novice    Beginner        1        0.70
        4      yes  3.50  Beginner      Novice        1       -0.50
        5       no  3.44    Novice      Novice        0       -1.56
        6      yes  3.50  Beginner    Advanced        1       -2.50
        7      yes  2.33    Novice      Novice        1       -4.67
        8       no  3.60  Beginner    Advanced        1       -4.40
        9       no  3.82  Advanced    Advanced        1       -5.18
        10      no  3.71  Advanced    Advanced        1       -6.29
        >>> df.assign(new_column = c2 % c1).sort("id")
           masters   gpa     stats programming admitted  new_column
        id
        1      yes  3.95  Beginner    Beginner        0        1.00
        2      yes  3.76  Beginner    Beginner        0        2.00
        3       no  3.70    Novice    Beginner        1        3.00
        4      yes  3.50  Beginner      Novice        1        0.50
        5       no  3.44    Novice      Novice        0        1.56
        6      yes  3.50  Beginner    Advanced        1        2.50
        7      yes  2.33    Novice      Novice        1        0.01
        8       no  3.60  Beginner    Advanced        1        0.80
        9       no  3.82  Advanced    Advanced        1        1.36
        10      no  3.71  Advanced    Advanced        1        2.58
        >>>
        >>> df.assign(c1 = c2, c2 = c1).sort("id")
           masters   gpa     stats programming admitted  c1    c2
        id
        1      yes  3.95  Beginner    Beginner        0   1  3.95
        2      yes  3.76  Beginner    Beginner        0   2  3.76
        3       no  3.70    Novice    Beginner        1   3  3.70
        4      yes  3.50  Beginner      Novice        1   4  3.50
        5       no  3.44    Novice      Novice        0   5  3.44
        6      yes  3.50  Beginner    Advanced        1   6  3.50
        7      yes  2.33    Novice      Novice        1   7  2.33
        8       no  3.60  Beginner    Advanced        1   8  3.60
        9       no  3.82  Advanced    Advanced        1   9  3.82
        10      no  3.71  Advanced    Advanced        1  10  3.71
        >>> df.assign(c3 = c1 + 1, c4 = c2 + 1).sort("id")
           masters   gpa     stats programming admitted    c3  c4
        id
        1      yes  3.95  Beginner    Beginner        0  4.95   2
        2      yes  3.76  Beginner    Beginner        0  4.76   3
        3       no  3.70    Novice    Beginner        1  4.70   4
        4      yes  3.50  Beginner      Novice        1  4.50   5
        5       no  3.44    Novice      Novice        0  4.44   6
        6      yes  3.50  Beginner    Advanced        1  4.50   7
        7      yes  2.33    Novice      Novice        1  3.33   8
        8       no  3.60  Beginner    Advanced        1  4.60   9
        9       no  3.82  Advanced    Advanced        1  4.82  10
        10      no  3.71  Advanced    Advanced        1  4.71  11
        >>>
        >>> df.assign(c1 = 1).sort("id")
           masters   gpa     stats programming admitted  c1
        id
        1      yes  3.95  Beginner    Beginner        0   1
        2      yes  3.76  Beginner    Beginner        0   1
        3       no  3.70    Novice    Beginner        1   1
        4      yes  3.50  Beginner      Novice        1   1
        5       no  3.44    Novice      Novice        0   1
        6      yes  3.50  Beginner    Advanced        1   1
        7      yes  2.33    Novice      Novice        1   1
        8       no  3.60  Beginner    Advanced        1   1
        9       no  3.82  Advanced    Advanced        1   1
        10      no  3.71  Advanced    Advanced        1   1
        >>> df.assign(c3 = 'string').sort("id")
           masters   gpa     stats programming admitted      c3
        id
        1      yes  3.95  Beginner    Beginner        0  string
        2      yes  3.76  Beginner    Beginner        0  string
        3       no  3.70    Novice    Beginner        1  string
        4      yes  3.50  Beginner      Novice        1  string
        5       no  3.44    Novice      Novice        0  string
        6      yes  3.50  Beginner    Advanced        1  string
        7      yes  2.33    Novice      Novice        1  string
        8       no  3.60  Beginner    Advanced        1  string
        9       no  3.82  Advanced    Advanced        1  string
        10      no  3.71  Advanced    Advanced        1  string
        >>>
        >>> # + op is overridden for string columns
        ... df.assign(concatenated = "Completed? " + df.masters).sort("id")
           masters   gpa     stats programming admitted    concatenated
        id
        1      yes  3.95  Beginner    Beginner        0  Completed? yes
        2      yes  3.76  Beginner    Beginner        0  Completed? yes
        3       no  3.70    Novice    Beginner        1   Completed? no
        4      yes  3.50  Beginner      Novice        1  Completed? yes
        5       no  3.44    Novice      Novice        0   Completed? no
        6      yes  3.50  Beginner    Advanced        1  Completed? yes
        7      yes  2.33    Novice      Novice        1  Completed? yes
        8       no  3.60  Beginner    Advanced        1   Completed? no
        9       no  3.82  Advanced    Advanced        1   Completed? no
        10      no  3.71  Advanced    Advanced        1   Completed? no
        >>>
        >>> # setting drop_columns to True will only return assigned expressions
        ... df.assign(drop_columns = True, c1 = 1)
           c1
        0   1
        1   1
        2   1
        3   1
        4   1
        5   1
        6   1
        7   1
        8   1
        9   1
        >>>
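Because a column created in one `assign` call cannot be referenced within that same call, a derived column that depends on another new column is usually built by chaining two calls. The following is a minimal sketch under stated assumptions: `add_dependent_columns` is a hypothetical helper, `gpa_plus_id` and `doubled` are illustrative column names, and `df` is assumed to be a teradataml DataFrame with numeric `gpa` and `id` columns, such as `admissions_train`.

```python
def add_dependent_columns(df):
    """Chain two assign calls so the second can use a column from the first.

    Assumption: `df` is a teradataml DataFrame with numeric `gpa` and `id`
    columns. The new column names are illustrative only.
    """
    # First call: build the intermediate column from existing columns only.
    with_sum = df.assign(gpa_plus_id=df.gpa + df.id)

    # Second call: the intermediate column now exists on the returned
    # DataFrame, so it can be referenced like any other column.
    return with_sum.assign(doubled=with_sum.gpa_plus_id * 2)
```

With an active connection, this could be called as `add_dependent_columns(DataFrame("admissions_train")).sort("id")`. The original `df` is left unchanged, since each `assign` returns a new DataFrame.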