set_index() Method

Teradata® Python Package User Guide

brand: Teradata Vantage
prodname: Teradata Python Package
vrm_release: 16.20
category: User Guide
featnum: B700-4006-098K

Use the set_index() method to assign an index to a teradataml DataFrame.

The keys parameter assigns one or more existing columns as the new index of a teradataml DataFrame. The argument can be a single column name or a list of column names.

The function also takes the following optional parameters:
  • append: Specifies whether to append the requested columns to the existing index, if any.
  • drop: Specifies whether the columns assigned as the index are dropped from the regular columns of the teradataml DataFrame. With drop = True, an index column is no longer displayed as a regular column; with drop = False, it is displayed both as an index and as a column.
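
How keys, append, and drop combine can be modeled in plain Python. The sketch below is not teradataml code: plan_index is a hypothetical helper that only tracks column names, and its defaults (append=False, drop=True) are assumptions chosen to match the examples on this page. It does not model what happens to previous index columns when append is False.

```python
def plan_index(index, columns, keys, append=False, drop=True):
    """Hypothetical model of a set_index()-style call (not teradataml code).

    index:   names of the current index columns (empty list if none)
    columns: names of the columns currently displayed as regular columns
    Returns a tuple (new_index, new_columns).
    """
    # keys may be a single column name or a list of column names.
    new_keys = [keys] if isinstance(keys, str) else list(keys)

    # Every requested key must be an existing regular column.
    missing = [k for k in new_keys if k not in columns]
    if missing:
        raise KeyError(f"columns not found: {missing}")

    # append=True extends the existing index; otherwise only the new keys are used.
    new_index = (list(index) if append else []) + new_keys

    # drop=True stops displaying the key columns as regular columns.
    new_columns = [c for c in columns if not (drop and c in new_keys)]
    return new_index, new_columns


columns = ["id", "masters", "gpa", "stats", "programming", "admitted"]

# Start with a two-column index, then append 'gpa' with and without drop.
index3, columns3 = plan_index([], columns, ["id", "masters"])
print(plan_index(index3, columns3, "gpa", append=True, drop=True))
print(plan_index(index3, columns3, "gpa", append=True, drop=False))
```

In both calls the index becomes id, masters, gpa. With drop=True the column gpa disappears from the regular columns; with drop=False it remains among them.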

Examples Prerequisite

Assume a teradataml DataFrame is created based on the table "df_admissions_train".
>>> from teradataml import DataFrame
>>> df1 = DataFrame.from_table('df_admissions_train')

>>> df1

  id masters gpa  stats    programming admitted
0 26 yes     3.57 advanced advanced    1
1 34 yes     3.85 advanced beginner    0
2 40 yes     3.95 novice   beginner    0
3 14 yes     3.45 advanced advanced    0
4 29 yes     4.0  novice   beginner    0
5 6  yes     3.5  beginner advanced    1
6 36 no      3.0  advanced novice      0
7 32 yes     3.46 advanced beginner    0
8 5  no      3.44 novice   novice      0

Example: Assign a single column 'id' as the index

This example assigns the column 'id' as the index of a teradataml DataFrame that has no index and was created from the 'df_admissions_train' table.
>>> df2 = df1.set_index("id")

>>> df2
 
   masters gpa  stats    programming admitted
id
26 yes     3.57 advanced advanced    1
34 yes     3.85 advanced beginner    0
40 yes     3.95 novice   beginner    0
14 yes     3.45 advanced advanced    0
29 yes     4.0  novice   beginner    0
6  yes     3.5  beginner advanced    1
36 no      3.0  advanced novice      0
32 yes     3.46 advanced beginner    0
5  no      3.44 novice   novice      0

Example: Assign a multicolumn index using a list of column names

This example uses a list of column names to assign a multicolumn index.
>>> df3 = df1.set_index(["id", "masters"])
>>> df3
 
           gpa  stats    programming admitted
id masters
26 yes     3.57 advanced advanced    1
34 yes     3.85 advanced beginner    0
40 yes     3.95 novice   beginner    0
14 yes     3.45 advanced advanced    0
29 yes     4.0  novice   beginner    0
6  yes     3.5  beginner advanced    1
36 no      3.0  advanced novice      0
32 yes     3.46 advanced beginner    0
5  no      3.44 novice   novice      0

Example: Add an additional column as an index

This example adds an additional column as an index to a teradataml DataFrame that already has an index.
>>> df4 = df3.set_index("gpa", append = True, drop = True)
>>> df4
                stats    programming admitted
id masters gpa
26 yes     3.57 advanced advanced    1
34 yes     3.85 advanced beginner    0
40 yes     3.95 novice   beginner    0
14 yes     3.45 advanced advanced    0
29 yes     4.0  novice   beginner    0
6  yes     3.5  beginner advanced    1
36 no      3.0  advanced novice      0
32 yes     3.46 advanced beginner    0
5  no      3.44 novice   novice      0

Example: Display an assigned index as a column of a DataFrame

This example assigns the column 'gpa' as an additional index and, by setting drop = False, keeps it displayed as a column of the teradataml DataFrame.
>>> df5 = df3.set_index("gpa", append = True, drop = False)
>>> df5
                gpa  stats    programming admitted
id masters gpa
26 yes     3.57 3.57 advanced advanced    1
34 yes     3.85 3.85 advanced beginner    0
40 yes     3.95 3.95 novice   beginner    0
14 yes     3.45 3.45 advanced advanced    0
29 yes     4.0  4.0  novice   beginner    0
6  yes     3.5  3.5  beginner advanced    1
36 no      3.0  3.0  advanced novice      0
32 yes     3.46 3.46 advanced beginner    0
5  no      3.44 3.44 novice   novice      0