Use the set_index() function to assign an appropriate index to a teradataml DataFrame.
This method does not support operations on array columns.
Required Parameter
- keys
Used to assign one or more existing columns as the new index to a teradataml DataFrame. The argument can be a single column name or a list of column names.
Optional Parameters
- append
- Specifies whether or not to append requested columns to the existing index.
- When append is False, the requested columns replace the existing index.
- When append is True, the existing index is retained and the requested columns are added to it.
Default value: False
- drop
- Specifies whether the columns being set as index are removed from the displayed teradataml DataFrame columns.
- When drop is True, columns are set as index and are not displayed as columns.
- When drop is False, columns are set as index and are also displayed as columns.
When the drop argument is set to True, the column being set as index remains part of the underlying table on which the teradataml DataFrame is based. A column that is dropped while being set as an index is simply no longer displayed as a column of the teradataml DataFrame.
Default value: True
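The interaction between keys, append, and drop described above can be sketched in plain Python. This is a minimal model for illustration only: resolve_index is a hypothetical helper, not part of teradataml, and it tracks just two lists (the index columns and the displayed columns). It assumes nothing about how teradataml implements set_index(), and it does not model what happens to previously hidden index columns when an existing index is replaced, since the parameter descriptions do not say.

```python
def resolve_index(columns, existing_index, keys, append=False, drop=True):
    """Model which columns become the index and which stay displayed.

    Hypothetical helper for illustration; it is not a teradataml API.
    """
    # keys may be a single column name or a list of column names.
    new_keys = [keys] if isinstance(keys, str) else list(keys)

    # append=False: the requested columns replace the existing index.
    # append=True: the existing index is kept and the new keys follow it.
    index = list(existing_index) + new_keys if append else new_keys

    # drop=True: the new index columns are no longer displayed as columns.
    # drop=False: they are set as index and remain displayed as columns.
    displayed = [c for c in columns if not (drop and c in new_keys)]
    return index, displayed


columns = ["id", "masters", "gpa", "stats", "programming", "admitted"]

# Set a two-column index, then append 'gpa' while keeping it displayed.
index, shown = resolve_index(columns, [], ["id", "masters"])
index2, shown2 = resolve_index(shown, index, "gpa", append=True, drop=False)
print(index2)   # ['id', 'masters', 'gpa']
print(shown2)   # ['gpa', 'stats', 'programming', 'admitted']
```

In this model drop only affects the displayed column list, which mirrors the note above that a dropped column is hidden from display rather than removed from the underlying table.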
Example setup
Assume a teradataml DataFrame is created based on the table "df_admissions_train".
>>> df1 = DataFrame.from_table('df_admissions_train')
>>> df1
   id  masters   gpa     stats  programming  admitted
0  26      yes  3.57  advanced     advanced         1
1  34      yes  3.85  advanced     beginner         0
2  40      yes  3.95    novice     beginner         0
3  14      yes  3.45  advanced     advanced         0
4  29      yes   4.0    novice     beginner         0
5   6      yes   3.5  beginner     advanced         1
6  36       no   3.0  advanced       novice         0
7  32      yes  3.46  advanced     beginner         0
8   5       no  3.44    novice       novice         0
Example 1: Assign a single column 'id' as the index
This example assigns the single column 'id' as the index of a teradataml DataFrame that has no index and was created from the 'df_admissions_train' table.
>>> df2 = df1.set_index("id")
>>> df2
    masters   gpa     stats  programming  admitted
id
26      yes  3.57  advanced     advanced         1
34      yes  3.85  advanced     beginner         0
40      yes  3.95    novice     beginner         0
14      yes  3.45  advanced     advanced         0
29      yes   4.0    novice     beginner         0
6       yes   3.5  beginner     advanced         1
36       no   3.0  advanced       novice         0
32      yes  3.46  advanced     beginner         0
5        no  3.44    novice       novice         0
Example 2: Assign a multicolumn index using a list of column names
This example uses a list of column names to assign a multicolumn index.
>>> df3 = df1.set_index(["id", "masters"])
>>> df3
             gpa     stats  programming  admitted
id masters
26 yes      3.57  advanced     advanced         1
34 yes      3.85  advanced     beginner         0
40 yes      3.95    novice     beginner         0
14 yes      3.45  advanced     advanced         0
29 yes       4.0    novice     beginner         0
6  yes       3.5  beginner     advanced         1
36 no        3.0  advanced       novice         0
32 yes      3.46  advanced     beginner         0
5  no       3.44    novice       novice         0
Example 3: Add an additional column as an index
This example adds an additional column as an index to a teradataml DataFrame that already has an index.
>>> df4 = df3.set_index("gpa", append = True, drop = True)
>>> df4
                    stats  programming  admitted
id masters gpa
26 yes     3.57  advanced     advanced         1
34 yes     3.85  advanced     beginner         0
40 yes     3.95    novice     beginner         0
14 yes     3.45  advanced     advanced         0
29 yes     4.0     novice     beginner         0
6  yes     3.5   beginner     advanced         1
36 no      3.0   advanced       novice         0
32 yes     3.46  advanced     beginner         0
5  no      3.44    novice       novice         0
Example 4: Display an assigned index as a column of a DataFrame
This example sets drop to False so that the column 'gpa' is still displayed as a column of the teradataml DataFrame after it is assigned as an index.
>>> df5 = df3.set_index("gpa", append = True, drop = False)
>>> df5
                  gpa     stats  programming  admitted
id masters gpa
26 yes     3.57  3.57  advanced     advanced         1
34 yes     3.85  3.85  advanced     beginner         0
40 yes     3.95  3.95    novice     beginner         0
14 yes     3.45  3.45  advanced     advanced         0
29 yes     4.0    4.0    novice     beginner         0
6  yes     3.5    3.5  beginner     advanced         1
36 no      3.0    3.0  advanced       novice         0
32 yes     3.46  3.46  advanced     beginner         0
5  no      3.44  3.44    novice       novice         0