Teradata® Package for Python Function Reference
- Product: Teradata Package for Python
- Release Number: 17.10
- Published: April 2022
- Language: English (United States)
- Last Update: 2022-08-19
- Lifecycle: previous
- Product Category: Teradata Vantage
teradataml.dataframe.dataframe.DataFrame.select = select(self, select_expression)
DESCRIPTION:
Select required columns from DataFrame using an expression.
Returns a new teradataml DataFrame with selected columns only.
PARAMETERS:
select_expression:
Required Argument.
String or List representing columns to select.
Types: str OR List of Strings (str)
Only the following formats are supported for select_expression:
A] Single Column String: df.select("col1")
B] Single Column List: df.select(["col1"])
C] Multi-Column List: df.select(['col1', 'col2', 'col3'])
D] Multi-Column List of List: df.select([["col1", "col2", "col3"]])
Column names ("col1", "col2", ...) are strings that name columns of the Teradata Vantage table.
All standard Teradata data types are supported for the selected columns, for example INTEGER, VARCHAR(5), and FLOAT.
Note: Selecting the same column more than once, such as df.select(['col1', 'col1']), is not supported.
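The accepted shapes can be summarized with a small pure-Python sketch. The helper name normalize_select_expression is hypothetical and is not part of teradataml; it only illustrates how formats A to D reduce to one flat list of column names, and how the duplicate-column and empty cases could be rejected before calling df.select(). The library's own validation may differ.

```python
def normalize_select_expression(select_expression):
    """Return a flat list of column names for formats A-D.

    Hypothetical helper for illustration; not part of teradataml.
    Raises ValueError for unsupported shapes.
    """
    # Format A: a single column name as a string.
    if isinstance(select_expression, str):
        columns = [select_expression]
    elif isinstance(select_expression, list):
        # Format D: a list wrapping exactly one inner list of names.
        if len(select_expression) == 1 and isinstance(select_expression[0], list):
            columns = list(select_expression[0])
        else:
            # Formats B and C: a flat list of one or more names.
            columns = list(select_expression)
    else:
        raise ValueError("select_expression must be a str or a list of str")

    if not columns:
        raise ValueError("select_expression must not be empty")
    if not all(isinstance(c, str) and c.strip() for c in columns):
        raise ValueError("every column name must be a non-empty string")
    if len(set(columns)) != len(columns):
        raise ValueError("the same column may not be selected twice")
    return columns
```

For instance, normalize_select_expression([["col1", "col2"]]) and normalize_select_expression(["col1", "col2"]) both yield the same flat list, which mirrors the equivalence of formats C and D.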
RETURNS:
teradataml DataFrame
RAISES:
TeradataMlException (TDMLDF_SELECT_INVALID_COLUMN, TDMLDF_SELECT_INVALID_FORMAT,
TDMLDF_SELECT_DF_FAIL, TDMLDF_SELECT_EXPR_UNSPECIFIED,
TDMLDF_SELECT_NONE_OR_EMPTY)
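A caller might guard against these errors as sketched below. The import path teradataml.common.exceptions is an assumption to verify against the installed release, and select_or_none is a hypothetical wrapper name. The fallback to the built-in Exception exists only so the sketch can be run without teradataml installed.

```python
try:
    # Assumed import path; verify against your installed teradataml release.
    from teradataml.common.exceptions import TeradataMlException
except ImportError:
    # Fallback so the sketch runs without teradataml installed.
    TeradataMlException = Exception


def select_or_none(df, select_expression):
    """Return df.select(select_expression), or None if the call is rejected.

    Hypothetical wrapper for illustration; df is any object with a
    select() method, such as a teradataml DataFrame.
    """
    try:
        return df.select(select_expression)
    except TeradataMlException as err:
        # Report the failure and let the caller decide what to do next.
        print(f"select failed: {err}")
        return None
```

Returning None keeps the sketch short; real code may prefer to re-raise or log the exception so that the specific error is not lost.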
EXAMPLES:
>>> load_example_data("dataframe","admissions_train")
>>> df = DataFrame('admissions_train')
>>> df
    masters   gpa     stats  programming  admitted
id
5        no  3.44    Novice       Novice         0
7       yes  2.33    Novice       Novice         1
22      yes  3.46    Novice     Beginner         0
17       no  3.83  Advanced     Advanced         1
13       no  4.00  Advanced       Novice         1
19      yes  1.98  Advanced     Advanced         0
36       no  3.00  Advanced       Novice         0
15      yes  4.00  Advanced     Advanced         1
34      yes  3.85  Advanced     Beginner         0
40      yes  3.95    Novice     Beginner         0
A] Single Column String
>>> df.select("id")
Empty DataFrame
Columns: []
Index: [22, 34, 13, 19, 15, 38, 26, 5, 36, 17]
B] Single Column List
>>> df.select(["id"])
Empty DataFrame
Columns: []
Index: [15, 26, 5, 40, 22, 17, 34, 13, 7, 38]
C] Multi-Column List
>>> df.select(["id", "masters", "gpa"])
    masters   gpa
id
5        no  3.44
36       no  3.00
15      yes  4.00
17       no  3.83
13       no  4.00
40      yes  3.95
7       yes  2.33
22      yes  3.46
34      yes  3.85
19      yes  1.98
D] Multi-Column List of List
>>> df.select([['id', 'masters', 'gpa']])
    masters   gpa
id
5        no  3.44
34      yes  3.85
13       no  4.00
40      yes  3.95
22      yes  3.46
19      yes  1.98
36       no  3.00
15      yes  4.00
7       yes  2.33
17       no  3.83