index[] Operator | Teradata Python Package - 17.00 - index[] Operator - Teradata Package for Python

Teradata® Package for Python User Guide

Product
Teradata Package for Python
Release Number
17.00
Release Date
November 2021
Content Type
User Guide
Publication ID
B700-4006-070K
Language
English (United States)

Use the index operator ([ ]) to filter rows based on logical expressions given by Python literals and DataFrame columns.

Refer to teradataml DataFrame Column for more details about ColumnExpressions in teradataml.

Examples Prerequisite

>>> df = DataFrame('iris_test')
>>> df
     sepal_length  sepal_width  petal_length  petal_width species
id                                                              
120           6.0          2.2           5.0          1.5       3
30            4.7          3.2           1.6          0.2       1
70            5.6          2.5           3.9          1.1       2
5             5.0          3.6           1.4          0.2       1
60            5.2          2.7           3.9          1.4       2
10            4.9          3.1           1.5          0.1       1
65            5.6          2.9           3.6          1.3       2
20            5.1          3.8           1.5          0.3       1
15            5.8          4.0           1.2          0.2       1
80            5.7          2.6           3.5          1.0       2
>>> s_len = df.sepal_length
>>> p_len = df.petal_length

Example: Inequality operations

The operators supported are <, <=, >, >=.

>>> df[s_len > p_len]
     sepal_length  sepal_width  petal_length  petal_width species
id                                                              
120           6.0          2.2           5.0          1.5       3
30            4.7          3.2           1.6          0.2       1
70            5.6          2.5           3.9          1.1       2
5             5.0          3.6           1.4          0.2       1
60            5.2          2.7           3.9          1.4       2
10            4.9          3.1           1.5          0.1       1
65            5.6          2.9           3.6          1.3       2
20            5.1          3.8           1.5          0.3       1
15            5.8          4.0           1.2          0.2       1
80            5.7          2.6           3.5          1.0       2
>>> df[s_len < p_len + 1]
     sepal_length  sepal_width  petal_length  petal_width species
id                                                              
105           6.5          3.0           5.8          2.2       3
85            5.4          3.0           4.5          1.5       2
115           5.8          2.8           5.1          2.4       3
150           5.9          3.0           5.1          1.8       3
135           6.1          2.6           5.6          1.4       3

Example: Equality operations

The operators supported are == and !=.

>>> df[s_len == p_len + 1]
     sepal_length  sepal_width  petal_length  petal_width species
id                                                              
125           6.7          3.3           5.7          2.1       3
145           6.7          3.3           5.7          2.5       3
120           6.0          2.2           5.0          1.5       3
>>> df[s_len != p_len + 2]
     sepal_length  sepal_width  petal_length  petal_width species
id                                                              
120           6.0          2.2           5.0          1.5       3
20            5.1          3.8           1.5          0.3       1
60            5.2          2.7           3.9          1.4       2
10            4.9          3.1           1.5          0.1       1
30            4.7          3.2           1.6          0.2       1
70            5.6          2.5           3.9          1.1       2
15            5.8          4.0           1.2          0.2       1
5             5.0          3.6           1.4          0.2       1
80            5.7          2.6           3.5          1.0       2
40            5.1          3.4           1.5          0.2       1

Example: Boolean operations

The operators supported are &, |, ~.

Do not use 'and' operator, instead use '&' operator. Using 'and' boolean operator instead of '&' will return incorrect results.

For example, if condition used is as length == 4.0 and width == 3.5, the first condition in above case (length == 4.0), in 'and' condition is completely ignored and data satisfying only second condition (width == 3.5) will be returned.

Do not use 'or' operator, instead use '|' operator. Using 'or' boolean operator instead of '|' will return incorrect results.

For example, if condition used is as length == 4.0 or width == 3.5, the second condition in above case (width == 3.5), in 'or' condition is completely ignored and data satisfying only first condition (length == 4.0) will be returned.

>>> df[~(s_len != p_len + 1)]
     sepal_length  sepal_width  petal_length  petal_width species
id                                                              
125           6.7          3.3           5.7          2.1       3
145           6.7          3.3           5.7          2.5       3
120           6.0          2.2           5.0          1.5       3
>>> df[(s_len > 5.5) & (p_len < 5)]
     sepal_length  sepal_width  petal_length  petal_width species
id                                                              
95            5.6          2.7           4.2          1.3       2
70            5.6          2.5           3.9          1.1       2
100           5.7          2.8           4.1          1.3       2
75            6.4          2.9           4.3          1.3       2
65            5.6          2.9           3.6          1.3       2
15            5.8          4.0           1.2          0.2       1
55            6.5          2.8           4.6          1.5       2
80            5.7          2.6           3.5          1.0       2
>>> df[(s_len >= 6.3) | (p_len == 4.2)]
     sepal_length  sepal_width  petal_length  petal_width species
id                                                              
110           7.2          3.6           6.1          2.5       3
75            6.4          2.9           4.3          1.3       2
130           7.2          3.0           5.8          1.6       3
145           6.7          3.3           5.7          2.5       3
140           6.9          3.1           5.4          2.1       3
125           6.7          3.3           5.7          2.1       3
105           6.5          3.0           5.8          2.2       3
95            5.6          2.7           4.2          1.3       2
55            6.5          2.8           4.6          1.5       2

It is recommended to surround expressions in parentheses in order to avoid ambiguity in operator precedence, similar to pandas.

For example, if in the example above, the expression is taken without the parenthesis, an error is given.
df[2 >= s_len | 4 <= p_len]
 
 
---------------------------------------------------------------------------
ArgumentError                             Traceback (most recent call last)
...
ArgumentError: SQL expression object or string expected, got object of type <class 'int'> instead
...
TeradataMlException: [Teradata][teradataml](TDML_2021) Unable to retrieve information for the DataFrame.

Supported Types and Operators

Python int, float, Decimal, str, and None literals can be used in filtering expressions.

All arithmetic expressions except floor division (//) and power (**) can be used.

All logical operators are supported.