index[] Operator - Teradata Python Package

Teradata® Python Package User Guide

Product
Teradata Python Package
Release Number
16.20
Published
February 2020
Language
English (United States)
Last Update
2020-02-29
dita:mapPath
rkb1531260709148.ditamap
dita:ditavalPath
Generic_no_ie_no_tempfilter.ditaval
dita:id
B700-4006
lifecycle
previous
Product Category
Teradata Vantage

Use the index operator ([ ]) to filter rows based on logical expressions given by Python literals and DataFrame columns.

Refer to teradataml DataFrame Column for more details about ColumnExpressions in teradataml.

Examples Prerequisite

>>> df = DataFrame('iris_test')
>>> df
     sepal_length  sepal_width  petal_length  petal_width species
id                                                              
120           6.0          2.2           5.0          1.5       3
30            4.7          3.2           1.6          0.2       1
70            5.6          2.5           3.9          1.1       2
5             5.0          3.6           1.4          0.2       1
60            5.2          2.7           3.9          1.4       2
10            4.9          3.1           1.5          0.1       1
65            5.6          2.9           3.6          1.3       2
20            5.1          3.8           1.5          0.3       1
15            5.8          4.0           1.2          0.2       1
80            5.7          2.6           3.5          1.0       2
>>> s_len = df.sepal_length
>>> p_len = df.petal_length

Example: Inequality operations

The operators supported are <, <=, >, >=.

>>> df[s_len > p_len]
     sepal_length  sepal_width  petal_length  petal_width species
id                                                              
120           6.0          2.2           5.0          1.5       3
30            4.7          3.2           1.6          0.2       1
70            5.6          2.5           3.9          1.1       2
5             5.0          3.6           1.4          0.2       1
60            5.2          2.7           3.9          1.4       2
10            4.9          3.1           1.5          0.1       1
65            5.6          2.9           3.6          1.3       2
20            5.1          3.8           1.5          0.3       1
15            5.8          4.0           1.2          0.2       1
80            5.7          2.6           3.5          1.0       2
>>> df[s_len < p_len + 1]
     sepal_length  sepal_width  petal_length  petal_width species
id                                                              
105           6.5          3.0           5.8          2.2       3
85            5.4          3.0           4.5          1.5       2
115           5.8          2.8           5.1          2.4       3
150           5.9          3.0           5.1          1.8       3
135           6.1          2.6           5.6          1.4       3

Example: Equality operations

The operators supported are == and !=.

>>> df[s_len == p_len + 1]
     sepal_length  sepal_width  petal_length  petal_width species
id                                                              
125           6.7          3.3           5.7          2.1       3
145           6.7          3.3           5.7          2.5       3
120           6.0          2.2           5.0          1.5       3
>>> df[s_len != p_len + 2]
     sepal_length  sepal_width  petal_length  petal_width species
id                                                              
120           6.0          2.2           5.0          1.5       3
20            5.1          3.8           1.5          0.3       1
60            5.2          2.7           3.9          1.4       2
10            4.9          3.1           1.5          0.1       1
30            4.7          3.2           1.6          0.2       1
70            5.6          2.5           3.9          1.1       2
15            5.8          4.0           1.2          0.2       1
5             5.0          3.6           1.4          0.2       1
80            5.7          2.6           3.5          1.0       2
40            5.1          3.4           1.5          0.2       1

Example: Boolean operations

The operators supported are &, |, ~.

Do not use 'and' operator, instead use '&' operator. Using 'and' boolean operator instead of '&' will return incorrect results.

For example, if condition used is as length == 4.0 and width == 3.5, the first condition in above case (length == 4.0), in 'and' condition is completely ignored and data satisfying only second condition (width == 3.5) will be returned.

Do not use 'or' operator, instead use '|' operator. Using 'or' boolean operator instead of '|' will return incorrect results.

For example, if condition used is as length == 4.0 or width == 3.5, the second condition in above case (width == 3.5), in 'or' condition is completely ignored and data satisfying only first condition (length == 4.0) will be returned.

>>> df[~(s_len != p_len + 1)]
     sepal_length  sepal_width  petal_length  petal_width species
id                                                              
125           6.7          3.3           5.7          2.1       3
145           6.7          3.3           5.7          2.5       3
120           6.0          2.2           5.0          1.5       3
>>> df[(s_len > 5.5) & (p_len < 5)]
     sepal_length  sepal_width  petal_length  petal_width species
id                                                              
95            5.6          2.7           4.2          1.3       2
70            5.6          2.5           3.9          1.1       2
100           5.7          2.8           4.1          1.3       2
75            6.4          2.9           4.3          1.3       2
65            5.6          2.9           3.6          1.3       2
15            5.8          4.0           1.2          0.2       1
55            6.5          2.8           4.6          1.5       2
80            5.7          2.6           3.5          1.0       2
>>> df[(s_len >= 6.3) | (p_len == 4.2)]
     sepal_length  sepal_width  petal_length  petal_width species
id                                                              
110           7.2          3.6           6.1          2.5       3
75            6.4          2.9           4.3          1.3       2
130           7.2          3.0           5.8          1.6       3
145           6.7          3.3           5.7          2.5       3
140           6.9          3.1           5.4          2.1       3
125           6.7          3.3           5.7          2.1       3
105           6.5          3.0           5.8          2.2       3
95            5.6          2.7           4.2          1.3       2
55            6.5          2.8           4.6          1.5       2

It is recommended to surround expressions in parentheses in order to avoid ambiguity in operator precedence, similar to pandas.

For example, if in the example above, the expression is taken without the parenthesis, an error is given.
df[2 >= s_len | 4 <= p_len]
 
 
---------------------------------------------------------------------------
ArgumentError                             Traceback (most recent call last)
...
ArgumentError: SQL expression object or string expected, got object of type <class 'int'> instead
...
TeradataMlException: [Teradata][teradataml](TDML_2021) Unable to retrieve information for the DataFrame.

Supported Types and Operators

Python int, float, Decimal, str, and None literals can be used in filtering expressions.

All arithmetic expressions except floor division (//) and power (**) can be used.

All logical operators are supported.