index[] Operator | Teradata Python Package - index[] Operator - Teradata Vantage

Teradata® VantageCloud Lake

Deployment
VantageCloud
Edition
Lake
Product
Teradata Vantage
Published
January 2023
Language
English (United States)
Last Update
2024-02-17
dita:mapPath
phg1621910019905.ditamap
dita:ditavalPath
pny1626732985837.ditaval
dita:id
phg1621910019905

Use the index operator ([ ]) to filter rows based on logical expressions given by Python literals and DataFrame columns.

Refer to teradataml DataFrame Column for more details about ColumnExpressions in teradataml.

Examples Prerequisite

>>> df = DataFrame('iris_test')
>>> df
     sepal_length  sepal_width  petal_length  petal_width species
id                                                              
120           6.0          2.2           5.0          1.5       3
30            4.7          3.2           1.6          0.2       1
70            5.6          2.5           3.9          1.1       2
5             5.0          3.6           1.4          0.2       1
60            5.2          2.7           3.9          1.4       2
10            4.9          3.1           1.5          0.1       1
65            5.6          2.9           3.6          1.3       2
20            5.1          3.8           1.5          0.3       1
15            5.8          4.0           1.2          0.2       1
80            5.7          2.6           3.5          1.0       2
>>> s_len = df.sepal_length
>>> p_len = df.petal_length

Example: Inequality operations

The operators supported are <, <=, >, >=.

>>> df[s_len > p_len]
     sepal_length  sepal_width  petal_length  petal_width species
id                                                              
120           6.0          2.2           5.0          1.5       3
30            4.7          3.2           1.6          0.2       1
70            5.6          2.5           3.9          1.1       2
5             5.0          3.6           1.4          0.2       1
60            5.2          2.7           3.9          1.4       2
10            4.9          3.1           1.5          0.1       1
65            5.6          2.9           3.6          1.3       2
20            5.1          3.8           1.5          0.3       1
15            5.8          4.0           1.2          0.2       1
80            5.7          2.6           3.5          1.0       2
>>> df[s_len < p_len + 1]
     sepal_length  sepal_width  petal_length  petal_width species
id                                                              
105           6.5          3.0           5.8          2.2       3
85            5.4          3.0           4.5          1.5       2
115           5.8          2.8           5.1          2.4       3
150           5.9          3.0           5.1          1.8       3
135           6.1          2.6           5.6          1.4       3

Example: Equality operations

The operators supported are == and !=.

>>> df[s_len == p_len + 1]
     sepal_length  sepal_width  petal_length  petal_width species
id                                                              
125           6.7          3.3           5.7          2.1       3
145           6.7          3.3           5.7          2.5       3
120           6.0          2.2           5.0          1.5       3
>>> df[s_len != p_len + 2]
     sepal_length  sepal_width  petal_length  petal_width species
id                                                              
120           6.0          2.2           5.0          1.5       3
20            5.1          3.8           1.5          0.3       1
60            5.2          2.7           3.9          1.4       2
10            4.9          3.1           1.5          0.1       1
30            4.7          3.2           1.6          0.2       1
70            5.6          2.5           3.9          1.1       2
15            5.8          4.0           1.2          0.2       1
5             5.0          3.6           1.4          0.2       1
80            5.7          2.6           3.5          1.0       2
40            5.1          3.4           1.5          0.2       1

Example: Boolean operations

The operators supported are &, |, ~.

Do not use 'and' operator, instead use '&' operator. Using 'and' boolean operator instead of '&' will return incorrect results.

For example, if condition used is as length == 4.0 and width == 3.5, the first condition in previous case (length == 4.0), in 'and' condition is completely ignored and data satisfying only second condition (width == 3.5) will be returned.

Do not use 'or' operator, instead use '|' operator. Using 'or' boolean operator instead of '|' will return incorrect results.

For example, if condition used is as length == 4.0 or width == 3.5, the second condition in previous case (width == 3.5), in 'or' condition is completely ignored and data satisfying only first condition (length == 4.0) will be returned.

>>> df[~(s_len != p_len + 1)]
     sepal_length  sepal_width  petal_length  petal_width species
id                                                              
125           6.7          3.3           5.7          2.1       3
145           6.7          3.3           5.7          2.5       3
120           6.0          2.2           5.0          1.5       3
>>> df[(s_len > 5.5) & (p_len < 5)]
     sepal_length  sepal_width  petal_length  petal_width species
id                                                              
95            5.6          2.7           4.2          1.3       2
70            5.6          2.5           3.9          1.1       2
100           5.7          2.8           4.1          1.3       2
75            6.4          2.9           4.3          1.3       2
65            5.6          2.9           3.6          1.3       2
15            5.8          4.0           1.2          0.2       1
55            6.5          2.8           4.6          1.5       2
80            5.7          2.6           3.5          1.0       2
>>> df[(s_len >= 6.3) | (p_len == 4.2)]
     sepal_length  sepal_width  petal_length  petal_width species
id                                                              
110           7.2          3.6           6.1          2.5       3
75            6.4          2.9           4.3          1.3       2
130           7.2          3.0           5.8          1.6       3
145           6.7          3.3           5.7          2.5       3
140           6.9          3.1           5.4          2.1       3
125           6.7          3.3           5.7          2.1       3
105           6.5          3.0           5.8          2.2       3
95            5.6          2.7           4.2          1.3       2
55            6.5          2.8           4.6          1.5       2

It is recommended to surround expressions in parentheses in order to avoid ambiguity in operator precedence, similar to pandas.

For example, in previous example, if the expression is taken without the parenthesis, an error is given.
df[2 >= s_len | 4 <= p_len]
 
 
---------------------------------------------------------------------------
ArgumentError                             Traceback (most recent call last)
...
ArgumentError: SQL expression object or string expected, got object of type <class 'int'> instead
...
TeradataMlException: [Teradata][teradataml](TDML_2021) Unable to retrieve information for the DataFrame.

Supported Types and Operators

Python int, float, Decimal, str, and None literals can be used in filtering expressions.

All arithmetic expressions except floor division (//) and power (**) can be used.

All logical operators are supported.