String Comparisons may be Case Insensitive

Teradata® VantageCloud Lake

Deployment: VantageCloud
Edition: Lake
Product: Teradata Vantage
Published: January 2023
Last edition: 2024-12-11

When you filter a teradataml DataFrame by comparing a column to a string literal, the comparison is not necessarily case sensitive.

All character data, except for CLOBs, accessed in the execution of a Teradata SQL statement has an attribute of CASESPECIFIC or NOT CASESPECIFIC, either by default or by explicit designation. Character string comparisons use this attribute to determine whether the comparison is case blind or case specific. Case specificity does not apply to CLOBs.

For example, filtering on 'iris-SETOSA' returns the rows whose Name is 'Iris-setosa':

>>> df.head(5)
               SepalLength  SepalWidth  PetalLength  PetalWidth         Name
                                                                
2                      4.7         3.2          1.3         0.2  Iris-setosa
4                      5.0         3.6          1.4         0.2  Iris-setosa
3                      4.6         3.1          1.5         0.2  Iris-setosa
1                      4.9         3.0          1.4         0.2  Iris-setosa
0                      5.1         3.5          1.4         0.2  Iris-setosa
 
>>> df[df['Name'] == 'iris-SETOSA'].head(5)
               SepalLength  SepalWidth  PetalLength  PetalWidth         Name
                                                                
2                      4.7         3.2          1.3         0.2  Iris-setosa
4                      5.0         3.6          1.4         0.2  Iris-setosa
3                      4.6         3.1          1.5         0.2  Iris-setosa
1                      4.9         3.0          1.4         0.2  Iris-setosa
0                      5.1         3.5          1.4         0.2  Iris-setosa
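The result above can be modeled in plain Python. The following is a minimal sketch, not teradataml code and not the database's actual implementation: `sql_equals` is a hypothetical helper, the `casespecific` flag stands in for the column attribute, and the two extra species names are illustrative values that do not come from the output shown. It assumes a case-blind comparison behaves as if both operands were folded to the same case first.

```python
# Simplified model of CASESPECIFIC vs. NOT CASESPECIFIC equality.
# Plain Python only; this does not call teradataml or the database.

def sql_equals(left: str, right: str, casespecific: bool) -> bool:
    """Hypothetical helper: compare two strings as the attribute dictates."""
    if casespecific:
        # Case-specific comparison: characters must match exactly.
        return left == right
    # Case-blind comparison (assumption): fold case on both sides first.
    return left.upper() == right.upper()

# Illustrative values; only 'Iris-setosa' appears in the example output.
names = ["Iris-setosa", "Iris-versicolor", "Iris-virginica"]

# NOT CASESPECIFIC: 'iris-SETOSA' still matches 'Iris-setosa'.
case_blind = [n for n in names if sql_equals(n, "iris-SETOSA", casespecific=False)]

# CASESPECIFIC: no value matches the differently cased literal.
case_specific = [n for n in names if sql_equals(n, "iris-SETOSA", casespecific=True)]

print(case_blind)
print(case_specific)
```

Under this model, the filter in the example behaves like the `casespecific=False` branch, which is why rows with 'Iris-setosa' are returned.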

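A workaround is to use the str.contains method with case=True, which makes the match case sensitive. Because 'iris-SETOSA' differs in case from 'Iris-setosa', the filter below returns no rows.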

>>> has_SETOSA = df['Name'].str.contains('iris-SETOSA', case=True)
>>> df[has_SETOSA == True]
Empty DataFrame
Columns: [SepalLength, SepalWidth, PetalLength, PetalWidth, Name]
Index: []
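Note that a contains-style check tests whether a pattern occurs within the value, which is not the same as testing equality. The sketch below illustrates that difference in plain Python with the standard `re` module. The `contains` function here is a hypothetical stand-in, not the teradataml method, and the use of `^` and `$` anchors to approximate an exact match is an assumption that should be checked against the teradataml documentation before relying on it.

```python
import re

def contains(value: str, pattern: str, case: bool = True) -> bool:
    """Hypothetical stand-in for a contains-style pattern check."""
    flags = 0 if case else re.IGNORECASE
    return re.search(pattern, value, flags) is not None

name = "Iris-setosa"

# Case-sensitive check: the differently cased literal does not match.
sensitive = contains(name, "iris-SETOSA", case=True)

# Case-insensitive check: the same literal matches.
insensitive = contains(name, "iris-SETOSA", case=False)

# A substring also matches, so this is broader than equality.
substring = contains(name, "setosa", case=True)

# Anchoring the pattern (assumption) narrows it to the whole value.
anchored = contains(name, "^setosa$", case=True)

print(sensitive, insensitive, substring, anchored)
```

If only exact, case-sensitive equality is wanted, a bare substring pattern may return more rows than intended, so the pattern should be chosen with that in mind.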