to_numeric

Teradata® Python Package User Guide

brand
Teradata Vantage
prodname
Teradata Python Package
vrm_release
16.20
category
User Guide
featnum
B700-4006-098K

The to_numeric function converts a string-like representation of a number to a numeric type. It can be used with the string columns of the DataFrame in the DataFrame.assign method.

Example Prerequisite

A DataFrame with all string type columns.

>>> df = DataFrame('numeric_strings')
  
               hex decimal commas numbers
     0        19FF   00.77   08,8       1
     1        abcd    0.77   0,88       1
     2  ABCDEFABCD   0.7.7   ,088     999
     3        2018    .077   088,       0
>>> df.dtypes
  
     hex        str
     decimal    str
     commas     str
     numbers    str

Example: Convert to Numeric Type

To use the DataFrame in a numerical calculation, they first need to be converted to numeric type.

>>> from teradataml import to_numeric
>>> df = df.assign(drop_columns = True,
               numbers = df.numbers,
               numeric = to_numeric(df.numbers))
  
       numbers  numeric
    0        1        1
    1        1        1
    2      999      999
    3        0        0
>>> df.dtypes
 
numbers                str
numeric    decimal.Decimal

Example: Use Optional format keyword when Converting

The to_numeric function may not be able to parse the string into a numeric value if the string has an unrecognizable format. It returns None in this case.

# converting decimal-like strings to numeric
>>> df.assign(drop_columns = True,
              decimal = df.decimal,
              numeric_dec = to_numeric(df.decimal))
  
  decimal numeric_dec
0   00.77         .77
1    0.77         .77
2   0.7.7        None
3    .077        .077

You can control which strings are recognizable by passing a format string into the optional format keyword.

# converting comma (group separated) strings to numeric
>>> df.assign(drop_columns = True,
               commas = df.commas,
               numeric_commas = to_numeric(df.commas, format_ = '9G99'))
  
       commas numeric_commas
     0   08,8           None
     1   0,88             88
     2   ,088           None
     3   088,           None
# converting hex strings to numeric
 >>> df.assign(drop_columns = True,
               hex = df.hex,
               numeric_hex = to_numeric(df.hex, format_ = 'XXXXXXXXXX'))
  
               hex   numeric_hex
     0        19FF          6655
     1        abcd         43981
     2  ABCDEFABCD  737894443981
     3        2018          8216
The format string follows the syntax of the to_number function in the SQL Engine.
For more information, see the Data Type Conversion Functions section in the Teradata® Database SQL Functions, Operators, Expressions, and Predicates, Release 16.20.

Example: Use String Literals as Arguments

The to_numeric function can take DataFrame columns or string literals as arguments.

# converting literals to numeric
 >>> df.assign(drop_columns = True,
               a = to_numeric('123,456',format_ = '999,999'),
               b = to_numeric('1,333.555', format_ = '9,999D999'),
               c = to_numeric('2,333,2',format_ = '9G999G9'),
               d = to_numeric('3E20'),
               e = to_numeric('$41.99', format_ = 'L99.99'),
               f = to_numeric('$.12', format_ = 'L.99'),
               g = to_numeric('dollar123,456.00',
                              format_ = 'L999G999D99',
                              nls = {'param': 'currency',
                                     'value': 'dollar'})).head(1)
  
         a         b      c                         d      e    f       g
     0  123456  1333.555  23332 300000000000000000000  41.99  .12  123456