Teradata® Package for Python Function Reference - 20.00: string_cs

Syntax, methods, and examples for the functions included in the Teradata Package for Python.

Deployment: VantageCloud, VantageCore
Edition: Enterprise, IntelliFlex, VMware
Product: Teradata Package for Python
Release Number: 20.00.00.03
Published: December 2024
Last Edition: 2024-12-19
Product Category: Teradata Vantage
teradataml.dataframe.sql.DataFrameColumn.string_cs = string_cs()
DESCRIPTION:
    Returns a heuristically derived integer value that helps determine which
    KANJI1-compatible client character set was used to encode string_expression
    (the string values of the column on which the function is called).
 
    The result value can also help determine which client character set to use to interpret
    the character data.
    =================================================================================================
    | IF the result value is …      | THEN the heuristic found that string_expression …             |
    =================================================================================================
    |   -1                          |   most likely uses a single-byte client character set         |
    |                               |   encoding, but it may also contain a mix of encodings.       |
    -------------------------------------------------------------------------------------------------
    |   0                           |   does not contain anything distinguishable from any          |
    |                               |   particular character set, so any character set that you use |
    |                               |   to interpret string_expression provides the same result.    |
    |                               |   Not all translations use the same interpretation for the    |
    |                               |   characters represented by 0x5C and 0x7E, however.           |
    |                               |   If string_expression contains:                              |
    |                               |       * 0x5C and you want it to be interpreted as             |
    |                               |         REVERSE SOLIDUS, use a single-byte character set.     |
    |                               |       * 0x7E and you want it to be interpreted as TILDE, use  |
    |                               |         a single-byte character set.                          |
    |                               |       * 0x5C and you want it to be interpreted as YEN SIGN,   |
    |                               |         or 0x7E and you want it to be interpreted as          |
    |                               |         OVERLINE, use any of the following:                   |
    |                               |           * KANJISJIS_0S                                      |
    |                               |           * KANJIEBCDIC5026_0I                                |
    |                               |           * KANJIEBCDIC5035_0I                                |
    |                               |           * KATAKANAEBCDIC                                    |
    |                               |           * KANJIEUC_0U                                       |
    -------------------------------------------------------------------------------------------------
    |   1                           |   uses the encoding of one of the following:                  |
    |                               |       * KANJIEBCDIC5026_0I                                    |
    |                               |       * KANJIEBCDIC5035_0I                                    |
    |                               |       * KATAKANAEBCDIC                                        |
    -------------------------------------------------------------------------------------------------
    |   2                           |   uses the encoding of KANJIEUC_0U.                           |
    -------------------------------------------------------------------------------------------------
    |   3                           |   uses the encoding of KANJISJIS_0S.                          |
    -------------------------------------------------------------------------------------------------
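
    The table above can be expressed as a small lookup. The following is a minimal sketch in plain Python, not part of teradataml: the names CANDIDATE_CHARSETS and candidate_charsets are hypothetical, and the entries for -1 and 0 are descriptive labels rather than character set names.

```python
# Hypothetical helper (not a teradataml API): maps a string_cs() result
# value to the client character sets named in the table above.
CANDIDATE_CHARSETS = {
    -1: ("single-byte client character set (possibly mixed encodings)",),
    0: ("any character set",),
    1: ("KANJIEBCDIC5026_0I", "KANJIEBCDIC5035_0I", "KATAKANAEBCDIC"),
    2: ("KANJIEUC_0U",),
    3: ("KANJISJIS_0S",),
}


def candidate_charsets(result):
    """Return the candidate character sets for a string_cs() result value."""
    if result not in CANDIDATE_CHARSETS:
        raise ValueError(f"unexpected string_cs result: {result!r}")
    return CANDIDATE_CHARSETS[result]


print(candidate_charsets(2))
```

    A result of 0 carries the 0x5C and 0x7E caveat from the table, so a caller would still need to choose between a single-byte and a Kanji character set for those two bytes.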
 
    The result value also helps determine which encoding to use when the TRANSLATE function
    translates a string from the KANJI1 server character set to the UNICODE server character set.
    =================================================================================================
    | IF the result value is …      | THEN substitute the following value for source_to_target in   |
    |                               | TRANSLATE(string_expression USING source_to_target) …         |
    =================================================================================================
    |   -1                          |   KANJI1_SBC_TO_UNICODE.                                      |
    -------------------------------------------------------------------------------------------------
    |   0                           |   KANJI1_SBC_TO_UNICODE.                                      |
    -------------------------------------------------------------------------------------------------
    |   1                           |   KANJI1_KANJIEBCDIC_TO_UNICODE.                              |
    -------------------------------------------------------------------------------------------------
    |   2                           |   KANJI1_KANJIEUC_TO_UNICODE.                                 |
    -------------------------------------------------------------------------------------------------
    |   3                           |   KANJI1_KANJISJIS_TO_UNICODE.                                |
    -------------------------------------------------------------------------------------------------
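
    To illustrate how the table above might be applied, the sketch below selects the source_to_target value for a result value and builds the text of a TRANSLATE expression. It is plain Python and an assumption for illustration only: SOURCE_TO_TARGET and translate_expression are hypothetical names, not teradataml functions, and the column name is inserted as-is without quoting.

```python
# Hypothetical helper (not a teradataml API): picks the source_to_target
# value from the table above and builds the TRANSLATE expression text.
SOURCE_TO_TARGET = {
    -1: "KANJI1_SBC_TO_UNICODE",
    0: "KANJI1_SBC_TO_UNICODE",
    1: "KANJI1_KANJIEBCDIC_TO_UNICODE",
    2: "KANJI1_KANJIEUC_TO_UNICODE",
    3: "KANJI1_KANJISJIS_TO_UNICODE",
}


def translate_expression(column_name, result):
    """Return a TRANSLATE(... USING ...) SQL fragment for a result value."""
    if result not in SOURCE_TO_TARGET:
        raise ValueError(f"unexpected string_cs result: {result!r}")
    # The column name is interpolated unquoted; this is illustration only.
    return f"TRANSLATE({column_name} USING {SOURCE_TO_TARGET[result]})"


print(translate_expression("stats", 0))
```

    Because -1 and 0 both map to KANJI1_SBC_TO_UNICODE, only results 1 through 3 change the translation that is chosen.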
 
RAISES:
    TypeError, ValueError, TeradataMlException
 
RETURNS:
    DataFrameColumn
 
EXAMPLES:
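    # Load the data to run the example.
    # Assumes an active Vantage connection, for example one created with create_context().
    >>> from teradataml import DataFrame, load_example_data
    >>> load_example_data("dataframe", "admissions_train")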
 
    # Create a DataFrame on 'admissions_train' table.
    >>> df = DataFrame("admissions_train").iloc[:4]
    >>> print(df)
       masters   gpa     stats programming  admitted
    id
    3       no  3.70    Novice    Beginner         1
    4      yes  3.50  Beginner      Novice         1
    2      yes  3.76  Beginner    Beginner         0
    1      yes  3.95  Beginner    Beginner         0
 
    # Example 1: Return the heuristically derived integer value for each string in the "stats"
    #            column and pass it as input to DataFrame.assign().
    >>> res = df.assign(col=df.stats.string_cs())
    >>> print(res)
       masters   gpa     stats programming  admitted  col
    id
    3       no  3.70    Novice    Beginner         1    0
    4      yes  3.50  Beginner      Novice         1    0
    2      yes  3.76  Beginner    Beginner         0    0
    1      yes  3.95  Beginner    Beginner         0    0