Teradata® Package for Python Function Reference

Product
Teradata Package for Python
Release Number
17.00
Release Date
April 2021
Content Type
Programming Reference
Publication ID
B700-4008-070K
Language
English (United States)
 
 
string_cs

 
Functions
       
string_cs(string_expression)
DESCRIPTION:
    Function returns a heuristically derived integer value that you can use to help determine
    which KANJI1-compatible client character set was used to encode string_expression.
 
    The result value can also help determine which client character set to use to interpret
    the character data.
     ===========================================================================================
    | IF the result value is …  | THEN the heuristic found that string_expression  …            |
     ===========================================================================================
    |   -1                      |   most likely uses a single-byte client character set         |
    |                           |   encoding, but it may also contain a mix of encodings.       |
     -------------------------------------------------------------------------------------------
    |   0                       |   does not contain anything distinguishable from any          |
    |                           |   particular character set, so any character set that you use |
    |                           |   to interpret string_expression provides the same result.    |
    |                           |   Not all translations use the same interpretation for the    |
    |                           |   characters represented by 0x5C and 0x7E, however.           |
    |                           |   If string_expression contains:                              |
    |                           |       * 0x5C and you want it to be interpreted as             |
    |                           |         REVERSE SOLIDUS, use a single-byte character set.     |
    |                           |       * 0x7E and you want it to be interpreted as TILDE, use  |
    |                           |         a single-byte character set.                          |
    |                           |       * 0x5C and you want it to be interpreted as YEN SIGN, or|
    |                           |         0x7E and you want it to be interpreted as OVERLINE,   |
    |                           |         use any of the following:                             |
    |                           |           * KANJISJIS_0S                                      |
    |                           |           * KANJIEBCDIC5026_0I                                |
    |                           |           * KANJIEBCDIC5035_0I                                |
    |                           |           * KATAKANAEBCDIC                                    |
    |                           |           * KANJIEUC_0U                                       |
     -------------------------------------------------------------------------------------------
    |   1                       |   uses the encoding of one of the following:                  |
    |                           |       * KANJIEBCDIC5026_0I                                    |
    |                           |       * KANJIEBCDIC5035_0I                                    |
    |                           |       * KATAKANAEBCDIC                                        |
     -------------------------------------------------------------------------------------------
    |   2                       |   uses the encoding of KANJIEUC_0U.                           |
     -------------------------------------------------------------------------------------------
    |   3                       |   uses the encoding of KANJISJIS_0S.                          |
     -------------------------------------------------------------------------------------------
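
    The result table above can be mirrored client-side as a small lookup. The
    sketch below is illustrative only: CANDIDATE_CHARSETS, AMBIGUOUS_BYTES and
    ambiguous_bytes are hypothetical names, not part of teradataml. The byte
    check simply flags the two values (0x5C and 0x7E) whose interpretation,
    per the table, differs between translations.

```python
# Hypothetical client-side helpers; none of these names exist in teradataml.

# Client character sets named in the table for each result value.
# -1 and 0 name no specific character set, so they map to empty tuples.
CANDIDATE_CHARSETS = {
    -1: (),
    0: (),
    1: ("KANJIEBCDIC5026_0I", "KANJIEBCDIC5035_0I", "KATAKANAEBCDIC"),
    2: ("KANJIEUC_0U",),
    3: ("KANJISJIS_0S",),
}

# Byte values whose meaning depends on the character set chosen when the
# result is 0: (single-byte reading, KANJI-compatible reading).
AMBIGUOUS_BYTES = {
    0x5C: ("REVERSE SOLIDUS", "YEN SIGN"),
    0x7E: ("TILDE", "OVERLINE"),
}


def ambiguous_bytes(data: bytes) -> dict:
    """Return the ambiguous byte values present in data with both readings."""
    return {b: names for b, names in AMBIGUOUS_BYTES.items() if b in data}


print(CANDIDATE_CHARSETS[1])
print(ambiguous_bytes(b"C:\\temp"))
```

    An empty dictionary from ambiguous_bytes means the choice of character set
    does not affect how a result-0 string is interpreted.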
 
    Function helps determine which encoding to use when using the TRANSLATE function to
    translate a string from the KANJI1 server character set to the UNICODE server character set.
     ===========================================================================================
    | IF the result value is …  | THEN substitute the following value for source_TO_target in   |
    |                           | TRANSLATE(string_expression USING source_TO_target) …         |
     ===========================================================================================
    |   -1                      |   KANJI1_SBC_TO_UNICODE.                                      |
     -------------------------------------------------------------------------------------------
    |   0                       |   KANJI1_SBC_TO_UNICODE.                                      |
     -------------------------------------------------------------------------------------------
    |   1                       |   KANJI1_KANJIEBCDIC_TO_UNICODE.                              |
     -------------------------------------------------------------------------------------------
    |   2                       |   KANJI1_KANJIEUC_TO_UNICODE.                                 |
     -------------------------------------------------------------------------------------------
    |   3                       |   KANJI1_KANJISJIS_TO_UNICODE.                                |
     -------------------------------------------------------------------------------------------
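
    The mapping in the table above can be kept as a plain dictionary when the
    TRANSLATE call is built on the client. This is a minimal sketch, assuming
    the result value has already been fetched as a Python int;
    STRING_CS_TO_TRANSLATION and translation_for are hypothetical names and
    are not provided by teradataml.

```python
# Hypothetical helper, not part of teradataml: maps a string_cs result value
# to the source_TO_target name listed in the table above.
STRING_CS_TO_TRANSLATION = {
    -1: "KANJI1_SBC_TO_UNICODE",
    0: "KANJI1_SBC_TO_UNICODE",
    1: "KANJI1_KANJIEBCDIC_TO_UNICODE",
    2: "KANJI1_KANJIEUC_TO_UNICODE",
    3: "KANJI1_KANJISJIS_TO_UNICODE",
}


def translation_for(result: int) -> str:
    """Return the source_TO_target name for a string_cs result value."""
    try:
        return STRING_CS_TO_TRANSLATION[result]
    except KeyError:
        # Any value outside the documented range -1..3 is rejected.
        raise ValueError(f"unexpected string_cs result: {result!r}") from None


for value in (-1, 0, 1, 2, 3):
    print(value, translation_for(value))
```

    Raising on an unknown value, rather than falling back to a default
    translation, keeps an unexpected result from being silently translated
    with the wrong encoding.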
 
PARAMETERS:
    string_expression:
        Required Argument.
        Specifies a ColumnExpression of a string column or a string literal.
        Format of a ColumnExpression of a string column: '<dataframe>.<dataframe_column>.expression'.
        Supported column types: CHAR and VARCHAR
 
NOTE:
    Function accepts positional arguments only.
 
EXAMPLES:
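    # Load the data to run the example.
    # Assumes a connection to Vantage has already been created, for example with create_context().
    >>> from teradataml import DataFrame, load_example_data
    >>> load_example_data("dataframe", "admissions_train")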
    >>>
 
    # Create a DataFrame on 'admissions_train' table.
    >>> admissions_train = DataFrame("admissions_train")
    >>> admissions_train
       masters   gpa     stats programming  admitted
    id
    22     yes  3.46    Novice    Beginner         0
    36      no  3.00  Advanced      Novice         0
    15     yes  4.00  Advanced    Advanced         1
    38     yes  2.65  Advanced    Beginner         1
    5       no  3.44    Novice      Novice         0
    17      no  3.83  Advanced    Advanced         1
    34     yes  3.85  Advanced    Beginner         0
    13      no  4.00  Advanced      Novice         1
    26     yes  3.57  Advanced    Advanced         1
    19     yes  1.98  Advanced    Advanced         0
    >>>
 
    # Example returns the heuristically derived integer value for each string in the "stats" column.
    # Import func from sqlalchemy to build the string_cs function call.
    >>> from sqlalchemy import func
 
    # Create a sqlalchemy Function object.
    >>> string_cs_func_ = func.string_cs(admissions_train.stats.expression)
    >>>
 
    # Pass the Function object as input to DataFrame.assign().
    >>> df = admissions_train.assign(string_cs_stats_=string_cs_func_)
    >>> print(df)
       masters   gpa     stats programming  admitted  string_cs_stats_
    id
    5       no  3.44    Novice      Novice         0                 0
    34     yes  3.85  Advanced    Beginner         0                 0
    13      no  4.00  Advanced      Novice         1                 0
    40     yes  3.95    Novice    Beginner         0                 0
    22     yes  3.46    Novice    Beginner         0                 0
    19     yes  1.98  Advanced    Advanced         0                 0
    36      no  3.00  Advanced      Novice         0                 0
    15     yes  4.00  Advanced    Advanced         1                 0
    7      yes  2.33    Novice      Novice         1                 0
    17      no  3.83  Advanced    Advanced         1                 0
    >>>