| |
- soundex(string_expression)
- DESCRIPTION:
Returns a character string that represents the Soundex code for
string_expression.
Soundex is a phonetic coding system that assigns the same code to surnames
that sound the same or similar but are spelled differently. The National
Archives used the Soundex system to index United States census records,
starting with the 1880 census.
A Soundex code consists of the first letter of the surname followed by a
three-digit code. Zeros are appended when the name does not have enough
letters to yield three digits.
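The letter-plus-three-digits scheme described above can be illustrated with a
minimal pure-Python sketch. It assumes the classic American Soundex rules; the
database function may treat edge cases (for example H, W, or non-letter
characters) differently. The name soundex_sketch is a hypothetical helper used
only for illustration and is not part of this API.

```python
# Minimal sketch of classic American Soundex (an assumption; the database's
# own soundex implementation may differ in edge cases).

# Map each consonant to its Soundex digit. Vowels, H, W and Y have no digit.
_CODES = {}
for digit, letters in enumerate(
    ["BFPV", "CGJKQSXZ", "DT", "L", "MN", "R"], start=1
):
    for letter in letters:
        _CODES[letter] = str(digit)


def soundex_sketch(name):
    """Return a 4-character Soundex code for name (hypothetical helper)."""
    # Case-insensitive: keep only the simple Latin letters A-Z.
    letters = [ch for ch in name.upper() if "A" <= ch <= "Z"]
    if not letters:
        return ""

    code = letters[0]                      # first letter is kept as-is
    previous = _CODES.get(letters[0], "")  # its digit still suppresses repeats
    for ch in letters[1:]:
        digit = _CODES.get(ch, "")
        # Emit a digit only if it differs from the previously seen digit.
        if digit and digit != previous:
            code += digit
        # Vowels and Y reset the previous digit; H and W leave it unchanged.
        if ch not in "HW":
            previous = digit

    # Pad with zeros, then keep the letter plus three digits.
    return (code + "000")[:4]


print(soundex_sketch("Novice"))    # N120
print(soundex_sketch("Advanced"))  # A315
print(soundex_sketch("Beginner"))  # B256
print(soundex_sketch("Lee"))       # L000 (zero padding)
```

Under these assumptions, similar-sounding spellings such as "Robert" and
"Rupert" both map to R163.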
PARAMETERS:
string_expression:
Required Argument.
Specifies a ColumnExpression of a string column, or a string literal,
that contains the surname to be evaluated. The surname must be in
simple Latin characters.
Soundex is case insensitive.
Format of a ColumnExpression of a string column: '<dataframe>.<dataframe_column>.expression'.
NOTE:
Function accepts positional arguments only.
EXAMPLES:
# Load the data to run the example.
>>> load_example_data("dataframe", "admissions_train")
>>>
# Create a DataFrame on 'admissions_train' table.
>>> admissions_train = DataFrame("admissions_train")
>>> admissions_train
masters gpa stats programming admitted
id
22 yes 3.46 Novice Beginner 0
36 no 3.00 Advanced Novice 0
15 yes 4.00 Advanced Advanced 1
38 yes 2.65 Advanced Beginner 1
5 no 3.44 Novice Novice 0
17 no 3.83 Advanced Advanced 1
34 yes 3.85 Advanced Beginner 0
13 no 4.00 Advanced Novice 1
26 yes 3.57 Advanced Advanced 1
19 yes 1.98 Advanced Advanced 0
>>>
# Example returns the Soundex code for each character string in the "programming" column.
# Import func from sqlalchemy to execute the soundex function.
>>> from sqlalchemy import func
# Create a sqlalchemy Function object.
>>> soundex_func_ = func.soundex(admissions_train.programming.expression)
>>>
# Pass the Function object as input to DataFrame.assign().
>>> df = admissions_train.assign(soundex_programming_=soundex_func_)
>>> print(df)
masters gpa stats programming admitted soundex_programming_
id
13 no 4.00 Advanced Novice 1 N120
26 yes 3.57 Advanced Advanced 1 A315
5 no 3.44 Novice Novice 0 N120
19 yes 1.98 Advanced Advanced 0 A315
15 yes 4.00 Advanced Advanced 1 A315
40 yes 3.95 Novice Beginner 0 B256
7 yes 2.33 Novice Novice 1 N120
22 yes 3.46 Novice Beginner 0 B256
36 no 3.00 Advanced Novice 0 N120
38 yes 2.65 Advanced Beginner 1 B256
>>>
|