Arguments
This function is implemented with the Analytics Database function NGramSplitter, which is called through the teradataml Analytics Database functions.
In the transformed data, the extracted tokens are returned as multiple rows in the output column.
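The multi-row output shape can be pictured with a small pure-Python sketch. It does not call teradataml or PySpark. The helper `tokenize_to_rows`, its defaults, and the sample rows are hypothetical and only illustrate the idea, assuming one token per output row.

```python
import re

def tokenize_to_rows(rows, pattern=r"[\s]+", to_lower_case=True):
    """Hypothetical helper: split each (id, text) pair on `pattern`
    and return one (id, token) row per extracted token."""
    out = []
    for row_id, text in rows:
        if to_lower_case:
            text = text.lower()
        for token in re.split(pattern, text):
            if token:  # drop empty strings left by leading/trailing delimiters
                out.append((row_id, token))
    return out

# Illustrative input: two rows of text keyed by an id.
rows = [(1, "Hello big World"), (2, "one  two")]
exploded = tokenize_to_rows(rows)
for row in exploded:
    print(row)
```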
PySpark Argument Name | Open Source Function Argument Name | Notes |
---|---|---|
gaps | overlapping | |
inputCol | text_column | |
minTokenLength | grams | |
outputCol | n_gram_column | |
pattern | delimiter | Default value [\s]+ |
toLowercase | to_lower_case | |
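The argument mapping in the table above can be written as a lookup, sketched below in plain Python. The dictionary entries are copied from the table. The function `translate_arguments` and the sample column names are hypothetical. The sketch only renames keys and passes values through unchanged, which may differ from how the library itself converts values.

```python
# Mapping copied from the table above: PySpark name -> mapped argument name.
ARGUMENT_MAP = {
    "gaps": "overlapping",
    "inputCol": "text_column",
    "minTokenLength": "grams",
    "outputCol": "n_gram_column",
    "pattern": "delimiter",
    "toLowercase": "to_lower_case",
}

def translate_arguments(pyspark_kwargs):
    """Hypothetical helper: rename PySpark-style keyword arguments
    to the mapped names, leaving the values untouched."""
    unknown = sorted(set(pyspark_kwargs) - set(ARGUMENT_MAP))
    if unknown:
        raise KeyError(f"No mapping listed for: {', '.join(unknown)}")
    return {ARGUMENT_MAP[name]: value for name, value in pyspark_kwargs.items()}

# Illustrative call; "review" and "tokens" are made-up column names.
translated = translate_arguments(
    {"inputCol": "review", "outputCol": "tokens", "pattern": r"[\s]+"}
)
print(translated)
```

Raising on unlisted names keeps the sketch from silently dropping an argument that the table does not cover.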
Attributes/Methods
Attribute/Method Name | Supported | Notes |
---|---|---|
clear | ||
copy | ||
explainParam | ||
explainParams | ||
extractParamMap | ||
getGaps | ||
getInputCol | ||
getMinTokenLength | ||
getOrDefault | ||
getOutputCol | ||
getParam | ||
getPattern | ||
getToLowercase | ||
hasDefault | ||
hasParam | ||
isDefined | ||
isSet | ||
load | ||
read | ||
save | ||
set | ||
setGaps | ||
setInputCol | ||
setMinTokenLength | ||
setOutputCol | ||
setParams | ||
setPattern | ||
setToLowercase | ||
transform | ||
write | ||
gaps | ||
inputCol | ||
minTokenLength | ||
outputCol | ||
params | ||
pattern | ||
toLowercase | ||