Tokenizer - Teradata Package for Python

Teradata® pyspark2teradataml User Guide

Deployment: VantageCloud, VantageCore
Edition: VMware, Enterprise, IntelliFlex
Product: Teradata Package for Python
Release Number: 20.00
Published: December 2024
Last Edition: 2026-01-07
Product Category: Teradata Vantage

Arguments

This function internally uses the Database Engine 20 NGramSplitter function through teradataml. In the transformed data, the extracted tokens are returned as multiple rows in the output column, one token per row.
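To make the output shape concrete, here is a minimal plain-Python sketch of a "one row per token" result. It does not call teradataml or the database. The helper name tokenize_to_rows, the sample data, the lowercase-and-split-on-whitespace rule, and the choice to carry the other columns along with each token are all assumptions made for illustration.

```python
def tokenize_to_rows(rows, input_col, output_col):
    """Return one output row per token found in rows[i][input_col].

    Hypothetical helper for illustration only; it mimics the output shape,
    not the actual NGramSplitter implementation.
    """
    out = []
    for row in rows:
        # Assumption: text is lowercased and split on whitespace.
        for token in row[input_col].lower().split():
            # Assumption: the other columns are repeated on every token row.
            out.append({**row, output_col: token})
    return out


rows = [
    {"id": 1, "sentence": "Hi I heard about Spark"},
    {"id": 2, "sentence": "Logistic regression models are neat"},
]

token_rows = tokenize_to_rows(rows, "sentence", "words")
for r in token_rows:
    print(r["id"], r["words"])
```

Each input row with five words yields five output rows, so a consumer that expects a single array-valued column per input row would need to group the tokens back by the row key.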

PySpark Argument Name    Open Source Function Argument Name    Notes
inputCol                 text_column
outputCol                n_gram_column
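The argument mapping in the table can be written down as a small lookup. This is a sketch only: the dictionary name PYSPARK_TO_NGRAMSPLITTER and the helper translate_params are hypothetical, and the sample column names are made up. Only the two name pairs come from the table.

```python
# Name pairs taken from the Arguments table above.
PYSPARK_TO_NGRAMSPLITTER = {
    "inputCol": "text_column",
    "outputCol": "n_gram_column",
}


def translate_params(pyspark_params):
    """Rename PySpark-style arguments to the names listed in the table.

    Hypothetical helper for illustration; it is not part of teradataml.
    """
    translated = {}
    for name, value in pyspark_params.items():
        if name not in PYSPARK_TO_NGRAMSPLITTER:
            # Anything outside the table has no documented mapping here.
            raise KeyError(f"no documented mapping for {name!r}")
        translated[PYSPARK_TO_NGRAMSPLITTER[name]] = value
    return translated


ngram_kwargs = translate_params({"inputCol": "sentence", "outputCol": "words"})
print(ngram_kwargs)
```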

Attributes/Methods

Attribute/Method Name    Supported    Notes
clear  
copy  
explainParam  
extractParamMap  
getInputCol  
getOrDefault  
getOutputCol  
getParam  
hasDefault  
hasParam  
isDefined  
isSet  
load  
read  
save  
set  
setInputCol  
setOutputCol  
setParams  
transform  
write  
inputCol  
outputCol  
params
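Several of the methods listed above (isSet, hasDefault, isDefined, getOrDefault, clear) differ only in how they treat explicitly set values versus defaults. The toy class below sketches the semantics these names have in PySpark's Params API. It is a hypothetical stand-in, not the teradataml or PySpark class, and it assumes the Teradata implementation follows the same conventions. The default value "tokens" is invented for the example.

```python
class ToyParams:
    """Hypothetical stand-in that mimics PySpark-style param bookkeeping."""

    def __init__(self, defaults=None):
        self._defaults = dict(defaults or {})  # values used when nothing is set
        self._values = {}                      # values set explicitly by the user

    def set(self, name, value):
        self._values[name] = value

    def clear(self, name):
        # Drops only the explicitly set value; any default stays in place.
        self._values.pop(name, None)

    def isSet(self, name):
        return name in self._values

    def hasDefault(self, name):
        return name in self._defaults

    def isDefined(self, name):
        return self.isSet(name) or self.hasDefault(name)

    def getOrDefault(self, name):
        if name in self._values:
            return self._values[name]
        return self._defaults[name]


p = ToyParams(defaults={"outputCol": "tokens"})  # invented default
p.set("inputCol", "sentence")

print(p.isSet("inputCol"), p.isSet("outputCol"))
print(p.isDefined("outputCol"), p.getOrDefault("outputCol"))

p.clear("inputCol")
print(p.isDefined("inputCol"))
```

In this sketch a param with only a default is "defined" but not "set", which is why isSet and isDefined can disagree for the same name.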