Introduction to Teradata pyspark2teradataml - Teradata Package for Python

Teradata® pyspark2teradataml User Guide

Deployment: VantageCloud, VantageCore
Edition: Enterprise, IntelliFlex, VMware
Product: Teradata Package for Python
Release Number: 20.00
Published: December 2024
Last Edition: 2024-12-18
Product Category: Teradata Vantage

teradatamlspk is the Python package name of the Teradata product pyspark2teradataml. The package is built as an extension of teradataml, the Teradata Package for Python.

The syntax and usability of the teradatamlspk APIs are kept similar to those of the PySpark APIs. This allows existing PySpark workloads that run on the Spark engine to run on Teradata Vantage using ClearScape Analytics with minimal changes.
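As a rough illustration of how small the change can be, the sketch below wraps session creation for both engines in one hypothetical helper, get_session. The helper name and the engine argument are made up for this example. The teradatamlspk entry point shown (TeradataSession in teradatamlspk.sql, created with host, user, and password keyword arguments) is an assumption based on the package mirroring SparkSession, so check it against the API reference for your release.

```python
def get_session(engine="teradata", **connection):
    """Return a session object for the chosen engine.

    Hypothetical helper for illustration only. Imports are done lazily so
    that only the package for the selected engine needs to be installed.
    """
    if engine == "spark":
        # Standard PySpark entry point.
        from pyspark.sql import SparkSession
        return SparkSession.builder.getOrCreate()
    if engine == "teradata":
        # Assumed teradatamlspk counterpart of SparkSession; the connection
        # keywords (host, user, password) are also an assumption.
        from teradatamlspk.sql import TeradataSession
        return TeradataSession.builder.getOrCreate(**connection)
    raise ValueError(f"unknown engine: {engine!r}")


# Example call (placeholder credentials, requires a reachable Vantage system):
# spark = get_session("teradata", host="<host>", user="<user>", password="<password>")
```

In a real migration the change is usually just the import and the session creation line; the helper only puts the two variants side by side.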

teradatamlspk offers a function called pyspark2teradataml that converts a PySpark script or notebook to a teradatamlspk Python script or notebook. The function also generates an HTML report for the conversion. The report helps users understand the changes made and identify any manual changes still needed in the generated teradatamlspk script or notebook so that it can run on Vantage.
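A minimal sketch of calling the converter follows. It assumes that pyspark2teradataml can be imported from the top-level teradatamlspk package and that it accepts the path of the script or notebook as its first argument. The wrapper convert_pyspark_script is hypothetical and exists only to validate the path before the call.

```python
from pathlib import Path


def convert_pyspark_script(script_path):
    """Convert a PySpark script or notebook and return its directory.

    Hypothetical wrapper for illustration; it checks that the input exists
    and then delegates to pyspark2teradataml.
    """
    path = Path(script_path)
    if not path.is_file():
        raise FileNotFoundError(f"PySpark script not found: {path}")

    # Imported lazily so the path check above works without the package.
    # Assumption: the function is exposed at the package top level.
    from teradatamlspk import pyspark2teradataml

    # Assumption: the file path is the first positional argument.
    pyspark2teradataml(str(path))

    # Return the input's directory as a convenient place to start looking
    # for the converted file and the HTML report.
    return path.parent


# Example call (placeholder path):
# convert_pyspark_script("/path/to/pyspark_script.py")
```

After the conversion, review the HTML report before running the generated script, since some PySpark constructs may need manual changes.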