Step 2: Review the HTML Report

Teradata® pyspark2teradataml User Guide

Deployment: VantageCloud, VantageCore
Edition: VMware, Enterprise, IntelliFlex
Product: Teradata Package for Python
Release Number: 20.00
Published: December 2024
Product Category: Teradata Vantage

The generated Python script or notebook may not run directly on Vantage. The pyspark2teradataml utility function handles most of the conversion, but in some cases the generated script or notebook requires additional manual changes. Review the generated HTML report to identify those changes.

PySpark script or Jupyter notebook input

The HTML report has two panes.

The left pane displays both the original PySpark script and the converted teradatamlspk script. Use a dropdown menu to switch between the two.
  • Color-coded bell icons that represent unique alerts are located next to lines in the original PySpark script that require attention.
  • No bell icons are shown in the teradatamlspk script, which is the generated output.
The right pane displays important notes about teradatamlspk, a conversion summary by file, a conversion summary by function/module, and a legend for the icons:
  • Black bell icon: Notifications. Alerts under this category require no action from the user; they are informational only.
  • Blue bell icon: Partially supported APIs. Alerts under this category may need a minor change to run the corresponding API on Vantage.
  • Red bell icon: Unsupported APIs. Alerts under this category require an alternative implementation since the corresponding APIs are not supported.
  • Green tick: Alerts under this category signify the file was successfully converted and can be run as is on Vantage.
  • Bug icon: Alerts under this category signify the conversion failed, and the teradatamlspk script was not generated.
  • File icon: Alerts under this category signify the input file is empty, and the teradatamlspk script was not generated.
Select a bell icon to display details in the right pane for the API on the corresponding line:
  • Differences between PySpark and teradatamlspk for that API.
  • Examples demonstrating the differences.
  • Required user action to run the corresponding API with teradatamlspk.
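
The icon legend above can be read as a triage order for the review. The following is a minimal sketch of that reading in plain Python. The names ReportIcon, ACTION, needs_manual_change, and triage are hypothetical and are not part of teradatamlspk; the action strings paraphrase the legend, and the ordering (unsupported APIs first, then partially supported ones) is an assumption about a sensible review order, not documented behavior.

```python
from enum import Enum


class ReportIcon(Enum):
    """Icon categories from the report legend (hypothetical names, for illustration)."""
    BLACK_BELL = "notification"
    BLUE_BELL = "partially supported API"
    RED_BELL = "unsupported API"
    GREEN_TICK = "converted successfully"
    BUG = "conversion failed"
    FILE = "empty input file"


# Follow-up implied by each category, paraphrased from the legend.
ACTION = {
    ReportIcon.BLACK_BELL: "none; informational only",
    ReportIcon.BLUE_BELL: "a minor change may be needed",
    ReportIcon.RED_BELL: "write an alternative implementation",
    ReportIcon.GREEN_TICK: "none; run as is on Vantage",
    ReportIcon.BUG: "no teradatamlspk script was generated",
    ReportIcon.FILE: "no teradatamlspk script was generated (empty input)",
}


def needs_manual_change(icon):
    """Return True for alert categories that call for edits to the converted script."""
    return icon in (ReportIcon.BLUE_BELL, ReportIcon.RED_BELL)


def triage(alerts):
    """Order (line_number, icon) pairs: unsupported APIs first, then partially supported.

    Alerts that need no manual change are dropped. The ordering is an
    assumption made for this sketch, not something the report prescribes.
    """
    priority = {ReportIcon.RED_BELL: 0, ReportIcon.BLUE_BELL: 1}
    actionable = [a for a in alerts if needs_manual_change(a[1])]
    return sorted(actionable, key=lambda a: (priority[a[1]], a[0]))
```

For example, given alerts noted by hand from a report as (line number, icon) pairs, triage returns only the red and blue alerts, with the red ones listed first.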

Directory input

The index file has two panes.

The left pane displays a list of all scripts and notebooks in the provided directory, showing their full paths. Files that were not processed are highlighted in red and preceded by a bug report icon. Select a filename to open the corresponding file report.

The right pane displays important notes, details about objects related to the selected bell icon, and a summary of directory-processing statistics, which includes:
  • Total number of files converted
  • Total number of files not converted
  • Total number of empty files
  • Total number of files processed
Take action on the notes generated in the HTML report before running the resulting script or notebook with the data in Vantage.
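
The four totals in the directory summary can be illustrated with a small tally. This is a sketch only: the summarize function and the status labels are hypothetical, and it assumes that the number of files processed equals the sum of converted, not converted, and empty files, which the report description does not state explicitly.

```python
from collections import Counter

# Hypothetical status labels for this sketch; the report itself uses icons.
STATUSES = ("converted", "not_converted", "empty")


def summarize(file_statuses):
    """Tally per-file outcomes into the four totals listed in the directory summary.

    file_statuses maps a file path to one of the labels in STATUSES.
    Assumption: 'processed' is the sum of the other three totals.
    """
    unknown = set(file_statuses.values()) - set(STATUSES)
    if unknown:
        raise ValueError(f"unexpected status labels: {sorted(unknown)}")
    counts = Counter(file_statuses.values())
    summary = {status: counts[status] for status in STATUSES}
    summary["processed"] = sum(summary.values())
    return summary
```

Comparing such a tally against the totals shown in the index file is one way to confirm that no file in the directory was overlooked during review.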

Examples for pyspark2teradataml

Examples can be found in the attachment list associated with this guide. In the left pane, select attachments and download examples_pyspark2teradataml_migration.zip.