teradataml Open-Source Machine Learning Functions - Teradata Package for Python

Teradata® Package for Python User Guide

Deployment: VantageCloud, VantageCore
Edition: Enterprise, IntelliFlex, VMware
Product: Teradata Package for Python
Release Number: 20.00
Published: March 2025
Product Category: Teradata Vantage

The Teradata Package for Python introduces teradataml open-source machine learning functions, referred to as teradataml OpenSourceML, which expose most of the functionality of open-source packages such as scikit-learn and LightGBM. With teradataml OpenSourceML, you can run these open-source packages without pulling the data to your client. It provides a simple interface object for each open-source package, so the package's functions and classes can be used with the same syntax and arguments as in the original package.

Functions and classes from open-source packages generate a single model that is trained on all the data. Unlike the traditional open-source packages, teradataml OpenSourceML can also generate distributed models, also known as multiple models or micro models.

Combined with the massively parallel processing (MPP) architecture that Vantage provides, teradataml OpenSourceML can address a large set of use cases where distributed models are needed. To enable this support, teradataml OpenSourceML introduces the partition_columns argument, which can be used in all functions. partition_columns accepts the column used to partition the data, and a model is generated for each partition.
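The difference between a single model and distributed (micro) models can be illustrated with a minimal local sketch. This is plain Python, not the teradataml API: the helper names `fit_line` and `fit_per_partition`, the sample rows, and the `region` column are all hypothetical, and the grouping runs on the client rather than in Vantage. It only shows the idea behind partitioning on a column and training one model per partition.

```python
from collections import defaultdict


def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_per_partition(rows, partition_column, x_column, y_column):
    """Group rows by the partition column and fit one model per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[partition_column]].append((row[x_column], row[y_column]))
    return {key: fit_line(points) for key, points in groups.items()}


# Hypothetical sample data: two regions that follow opposite trends.
rows = [
    {"region": "east", "x": 1.0, "y": 2.0},
    {"region": "east", "x": 2.0, "y": 4.0},
    {"region": "east", "x": 3.0, "y": 6.0},
    {"region": "west", "x": 1.0, "y": 5.0},
    {"region": "west", "x": 2.0, "y": 4.0},
    {"region": "west", "x": 3.0, "y": 3.0},
]

# Single model: trained once on all the data.
single_model = fit_line([(r["x"], r["y"]) for r in rows])

# Micro models: one (slope, intercept) pair per value of the partition column.
micro_models = fit_per_partition(rows, "region", "x", "y")

print("single model:", single_model)
for region, model in sorted(micro_models.items()):
    print(region, "model:", model)
```

In this sketch the single model averages the two opposing trends, while each per-partition model recovers the trend of its own region. That is the kind of use case where partitioning the data and generating a model per partition is useful.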

Supported open-source packages:
  • scikit-learn
  • LightGBM
The following topics provide setup, module, and usage information: