td_sklearn | teradataml open-source machine learning functions

Teradata® Package for Python User Guide

Deployment: VantageCloud, VantageCore
Edition: Enterprise, IntelliFlex, VMware
Product: Teradata Package for Python
Release Number: 20.00
Published: December 2024
Last Edition: 2025-01-23
Product Category: Teradata Vantage

The teradataml open-source machine learning module currently exposes the scikit-learn package dynamically through the interface object td_sklearn. The current implementation exposes around 94 percent of the classes and around 91 percent of the class methods supported by scikit-learn.

Use this module to execute any supported scikit-learn function through the interface object, with the same syntax and arguments as in scikit-learn. teradataml open-source machine learning functions can also be used to:
  • Load and deploy scikit-learn models (models generated by teradataml OpenSourceML as well as external models)
  • Generate classification and regression metrics

With td_sklearn, you can run scikit-learn functions inside Vantage, where the data resides, without any data transfer, while using the Massively Parallel Processing (MPP) capabilities of Vantage. You do not need to learn new usage patterns or function syntax. For ease of use, td_sklearn supports the following syntaxes:

  • Syntax 1: The well-known scikit-learn function syntax, in which the arguments X and y are passed.
  • Syntax 2: As an alternative to the legacy arguments X and y, Teradata introduces another set of arguments: data, feature_columns, label_columns, and group_columns.
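
The two syntaxes can be sketched as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: it assumes an active Vantage connection, a teradataml DataFrame passed in as train_df, and caller-supplied column names. The import alias osml and the helper names fit_with_x_y and fit_with_data_columns are hypothetical, chosen only for illustration, and LinearRegression stands in for any exposed scikit-learn class.

```python
def fit_with_x_y(train_df, feature_columns, label_columns):
    """Syntax 1: pass X and y, as in plain scikit-learn."""
    # Imported inside the function so this sketch can be loaded
    # without a teradataml installation or a Vantage session.
    from teradataml import td_sklearn as osml

    # Split the training DataFrame into feature and label DataFrames.
    X = train_df.select(feature_columns)
    y = train_df.select(label_columns)

    model = osml.LinearRegression()
    model.fit(X=X, y=y)
    return model


def fit_with_data_columns(train_df, feature_columns, label_columns):
    """Syntax 2: pass one DataFrame plus lists of column names."""
    from teradataml import td_sklearn as osml

    model = osml.LinearRegression()
    model.fit(
        data=train_df,
        feature_columns=feature_columns,
        label_columns=label_columns,
    )
    return model
```

With a connected session, either helper could be called as, for example, fit_with_x_y(train_df, ["feature_1", "feature_2"], ["label"]), where the column names are placeholders. Syntax 2 avoids building separate X and y DataFrames, because the columns are selected by name from a single DataFrame.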
The following sections discuss how to use teradataml's td_sklearn to:
  • Run scikit-learn functions using the different syntaxes
  • Generate classification and regression metrics
  • Generate a single model, or distributed models (multi-model support) through the partition_columns argument
  • Load and deploy scikit-learn models
They also cover supportability information, limitations, and considerations.
The examples use only specific scikit-learn functions, but the same logic applies to all other scikit-learn functions.
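
As an illustration of the single-model and multi-model cases mentioned above, the following is a minimal sketch, not a definitive implementation. It assumes an active Vantage connection, a teradataml DataFrame train_df, and that partition_columns is accepted by fit alongside the Syntax 2 arguments; the helper name fit_models and the alias osml are hypothetical.

```python
def fit_models(train_df, feature_columns, label_columns, partition_columns=None):
    """Fit a single model, or distributed models when partition_columns is given."""
    # Imported inside the function so this sketch can be loaded
    # without a teradataml installation or a Vantage session.
    from teradataml import td_sklearn as osml

    fit_args = {
        "data": train_df,
        "feature_columns": feature_columns,
        "label_columns": label_columns,
    }
    # Assumption: passing partition_columns to fit requests the
    # distributed (multi-model) behavior; omitting it fits a single model.
    if partition_columns:
        fit_args["partition_columns"] = partition_columns

    model = osml.LinearRegression()
    model.fit(**fit_args)
    return model
```

Calling fit_models(train_df, features, labels) would request a single model, while adding partition_columns=["region"] (a placeholder column name) would request the multi-model case.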