Single model training | teradataml open-source machine learning functions - Teradata Package for Python

Teradata® Package for Python User Guide

Deployment: VantageCloud, VantageCore
Edition: Enterprise, IntelliFlex, VMware
Product: Teradata Package for Python
Release Number: 20.00
Published: December 2024
Last Edition: 2025-01-23
Product Category: Teradata Vantage

Once the required teradataml DataFrames are created, create a Dataset object from them.

>>> obj_s1 = td_lightgbm.Dataset(df_x_classif, df_y_classif, silent=True, free_raw_data=False)
>>> obj_s1
<lightgbm.basic.Dataset object at 0x7f5553c28c40>

After creating the Dataset object, run the training function with the record_evaluation and early_stopping callbacks:

>>> rec = {} # Empty dictionary to pass to the record_evaluation callback.

# Train with the valid_sets and callbacks arguments.
>>> opt_tr_s = td_lightgbm.train(params={}, train_set=obj_s1, num_boost_round=30,
                                 callbacks=[td_lightgbm.record_evaluation(rec), td_lightgbm.early_stopping(3)],
                                 valid_sets=[obj_s1])
>>> opt_tr_s
[LightGBM] [Warning] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000065 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 532
[LightGBM] [Info] Number of data points in the train set: 400, number of used features: 4
Training until validation scores don't improve for 3 rounds
Did not meet early stopping. Best iteration is:
[30]	valid_0's l2: 0.0416953

<lightgbm.basic.Booster object at 0x7f5553d235b0>

Like the train() function of lightgbm, the td_lightgbm train() function displays console output and returns a Booster object. However, the returned object is not a native lightgbm Booster; it is OpensourceML's internal Booster wrapper object, even though its printed representation reads lightgbm.basic.Booster.
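The training log above reports "Training until validation scores don't improve for 3 rounds", which is the rule applied by early_stopping(3). The following is a minimal sketch of that patience rule in plain Python, not LightGBM's or teradataml's actual implementation. The function name early_stopping_round and the score lists are hypothetical and serve only as illustration; a lower score is assumed to be better, as with the l2 metric.

```python
def early_stopping_round(scores, patience):
    """Apply a patience rule to a sequence of validation scores.

    Returns (best_round, stopped_early), where best_round is 1-based.
    Lower scores are treated as better (as with the l2 metric).
    """
    best_score = float("inf")
    best_round = 0
    for round_number, score in enumerate(scores, start=1):
        if score < best_score:
            # New best score: remember it and reset the patience window.
            best_score, best_round = score, round_number
        elif round_number - best_round >= patience:
            # No improvement for `patience` consecutive rounds: stop.
            return best_round, True
    # All rounds were used without exhausting the patience window.
    return best_round, False


# Hypothetical scores that improve every round: early stopping is not met.
always_improving = [0.5, 0.4, 0.3, 0.2]
# Hypothetical scores that stall after round 2: training stops at round 5.
stalling = [0.5, 0.4, 0.45, 0.46, 0.47, 0.1]

print(early_stopping_round(always_improving, patience=3))
print(early_stopping_round(stalling, patience=3))
```

In the first case every round improves the score, which corresponds to the "Did not meet early stopping" message in the log, where the best iteration equals the last round trained.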

The training functionality of lightgbm and td_lightgbm differs as follows:
  • lightgbm stores the recorded evaluation results in the dictionary that was passed to the record_evaluation function.
  • td_lightgbm exposes the results through the record_evaluation_result attribute of the object returned by the train() function, which can be accessed as shown in the following example:
>>> opt_tr_s.record_evaluation_result
{'valid_0': OrderedDict([('l2',
               [0.21581071275252517,
                0.18813848372931546,
                0.16614597654803748,
                ...
                ...
                0.04169529314351532])])}
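The result shown above is a dictionary keyed by validation set name, where each value maps a metric name to a list of per-round scores. As a rough sketch of how such a structure could be inspected, the helper below finds the position of the lowest score. The helper name best_round is hypothetical and not part of teradataml or lightgbm, and the sample dictionary contains only the four values printed above (the elided rounds are left out), so positions in it do not correspond to real boosting rounds.

```python
from collections import OrderedDict


def best_round(history, valid_name="valid_0", metric="l2"):
    """Return (1-based position, score) of the lowest score for a metric."""
    scores = history[valid_name][metric]
    best_index = min(range(len(scores)), key=scores.__getitem__)
    return best_index + 1, scores[best_index]


# Truncated illustration: only the values visible in the output above.
# The elided middle rounds are intentionally not reconstructed.
sample_history = {
    "valid_0": OrderedDict([
        ("l2", [0.21581071275252517,
                0.18813848372931546,
                0.16614597654803748,
                0.04169529314351532]),
    ])
}

position, score = best_round(sample_history)
print(position, score)
```

With an actual training run, the same helper could presumably be applied to opt_tr_s.record_evaluation_result directly, assuming it has the shape shown in the output above.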