Multi model case - LGBMClassifier

Teradata® Package for Python User Guide

Deployment: VantageCloud, VantageCore
Edition: Enterprise, IntelliFlex, VMware
Product: Teradata Package for Python
Release Number: 20.00
Published: December 2024
Last Edition: 2025-01-23
Product Category: Teradata Vantage

For multi-model training, pass the partition_columns argument to the fit() method. These columns must be present in the parent DataFrame from which the X and y teradataml DataFrames are derived. One model is trained for each distinct combination of values in the partition columns.

Create LGBMClassifier object

>>> from teradataml import td_lightgbm
>>> obj = td_lightgbm.LGBMClassifier(num_leaves=5, n_estimators=15, learning_rate=0.01)
>>> obj
LGBMClassifier(learning_rate=0.01, n_estimators=15, num_leaves=5)

Set/update parameters

>>> obj.set_params(n_estimators=10)
LGBMClassifier(learning_rate=0.01, n_estimators=10, num_leaves=5)

Train the model

>>> obj.fit(df_x_classif, df_y_classif, callbacks=[td_lightgbm.log_evaluation()],
            partition_columns=["partition_column_1", "partition_column_2"])
   partition_column_1  partition_column_2                                                              model
0                   1                  11  LGBMClassifier(learning_rate=0.01, n_estimators=10, num_leaves=5)
1                   0                  11  LGBMClassifier(learning_rate=0.01, n_estimators=10, num_leaves=5)
2                   1                  10  LGBMClassifier(learning_rate=0.01, n_estimators=10, num_leaves=5)
3                   0                  10  LGBMClassifier(learning_rate=0.01, n_estimators=10, num_leaves=5)
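
The fit() output above lists one model for each distinct combination of partition_column_1 and partition_column_2. As a rough local analogy only (this is not how teradataml runs the training inside Vantage), the idea can be sketched in plain Python: group the rows by the partition columns and fit an independent estimator on each group. The ROWS data, the MajorityClassifier stand-in, and the fit_per_partition helper are all hypothetical names introduced for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical in-memory rows standing in for the parent DataFrame.
ROWS = [
    {"partition_column_1": 0, "partition_column_2": 10, "col1": 0.2, "label": 1},
    {"partition_column_1": 0, "partition_column_2": 10, "col1": 0.9, "label": 1},
    {"partition_column_1": 0, "partition_column_2": 10, "col1": -0.4, "label": 0},
    {"partition_column_1": 0, "partition_column_2": 11, "col1": 1.1, "label": 0},
    {"partition_column_1": 0, "partition_column_2": 11, "col1": 0.3, "label": 0},
    {"partition_column_1": 1, "partition_column_2": 10, "col1": -1.2, "label": 1},
    {"partition_column_1": 1, "partition_column_2": 11, "col1": 0.5, "label": 0},
    {"partition_column_1": 1, "partition_column_2": 11, "col1": 0.7, "label": 1},
    {"partition_column_1": 1, "partition_column_2": 11, "col1": -0.1, "label": 0},
]


class MajorityClassifier:
    """Toy stand-in for a real estimator: always predicts the most common label."""

    def fit(self, X, y):
        self.label_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


def fit_per_partition(rows, partition_columns, feature_columns, label_column):
    """Group rows by the partition columns and fit one model per group."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[c] for c in partition_columns)
        groups[key].append(row)

    models = {}
    for key, group in groups.items():
        X = [[r[c] for c in feature_columns] for r in group]
        y = [r[label_column] for r in group]
        models[key] = MajorityClassifier().fit(X, y)
    return models


models = fit_per_partition(
    ROWS,
    partition_columns=["partition_column_1", "partition_column_2"],
    feature_columns=["col1"],
    label_column="label",
)
for key in sorted(models):
    print(key, models[key].label_)
```

Each key in models corresponds to one row of the fit() output table: the partition values identify the group, and the value is the model trained only on that group's rows.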

Predict the values

>>> obj.predict(X=df_x_classif)
partition_column_1  partition_column_2               col1               col2               col3              col4  lgbmclassifier_predict_1
                 1                  10   0.99439439131549  -0.27567053456055  -0.70972796584688  1.73887267745451                         0
                 0                  10  0.978567297446043  0.025385604406459  0.610391764305413  0.28601252697811                         0
                 0                  10  -1.18468659041155  -0.85172919725359  1.822723600123796  -0.5215796779933                         1
                 1                  10    1.5363770542458  -0.11054065723247  1.020172711715805  -0.6920498477843                         0
                 1                  11  -2.73967716718956  -0.13010695419370  0.093953229385568  0.94304608732251                         0
                 1                  11  0.378162519602174  1.532779214358461  1.469358769900291  0.15494742569691                         0
                 1                  10  -1.82691137830452  0.917221542115856  -0.05704286766218  0.87672677369045                         1
                 1                  10  -0.37522240076067  0.434807957731158  0.540094460524806  0.73242400975487                         1
                 1                  11  -0.52118931230111  1.364531848102473  -0.68944918454993  -0.6522935999350                         0
                 1                  11  -0.76157338825655  -2.36417381714118  0.020334181705243  -1.3479254226291                         0

Score per partition

>>> obj.score(df_x_classif, df_y_classif, sample_weight=df_train_classif.select("group_column"))
partition_column_1  partition_column_2               score
                 0                  11  0.8322981366459627
                 0                  10  0.8705994291151284
                 1                  10  0.5585585585585585
                 1                  11  0.5515088449531738
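
For a classifier, scikit-learn's score() convention, which LightGBM's LGBMClassifier inherits, is mean accuracy, optionally weighted by sample_weight. Assuming teradataml applies that same metric separately within each partition (an assumption, not something this page states), the per-partition number could be reproduced locally along these lines. The records list and the weighted_accuracy_per_partition helper are hypothetical and use made-up values.

```python
from collections import defaultdict


def weighted_accuracy_per_partition(records):
    """Compute weighted accuracy for each partition key.

    records: iterable of (partition_key, y_true, y_pred, weight) tuples.
    """
    correct = defaultdict(float)
    total = defaultdict(float)
    for key, y_true, y_pred, weight in records:
        total[key] += weight
        if y_true == y_pred:
            correct[key] += weight
    return {key: correct[key] / total[key] for key in total}


# Hypothetical labels, predictions, and weights for two partitions.
records = [
    ((0, 10), 1, 1, 2.0),  # correct, weight 2
    ((0, 10), 0, 1, 1.0),  # wrong, weight 1
    ((1, 11), 0, 0, 1.0),  # correct, weight 1
    ((1, 11), 1, 0, 3.0),  # wrong, weight 3
]

scores = weighted_accuracy_per_partition(records)
for key in sorted(scores):
    print(key, scores[key])
```

Rows with larger weights count more toward the score, so a partition with a few heavily weighted misclassifications can score lower than its unweighted accuracy would suggest.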

Access attributes

>>> obj.feature_importances_
   partition_column_1  partition_column_2  feature_importances_
0                   1                  11       [10, 10, 10, 0]
1                   0                  11         [3, 7, 10, 3]
2                   1                  10         [7, 18, 1, 2]
3                   0                  10        [10, 20, 0, 0]
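
The integer lists above look like split counts, which is what LightGBM reports under its default importance_type="split"; that the default is in effect here is an assumption. Raw counts are hard to compare across partitions, so one option is to convert each list into per-feature shares. The sketch below copies the values from the output above into a plain dictionary and assumes the list positions map to col1 through col4 in that order, which this page does not confirm.

```python
# Values copied from the feature_importances_ output, keyed by
# (partition_column_1, partition_column_2).
importances = {
    (1, 11): [10, 10, 10, 0],
    (0, 11): [3, 7, 10, 3],
    (1, 10): [7, 18, 1, 2],
    (0, 10): [10, 20, 0, 0],
}

# Assumption: importance positions follow the feature column order.
feature_names = ["col1", "col2", "col3", "col4"]


def normalize(counts):
    """Convert raw counts to fractions that sum to 1 (all zeros if no splits)."""
    total = sum(counts)
    if total == 0:
        return [0.0] * len(counts)
    return [c / total for c in counts]


# Per-partition mapping of feature name to its share of the total importance.
shares = {
    key: dict(zip(feature_names, normalize(counts)))
    for key, counts in importances.items()
}

# Feature with the largest share in each partition (first one wins on ties).
top_feature = {key: max(s, key=s.get) for key, s in shares.items()}

for key in sorted(shares):
    print(key, top_feature[key], shares[key])
```

Differences between partitions, such as col3 contributing nothing in partition (0, 10) but a third of the splits in (1, 11), are easier to spot in this normalized form.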

>>> obj.objective_
   partition_column_1  partition_column_2  objective_
0                   1                  11      binary
1                   0                  11      binary
2                   1                  10      binary
3                   0                  10      binary