Work with teradatamlspk StandardScaler Function - Teradata Package for Python

Teradata® pyspark2teradataml User Guide

Deployment: VantageCloud, VantageCore
Edition: Enterprise, IntelliFlex, VMware
Product: Teradata Package for Python
Release Number: 20.00
Published: December 2024
Language: English (United States)
Last Update: 2024-12-18
Product Category: Teradata Vantage

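  1. Load the required packages and create the DataFrame.
    >>> from teradatamlspk.ml.feature import StandardScaler
    >>> from teradatamlspk.sql.functions import monotonically_increasing_id

    >>> # df is assumed to be an existing teradatamlspk DataFrame that already
    >>> # contains the columns feature1, feature2, and feature3.
    >>> df = df.select(monotonically_increasing_id().alias('id'), "feature1", "feature2", "feature3")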
    >>> df.show()
    +--+--------+--------+--------+
    |id|feature1|feature2|feature3|
    +--+--------+--------+--------+
    | 3|     3.0|    10.1|     3.0|
    | 2|     2.0|     1.1|     1.0|
    | 1|     1.0|     0.1|    -1.0|
    +--+--------+--------+--------+
    
  2. Run the StandardScaler function on feature2 and feature3, then display the result.
    >>> scaler = StandardScaler(inputCol=["feature2", "feature3"], withMean=True)
    >>> scaled_df = scaler.fit(df).transform(df)
    >>> scaled_df.show()
    +--+--------+------------------+--------+
    |id|feature1|          feature2|feature3|
    +--+--------+------------------+--------+
    | 3|     3.0| 6.333333333333332|     2.0|
    | 2|     2.0|-2.666666666666667|     0.0|
    | 1|     1.0|-3.666666666666667|    -2.0|
    +--+--------+------------------+--------+
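
The values in the step 2 output match plain mean centering: each entry of feature2 and feature3 equals the original value minus its column mean, while feature1, which was not listed in inputCol, is unchanged. The following is a minimal sketch in plain Python that reproduces that arithmetic from the step 1 data. It does not call teradatamlspk, and the names `columns`, `center`, and `centered` are illustrative only. It checks the displayed numbers and makes no claim about how the library computes them or about the effect of other parameters such as withStd.

```python
# Feature columns copied from the step 1 output, in the displayed row order (id 3, 2, 1).
columns = {
    "feature2": [10.1, 1.1, 0.1],
    "feature3": [3.0, 1.0, -1.0],
}


def center(values):
    """Subtract the column mean from every value (mean centering only)."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


# Hypothetical helper result: one centered list per feature column.
centered = {name: center(values) for name, values in columns.items()}

for name, values in centered.items():
    # Round for display; the step 2 table shows the unrounded floating-point values.
    print(name, [round(v, 6) for v in values])
```

For feature2 the column mean is about 3.766667, so the centered values are about 6.333333, -2.666667, and -3.666667. For feature3 the mean is 1.0, giving 2.0, 0.0, and -2.0. Both agree with the table in step 2.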