Work with teradatamlspk StandardScaler Function - Teradata Vantage

Teradata® VantageCloud Lake

Deployment: VantageCloud
Edition: Lake
Product: Teradata Vantage
Published: January 2023
Locale: en-US
Last edition: 2024-12-11
  1. Load the required packages and create the DataFrame.
    >>> from teradatamlspk.ml.feature import StandardScaler
    >>> from teradatamlspk.sql.functions import monotonically_increasing_id
    
    >>> # df is assumed to be an existing DataFrame with columns feature1, feature2, and feature3.
    >>> df = df.select(monotonically_increasing_id().alias('id'), "feature1", "feature2", "feature3")
    >>> df.show()
    +--+--------+--------+--------+
    |id|feature1|feature2|feature3|
    +--+--------+--------+--------+
    | 3|     3.0|    10.1|     3.0|
    | 2|     2.0|     1.1|     1.0|
    | 1|     1.0|     0.1|    -1.0|
    +--+--------+--------+--------+
    
  2. Run the StandardScaler function.
    >>> scaler = StandardScaler(inputCol=["feature2", "feature3"], withMean=True)
    >>> scaled_df = scaler.fit(df).transform(df)
    >>> scaled_df.show()
    +--+--------+------------------+--------+
    |id|feature1|          feature2|feature3|
    +--+--------+------------------+--------+
    | 3|     3.0| 6.333333333333332|     2.0|
    | 2|     2.0|-2.666666666666667|     0.0|
    | 1|     1.0|-3.666666666666667|    -2.0|
    +--+--------+------------------+--------+
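
The scaled values shown above equal each input minus its column mean, which is the effect of withMean=True; in this output they do not appear to be divided by the standard deviation. As a rough cross-check, the sketch below reproduces that centering arithmetic in plain Python. It does not call teradatamlspk, and the names `rows`, `center`, and `centered` are hypothetical helpers introduced only for illustration.

```python
# Plain-Python check of mean-centering; this does not use teradatamlspk.
# Input values are copied from the feature2 and feature3 columns in step 1.
rows = {
    "feature2": [10.1, 1.1, 0.1],
    "feature3": [3.0, 1.0, -1.0],
}


def center(values):
    """Subtract the arithmetic mean of the list from every value."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


# Hypothetical helper mapping: column name -> mean-centered values.
centered = {name: center(values) for name, values in rows.items()}

for name, values in centered.items():
    # Round for display so floating-point noise does not obscure the comparison.
    print(name, [round(v, 4) for v in values])
```

Rounded to four decimal places, the centered values agree with the feature2 and feature3 columns of the scaled DataFrame; the last digits of the unrounded results may differ slightly because of floating-point arithmetic.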