In this example, the "data" dictionary contains lists of numbers that become the columns X and Y in the uploaded table and in the subsequent "normalize_test" teradataml DataFrame.
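The steps below assume that a connection to Vantage is already established and that the following imports are in place; this is a minimal setup sketch, and the import paths may vary slightly between teradataml releases.
import pandas as pd
from collections import OrderedDict
from teradataml import copy_to_sql, DataFrame
from teradatasqlalchemy.types import FLOAT, INTEGER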
- Define the "data" dictionary.
data = { 'X':[1, 2, 3], 'Y':[45, 65, 89] }
- Convert the dictionary into a pandas DataFrame and copy it to Vantage as the table "normalize_test".
df = pd.DataFrame(data)
copy_to_sql(df=df, table_name='normalize_test', if_exists="replace")
- Print the original DataFrame.
print("Original DataFrame:\n", df)
Out:
Original DataFrame:
    X   Y
0   1  45
1   2  65
2   3  89
- Create a DataFrame for normalization.
normalize_test = DataFrame.from_table("normalize_test")
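As an aside not shown in the original steps, the same teradataml DataFrame can also be created by passing the table name directly to the DataFrame constructor:
normalize_test = DataFrame("normalize_test")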
- Define a function "normalize" that computes a normalized value for each row.
import numpy as np
from numpy import asarray

def normalize(row):
    x_new = (row['X'] - np.mean([row['X'], row['Y']])) / (max(row['X'], row['Y']) - min(row['X'], row['Y']))
    return asarray([x_new, row['X']])
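To see what the function computes, take the first row (X=1, Y=45): the mean is 23 and the range is 44, so x_new = (1 - 23) / 44 = -0.5. Because X < Y in every row of this table, x_new always reduces to ((X - Y)/2) / (Y - X) = -0.5, which is why the X_NEW column in the final output is constant. As a quick local sanity check (an addition to the original example), the function can be applied row-wise to the pandas DataFrame:
# Each row yields [-0.5, X], matching the Vantage result further down.
print(df.apply(normalize, axis=1))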
- Call the teradataml.DataFrame.apply method on the "normalize_test" teradataml DataFrame with the function built in the previous step. The results of the operation are stored in the "output" variable.
output = normalize_test.apply(normalize, env_name=testenv, returns=OrderedDict([('X_NEW', FLOAT()), ('Y', INTEGER())]))
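The env_name argument refers to a remote user environment that must already exist on the Vantage system; here it is assumed that "testenv" was created beforehand, for example with teradataml's create_env. The sketch below uses placeholder names, and the available base environments differ per installation (they can be listed with list_base_envs()).
from teradataml import create_env, list_base_envs
# Placeholder setup: create the remote Python environment that apply() runs in.
# Pick a base_env from list_base_envs(); "python_3.10" here is only illustrative.
create_env(env_name="testenv", base_env="python_3.10", desc="apply() example environment")
testenv = "testenv"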
- Print the normalized data.
print('\nNormalized:\n', output)
Out:
Normalized:
    X_NEW  Y
0    -0.5  1
1    -0.5  2
2    -0.5  3
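If the normalized result is needed on the client side, the teradataml DataFrame returned by apply() can be brought back as a pandas DataFrame with its to_pandas() method:
# Materialize the apply() result locally as a pandas DataFrame.
local_result = output.to_pandas()
print(local_result)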