TD_RowNormalizeTransform normalizes input columns row-wise, using TD_RowNormalizeFit output.
Row normalization transforms a matrix or dataset so that every row has the same magnitude or scale. This makes rows easier to compare and avoids bias toward rows whose raw values are larger.
Suppose you have a table that represents the daily sales figures for a retail store over a period of five days.
Day 1 | Day 2 | Day 3 | Day 4 | Day 5 |
---|---|---|---|---|
120 | 150 | 80 | 200 | 90 |
90 | 110 | 100 | 120 | 130 |
200 | 180 | 150 | 170 | 190 |
One method of normalization is to divide each element in a row by the sum of all the elements in that row.
Day 1 | Day 2 | Day 3 | Day 4 | Day 5 |
---|---|---|---|---|
120/640 | 150/640 | 80/640 | 200/640 | 90/640 |
90/550 | 110/550 | 100/550 | 120/550 | 130/550 |
200/890 | 180/890 | 150/890 | 170/890 | 190/890 |
The result is a table in which each row sums to 1, so each value is that day's proportion (or percentage) of its row's total sales.
Day 1 | Day 2 | Day 3 | Day 4 | Day 5 |
---|---|---|---|---|
0.1875 | 0.2344 | 0.125 | 0.3125 | 0.1406 |
0.1636 | 0.2000 | 0.1818 | 0.2182 | 0.2364 |
0.2247 | 0.2022 | 0.1685 | 0.1910 | 0.2135 |
Each row now holds each day's share of that row's total sales. Because every row is on the same scale, the rows are easier to compare in a machine learning pipeline: no row carries more weight just because its raw values are larger.
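The sum-based normalization in the example above can be checked outside the database with a few lines of Python. This is a minimal sketch of the arithmetic only, not the TD_RowNormalizeTransform implementation. The function name `row_normalize`, the variable `sales`, and the handling of rows whose sum is zero are assumptions made for illustration.

```python
def row_normalize(rows):
    """Divide each element by the sum of its row.

    Assumption: a row whose sum is zero is returned unchanged,
    because the division would otherwise be undefined.
    """
    normalized = []
    for row in rows:
        total = sum(row)
        if total == 0:
            normalized.append(list(row))
        else:
            normalized.append([value / total for value in row])
    return normalized


# The three rows of daily sales figures from the example table.
sales = [
    [120, 150, 80, 200, 90],
    [90, 110, 100, 120, 130],
    [200, 180, 150, 170, 190],
]

normalized_sales = row_normalize(sales)

# Print each normalized row rounded to four decimals, with its row sum.
for row in normalized_sales:
    print([round(value, 4) for value in row], "sum =", round(sum(row), 4))
```

Each printed row should match the corresponding row of the normalized table, and each row sum should be 1 up to floating-point rounding.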