TD_RegressionEvaluator Usage Notes

Database Analytic Functions

Deployment: VantageCloud, VantageCore
Edition: Enterprise, IntelliFlex, VMware
Product: Analytics Database
Release Number: 17.20
Published: June 2022
Language: English (United States)
Last Update: 2024-04-06
Product Category: Teradata Vantage™

In linear regression, the relationship between the dependent variable Y and one or more independent variables X is modeled as linear. With a single independent variable, the model is a straight line.



The equation for a simple linear regression model is:

Y = β0 + β1X + ϵ

where Y is the dependent variable, X is the independent variable, β0 is the intercept, β1 is the slope, and ϵ is the error term. The slope represents the expected change in Y for a one-unit change in X, and the intercept represents the expected value of Y when X is zero.
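To make the roles of the intercept and slope concrete, the following minimal sketch evaluates the fitted part of the model (without the error term) for assumed coefficient values. The function name `predict` and the values β0 = 1.0 and β1 = 2.0 are hypothetical and chosen only for illustration.

```python
def predict(x, b0, b1):
    """Return the predicted value b0 + b1 * x (the error term is omitted)."""
    return b0 + b1 * x

# Hypothetical coefficients, for illustration only.
b0, b1 = 1.0, 2.0

# The intercept is the prediction at X = 0.
at_zero = predict(0.0, b0, b1)

# The slope is the change in the prediction for a one-unit change in X.
unit_change = predict(4.0, b0, b1) - predict(3.0, b0, b1)
```

Here `at_zero` equals the intercept and `unit_change` equals the slope, whichever starting value of X is used.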

The goal of linear regression is to estimate the values of β0 and β1 that minimize the sum of squared errors (SSE) between the predicted values and actual values. SSE is calculated as:

SSE = ∑ (Y - Ŷ)²

where Y is the actual value of the dependent variable, and Ŷ is the predicted value.
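As an illustration of the SSE calculation, here is a minimal Python sketch. The function name `sse` and the sample values are assumptions made for this example; it is not the implementation used by TD_RegressionEvaluator.

```python
def sse(actual, predicted):
    """Sum of squared differences between actual and predicted values."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted))

# Hypothetical observations and predictions, for illustration only.
actual = [3.0, 5.5, 6.5]
predicted = [3.0, 5.0, 7.0]

# Residuals are 0.0, 0.5, and -0.5, so the squares are 0.0, 0.25, and 0.25.
sse_value = sse(actual, predicted)
```

Squaring the residuals keeps positive and negative errors from canceling out and penalizes large errors more heavily than small ones.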

Several techniques can estimate the values of β0 and β1, including ordinary least squares (OLS) and gradient descent. OLS finds the values of β0 and β1 that minimize SSE by taking the partial derivatives of SSE with respect to β0 and β1, setting them to zero, and solving the resulting equations. Gradient descent is an optimization algorithm that iteratively adjusts the values of β0 and β1 in the direction that reduces SSE.
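The two estimation approaches can be sketched for the single-variable case as follows. This is a plain-Python illustration under stated assumptions, not how the Analytics Database fits models: the function names, the sample data, the learning rate, and the step count are all hypothetical. The gradient descent version minimizes SSE divided by the number of observations, which has the same minimizer as SSE.

```python
def fit_ols(xs, ys):
    """Closed-form OLS estimates (b0, b1) for simple linear regression."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("X must contain at least two distinct values")
    b1 = sxy / sxx             # slope
    b0 = y_mean - b1 * x_mean  # intercept
    return b0, b1

def fit_gradient_descent(xs, ys, learning_rate=0.01, steps=20000):
    """Iteratively adjust (b0, b1) to reduce SSE / n."""
    n = len(xs)
    b0, b1 = 0.0, 0.0
    for _ in range(steps):
        residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
        # Partial derivatives of SSE / n with respect to b0 and b1.
        grad_b0 = -2.0 * sum(residuals) / n
        grad_b1 = -2.0 * sum(r * x for r, x in zip(residuals, xs)) / n
        b0 -= learning_rate * grad_b0
        b1 -= learning_rate * grad_b1
    return b0, b1

# Hypothetical data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

b0_ols, b1_ols = fit_ols(xs, ys)
b0_gd, b1_gd = fit_gradient_descent(xs, ys)
```

On this data both methods arrive at approximately the same intercept and slope. The closed form is exact and needs no tuning, while gradient descent depends on the learning rate and step count but extends to problems where a closed-form solution is impractical.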

Use these metrics to evaluate the performance of a linear regression model:

Metric | Description
R-squared | Measures the proportion of variation in the dependent variable explained by the independent variables.
Mean squared error (MSE) | Measures the average of the squared differences between the predicted values and actual values.
Mean absolute error (MAE) | Measures the average of the absolute differences between the predicted values and actual values.
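The three metrics can be computed from actual and predicted values as in the following minimal sketch. It uses the standard textbook definitions, with R-squared computed as 1 - SSE/SST, where SST is the total sum of squares around the mean of the actual values. The function name, dictionary keys, and sample data are assumptions for illustration and do not reflect the output format of TD_RegressionEvaluator.

```python
def regression_metrics(actual, predicted):
    """Return R-squared, MSE, and MAE for paired actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(actual)
    errors = [y - y_hat for y, y_hat in zip(actual, predicted)]
    sse = sum(e ** 2 for e in errors)
    mse = sse / n
    mae = sum(abs(e) for e in errors) / n
    y_mean = sum(actual) / n
    sst = sum((y - y_mean) ** 2 for y in actual)  # total sum of squares
    if sst == 0:
        raise ValueError("R-squared is undefined when all actual values are equal")
    r2 = 1.0 - sse / sst
    return {"R2": r2, "MSE": mse, "MAE": mae}

# Hypothetical observations and predictions, for illustration only.
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.5, 5.5, 7.0, 9.0]

metrics = regression_metrics(actual, predicted)
```

For this data the errors are 0.5, -0.5, 0, and 0, which gives an MSE of 0.125, an MAE of 0.25, and an R-squared of 0.975. MSE is expressed in squared units of the dependent variable and is more sensitive to large errors, whereas MAE is in the same units as the dependent variable.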