Multiple Linear Regression analysis predicts the value of a dependent variable as a linear combination of independent variables, usually including a constant term. That is, it finds the b-coefficients in the following equation to predict the value of the dependent variable y based on the independent variables x1 through xn:

$$\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + \cdots + b_n x_n$$
The best values of the b-coefficients are those that minimize the sum of squared errors over all observations:

$$SSE = \sum_{i=1}^{m} \left( y_i - \hat{y}_i \right)^2$$
The preceding formula requires the actual value $y_i$ for each observation, for comparison with the predicted value $\hat{y}_i$.
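As a minimal sketch of this objective, the following Python snippet computes the predicted values and the sum of squared errors for a candidate coefficient vector. The data arrays X and y and the coefficients b are hypothetical examples, not part of the Analytics Library:

```python
import numpy as np

# Hypothetical example data: m = 5 observations of two independent variables.
X = np.array([[1.0, 2.0],
              [2.0, 1.0],
              [3.0, 4.0],
              [4.0, 3.0],
              [5.0, 6.0]])
y = np.array([7.1, 6.9, 14.2, 13.8, 21.0])

# Candidate coefficients: b[0] is the constant term, b[1:] multiply x1 and x2.
b = np.array([1.0, 2.0, 1.5])

# Predicted values: y_hat = b0 + b1*x1 + b2*x2 for each observation.
y_hat = b[0] + X @ b[1:]

# Sum of squared errors over all observations.
sse = np.sum((y - y_hat) ** 2)
print(sse)
```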
Least-Squared Errors
Multiple Linear Regression analysis uses a technique called least-squared errors. To minimize the sum of squared errors, this technique substitutes the equation for the estimated y value into the equation for the sum of squared errors, takes the partial derivative of the result with respect to each b-coefficient, and sets each partial derivative to 0. This operation finds the minimum with respect to all coefficient values and yields a system of simultaneous linear equations, one per coefficient, called the normal equations. For example, with two independent variables x1 and x2 and m observations:

$$
\begin{aligned}
b_0\, m + b_1 \sum x_1 + b_2 \sum x_2 &= \sum y \\
b_0 \sum x_1 + b_1 \sum x_1^2 + b_2 \sum x_1 x_2 &= \sum x_1 y \\
b_0 \sum x_2 + b_1 \sum x_1 x_2 + b_2 \sum x_2^2 &= \sum x_2 y
\end{aligned}
$$
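The sketch below, a plain-numpy illustration rather than anything from the library, forms these ∑ terms for the same hypothetical two-variable dataset used above and solves the resulting 3×3 system:

```python
import numpy as np

# Same hypothetical data as before: two independent variables, m = 5 rows.
X = np.array([[1.0, 2.0],
              [2.0, 1.0],
              [3.0, 4.0],
              [4.0, 3.0],
              [5.0, 6.0]])
y = np.array([7.1, 6.9, 14.2, 13.8, 21.0])

m = len(y)
x1, x2 = X[:, 0], X[:, 1]

# Coefficient matrix of the normal equations: the sigma terms on the left.
A = np.array([
    [m,        x1.sum(),        x2.sum()],
    [x1.sum(), (x1 * x1).sum(), (x1 * x2).sum()],
    [x2.sum(), (x1 * x2).sum(), (x2 * x2).sum()],
])

# Right-hand side: the sigma terms involving y.
rhs = np.array([y.sum(), (x1 * y).sum(), (x2 * y).sum()])

# Solve the 3x3 system for b0, b1, b2.
b = np.linalg.solve(A, rhs)
print(b)
```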
Matrix algebra can solve the preceding equations by computing the extended sum-of-squares-and-cross-products (ESSCP) matrix for the constant 1 and the variables x1, x2, and y (that is, by computing all the ∑ terms in the normal equations).
The Analytics Library Matrix Building function matrix builds the ESSCP matrix in the database. The linear regression function linear reads the ESSCP matrix and solves for the least-squares b-coefficients.
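To see why the ESSCP matrix is sufficient, the following sketch, again in plain numpy rather than the library's matrix and linear functions, builds the ESSCP matrix for the columns [1, x1, x2, y] and reads the normal-equation pieces directly out of it; A and rhs match the system shown above:

```python
import numpy as np

# Same hypothetical data as in the previous sketches.
X = np.array([[1.0, 2.0],
              [2.0, 1.0],
              [3.0, 4.0],
              [4.0, 3.0],
              [5.0, 6.0]])
y = np.array([7.1, 6.9, 14.2, 13.8, 21.0])

# Augment the data with a leading column of 1s (the constant term) and a
# trailing column of y values: Z has columns [1, x1, x2, y].
Z = np.column_stack([np.ones(len(y)), X, y])

# The ESSCP matrix is Z'Z: every entry is one of the sigma terms
# (sums of squares and cross-products over 1, x1, x2, and y).
esscp = Z.T @ Z

# The normal equations fall straight out of the ESSCP matrix:
# the upper-left block is the coefficient matrix, the last column the RHS.
A = esscp[:-1, :-1]      # sums involving 1, x1, x2
rhs = esscp[:-1, -1]     # sums involving y

b = np.linalg.solve(A, rhs)
print(b)  # least-squares b0, b1, b2
```

Because the ESSCP matrix depends on the data only through these sums, it can be accumulated in a single pass over the table, which is presumably why the work is split into a matrix-building step and a separate solving step.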