In the field of science, collecting data and fitting it with a model is essential. The most common case is one-dimensional fitting, where there is only one independent variable. By fitting, we usually mean the least-squares method.
Suppose we want to find the $n$ parameters $a_j$ in a linear function

$$f(x) = \sum_{j=1}^{n} a_j \phi_j(x)$$

with $m$ observed experimental data $(x_i, y_i)$.

Thus, we have a matrix equation

$$y = G a + e$$

where $y$ is an $m$-dimensional data column vector, $a$ is an $n$-dimensional parameter column vector, and $G$ is an $m \times n$ non-square matrix with $G_{ij} = \phi_j(x_i)$.
In order to determine the parameters, the number of data must satisfy $m \geq n$. When $m = n$, it is not really a fitting, because the degree of freedom is $m - n = 0$, so the estimated fitting error (the residual sum of squares divided by the degree of freedom) is ill-defined.
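As a minimal sketch of this setup (assuming NumPy and a hypothetical quadratic model with $n = 3$ basis functions and $m = 10$ data points, so the system is over-determined):

```python
import numpy as np

# Hypothetical example: fit y = a0 + a1*x + a2*x^2 (n = 3 parameters)
# to m = 10 data points, so m > n and the fit is over-determined.
rng = np.random.default_rng(0)
m, n = 10, 3
x = np.linspace(0.0, 1.0, m)
y = 1.0 + 2.0 * x + 3.0 * x**2 + 0.1 * rng.standard_normal(m)

# Design matrix G: each column is one basis function phi_j evaluated
# at the data points x_i, so G has shape (m, n).
G = np.column_stack([x**0, x**1, x**2])
print(G.shape)
```

Each row of $G$ corresponds to one data point, each column to one basis function.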
The least-squares method in matrix algebra is like solving an over-determined system. Multiply both sides by the transpose of $G$:

$$G^T y = G^T G\, a + G^T e.$$

Since the expectation of the error $e$ is zero, the estimated parameter is

$$\hat{a} = (G^T G)^{-1} G^T y.$$
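The normal-equation estimate above can be sketched as follows (assuming NumPy; the quadratic model and its true parameters $(1, 2, 3)$ are hypothetical choices for illustration):

```python
import numpy as np

# Generate hypothetical quadratic data with small noise.
rng = np.random.default_rng(1)
m = 50
x = np.linspace(0.0, 1.0, m)
y = 1.0 + 2.0 * x + 3.0 * x**2 + 0.01 * rng.standard_normal(m)

# Normal equations: (G^T G) a_hat = G^T y.
G = np.column_stack([np.ones(m), x, x**2])
a_hat = np.linalg.solve(G.T @ G, G.T @ y)  # solve, don't invert explicitly
print(a_hat)  # close to the true parameters (1, 2, 3)
```

In practice one solves the linear system (or uses `np.linalg.lstsq`) rather than forming the explicit inverse, which is numerically safer.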
The unbiased variance is

$$\sigma^2 = \frac{(y - G\hat{a})^T (y - G\hat{a})}{m - n},$$

where $m - n$ is the degree of freedom, which is the number of values that are free to vary. Many people are confused by the "$-1$" issue: when you fit the data with a single constant (the sample mean), $n = 1$, so the degree of freedom is $m - 1$. In fact, if you only want to calculate the sum of squared residuals (SSR), the degree of freedom does not enter; it only appears when converting the SSR into an unbiased variance.
The covariance of the estimated parameters is

$$\mathrm{Cov}(\hat{a}) = \sigma^2 \, (G^T G)^{-1}.$$
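The unbiased variance and parameter covariance can be computed together in a short sketch (assuming NumPy and a hypothetical straight-line model with $n = 2$):

```python
import numpy as np

# Hypothetical straight-line data: y = 0.5 + 1.5*x + noise.
rng = np.random.default_rng(2)
m, n = 100, 2
x = np.linspace(0.0, 1.0, m)
y = 0.5 + 1.5 * x + 0.2 * rng.standard_normal(m)

G = np.column_stack([np.ones(m), x])
a_hat = np.linalg.solve(G.T @ G, G.T @ y)

# Unbiased variance: SSR divided by the degree of freedom m - n.
r = y - G @ a_hat
sigma2 = (r @ r) / (m - n)

# Covariance of the estimated parameters: sigma^2 (G^T G)^{-1}.
cov = sigma2 * np.linalg.inv(G.T @ G)
print(np.sqrt(np.diag(cov)))  # one-sigma errors on each parameter
```

The square roots of the diagonal entries of the covariance matrix are the usual quoted parameter uncertainties.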
This is only a fast-food note on linear regression. It has a geometrical meaning: the column vectors of $G$ form a basis of a sub-space of parameters, and the data vector $y$ lies a bit outside this sub-space. Linear regression is a method to find the shortest distance from $y$ to the sub-space, i.e., $G\hat{a}$ is the orthogonal projection of $y$ onto the column space of $G$.
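This projection picture can be checked numerically: the residual $y - G\hat{a}$ should be orthogonal to every column of $G$. A sketch, assuming NumPy and hypothetical random data:

```python
import numpy as np

# Random over-determined system: m = 20 data, n = 3 parameters.
rng = np.random.default_rng(3)
m, n = 20, 3
G = rng.standard_normal((m, n))
y = rng.standard_normal(m)

a_hat = np.linalg.solve(G.T @ G, G.T @ y)

# The residual is orthogonal to the column space of G, so G^T r ~ 0
# up to floating-point round-off.
r = y - G @ a_hat
print(np.max(np.abs(G.T @ r)))
```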
The form of the covariance can be understood using a first-order Taylor series, i.e., propagation of uncertainty in matrix notation: since $\hat{a} = A y$ with $A = (G^T G)^{-1} G^T$, we have $\mathrm{Cov}(\hat{a}) = A \,\mathrm{Cov}(y)\, A^T = \sigma^2 (G^T G)^{-1}$, assuming $\mathrm{Cov}(y) = \sigma^2 I$.