Multi-dimensional Linear Regression


In science, collecting data and fitting it with a model is essential. The most common type of fitting is 1-dimensional fitting, where there is only one independent variable. By fitting, we usually mean the least-squares method.

Suppose we want to find the n parameters in a linear function

f(x_1, x_2,\cdots, x_n) = \sum_{i=1}^{n} a_i x_i

from m observed experimental data points

Y_j = f(x_{1j}, x_{2j}, \cdots, x_{nj}) + \epsilon_j = \sum_{i=1}^{n} a_i x_{ij} + \epsilon_j

Thus, we have a matrix equation

Y=X \cdot A + \epsilon

where Y is an m-dimensional data column vector, A is an n-dimensional parameter column vector, and X is an m \times n (generally non-square) matrix.

In order to determine the n parameters, the number of data points must satisfy m >= n. When m = n, it is not really a fitting, because the degrees of freedom DF = m - n = 0, so the fitting error cannot be estimated.

In matrix algebra, the least-squares method is almost a mechanical calculation. Multiply both sides by the transpose of X:

X^T \cdot Y = (X^T \cdot X) \cdot A + X^T \cdot \epsilon

(X^T\cdot X)^{-1} \cdot X^T \cdot Y = A + (X^T \cdot X)^{-1} \cdot X^T \cdot \epsilon

Since the expectation of \epsilon is zero, the expected parameter vector is

A = (X^T \cdot X)^{-1} \cdot X^T \cdot Y

The unbiased variance is

\sigma^2 = (Y - X\cdot A)^T \cdot (Y - X\cdot A) / DF

where DF is the degrees of freedom, i.e. the number of values that are free to vary. Many people are confused by the "-1" issue. In fact, if you only want to calculate the sum of squared residuals SSR, the degrees of freedom are always DF = m - n.

The covariance of the estimated parameters is

Var(A) = \sigma^2 (X^T\cdot X)^{-1}
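As a sanity check of the formulas above, here is a minimal numpy sketch; the number of data points, the noise level, and the "true" parameter values are made up purely for illustration.

```python
import numpy as np

# Minimal sketch of the least-squares formulas above (made-up data).
rng = np.random.default_rng(0)
m, n = 50, 3                              # m data points, n parameters
a_true = np.array([1.5, -2.0, 0.7])       # invented "true" parameters

X = rng.normal(size=(m, n))               # m x n design matrix
Y = X @ a_true + rng.normal(scale=0.3, size=m)   # Y = X.A + eps

# A = (X^T X)^-1 X^T Y  (solve the normal equations; solving is more
# numerically stable than forming the inverse explicitly)
A = np.linalg.solve(X.T @ X, X.T @ Y)

DF = m - n                                # degrees of freedom
resid = Y - X @ A
sigma2 = resid @ resid / DF               # unbiased residual variance
cov_A = sigma2 * np.linalg.inv(X.T @ X)   # Var(A) = sigma^2 (X^T X)^-1

print(A)                                  # estimated parameters
print(np.sqrt(np.diag(cov_A)))            # their standard errors
```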

This is only a fast-food note on linear regression. It has a geometrical meaning: the column vectors of X form a basis of a subspace, Y lies slightly outside this subspace, and linear regression finds the point of the subspace at the shortest distance from Y, i.e. the orthogonal projection of Y onto it.

The form of the variance can be understood using a Taylor series, or more directly using the variance in matrix notation, Var(A) = E\left[ (A - E(A)) \cdot (A - E(A))^T \right] .
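For completeness, a one-line check (assuming uncorrelated errors with equal variance, Var(\epsilon) = \sigma^2 I ): substituting the estimator A = A_{true} + (X^T X)^{-1} X^T \epsilon into this definition gives

Var(A) = (X^T X)^{-1} X^T (\sigma^2 I) X (X^T X)^{-1} = \sigma^2 (X^T X)^{-1} .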

 

 

 


Goodness of Fit


We assume each data point is taken from a distribution with mean \mu and variance \sigma^2 ,

Y\sim D(\mu, \sigma^2)

in which the mean can be a function of X.

For example, suppose we have data Y_i related to an independent variable X_i. We would like to know the relationship between Y_i and X_i, so we fit a function y = f(x).

After the fitting (least-squares method), we have a residual for each data point:

e_i = y_i - Y_i

This residual should follow the distribution

e \sim D(0, \sigma_e^2)

The goodness of fit is a measure of whether the distribution of the residuals agrees with the experimental error of each point, i.e. \sigma .

Thus, we would like to divide each residual by its \sigma and define the chi-squared:

\chi^2 = \sum_i e_i^2 / \sigma_{e_i}^2 .

We can see that the distribution of

e/\sigma_e \sim D(0, 1)

and, if D is a normal distribution, the sum of the squares of these standardized residuals follows a chi-squared distribution. Its mean is the degrees of freedom DF. Note that the mean and the peak of the chi-squared distribution are not the same: the peak (mode) is at DF - 2 (for DF \ge 2 ).
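As a rough numerical illustration of this chi-squared check (the data points, the straight-line model, and the per-point errors below are all invented for this sketch):

```python
import numpy as np
from scipy import stats

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([0.1, 1.9, 4.2, 5.8, 8.1, 9.9])
sigma = np.full_like(y, 0.3)               # known experimental error of each point

a, b = np.polyfit(x, y, 1, w=1.0 / sigma)  # weighted straight-line fit
e = y - (a * x + b)                        # residuals

chi2 = np.sum(e**2 / sigma**2)             # chi^2 = sum e_i^2 / sigma_i^2
DF = len(y) - 2                            # 2 fitted parameters (a and b)

# If the quoted errors describe the scatter correctly, chi2/DF should be
# close to 1 and the p-value should not be extreme.
print(chi2 / DF, stats.chi2.sf(chi2, DF))
```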


In the case where we do not know the error, the sample variance of the residuals is our best estimator of the true variance. The unbiased sample variance is

\sigma_s^2 = \left( \sum_i e_i^2 \right) / DF ,

where DF is the degrees of freedom. In the case of f(x) = a x + b , DF = n - 2 , because two parameters (a and b) are estimated from the n data points.
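When the errors are unknown, the same fit gives this variance estimate directly; a small sketch with made-up data:

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([0.2, 1.7, 4.3, 5.6, 8.2, 9.7])

a, b = np.polyfit(x, y, 1)          # unweighted straight-line fit
e = y - (a * x + b)                 # residuals

DF = len(y) - 2                     # n data points minus 2 fitted parameters
sigma_s2 = np.sum(e**2) / DF        # unbiased estimate of the true variance
print(sigma_s2)
```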

On Angular Momentum Addition & the Rotation Operator


Angular momentum comes in 2 kinds: orbital angular momentum L, which arises from a particle executing orbital motion in 3-dimensional space, and spin S, which is an internal degree of freedom that lets the particle "orbit" in place.

Thus, a general quantum state for a particle should contain not just the spatial and time parts but also the spin, since a complete state should contain all degrees of freedom:

\left| \Psi \right> = \left| x,t \right> \bigotimes \left| s \right>

When we "add" the orbital angular momentum and the spin together, we are actually doing:

J = L \bigotimes 1 + 1 \bigotimes S

where the 1 next to L is the identity on the spin space and the 1 next to S is the identity on the spatial (3-D) space.

The above is discussed in J. J. Sakurai's book.
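A small numpy sketch of this tensor-product structure, taking l = 1 for the orbital part and spin-1/2 for the spin part (the choice l = 1 is just for illustration):

```python
import numpy as np

hbar = 1.0

# l = 1 orbital angular momentum, z-component, in the |m = 1, 0, -1> basis
Lz = hbar * np.diag([1.0, 0.0, -1.0])

# spin-1/2 z-component
Sz = hbar / 2 * np.array([[1.0, 0.0], [0.0, -1.0]])

I_spin = np.eye(2)                  # identity on the spin space
I_orb = np.eye(3)                   # identity on the (truncated) orbital space

# J_z = L_z (x) 1 + 1 (x) S_z on the 6-dimensional product space
Jz = np.kron(Lz, I_spin) + np.kron(I_orb, Sz)

print(np.diag(Jz))                  # m_l + m_s for each product basis state
```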

The mathematics of L and S are exactly the same at the level of the rotation operator:

R_J (\theta) = \exp\left( - \frac {i}{\hbar} \theta J \right)

where J can be either L or S.

L only acts on the spatial state, while S only acts on the spin state, i.e.:

R_L(\theta) \left| s \right> = \left| s\right>

R_S(\theta) \left| x \right> = \left| x\right>

L_z can only take integer values (in units of \hbar ), but S_z can be either half-integer or integer. The half-integer values of S_z mean that the spin state has to be rotated through 2 full cycles in order to return to itself.
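A quick check of this statement with a matrix exponential (spin-1/2, rotation about z): the 2\pi rotation only flips the sign of the state, while 4\pi brings it back.

```python
import numpy as np
from scipy.linalg import expm

hbar = 1.0
Sz = hbar / 2 * np.array([[1.0, 0.0], [0.0, -1.0]], dtype=complex)

def R_S(theta):
    # R_S(theta) = exp(-i * theta * S_z / hbar)
    return expm(-1j * theta * Sz / hbar)

print(np.round(R_S(2 * np.pi), 3))  # -identity: the state picks up a minus sign
print(np.round(R_S(4 * np.pi), 3))  # +identity: back to the original state
```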

Thus, suppose the distinction between L and S is just man-made, and the degree of freedom in spin space actually comes from some real geometry in a higher dimension. Then the orbital angular momentum could actually change the spin state:

L \left| s \right> = \left | s' \right > = c \left| s \right>

But the effect would be very small, and

R_L (\theta) \left| s\right > = \exp\left( - \frac {i}{\hbar} \theta c \right)\left| s \right>

Since c is very small, we would have to rotate the state through a very large angle; the effect could then be seen by comparing with the rotation generated by the spin:

\left < R_L(\omega t) + R_S(\omega t) \right> = 2 ( 1+ \cos ( \omega ( c -1 ) t))

The experiment could be done as follows. We apply a rotating magnetic field at the Larmor frequency. At very low temperature the spin is isolated, and T_1 and T_2 are effectively \infty . The difference due to c would show up in a very long measurement time and exhibit an interference pattern.

If c is a complex number, it will cause a decay, which will also be reflected in the interference pattern.

If we can find this c, then we can reveal the other spatial dimension!

___________________________________

The problem is: how can we act the orbital angular momentum on the spin without the effect of the spin angular momentum, since L and S are always coupled?

One possibility is to make S zero; in a system of an electron and a positron, the total spin can be zero.

Another possibility is to act S on the spatial part, and this will change the energy level.

__________________________________

A more fundamental problem is: why do L and S commute? The possibility of writing

\left| \Psi \right> = \left| x,t \right> \bigotimes \left| s \right>

is due to the fact that the operators commute with each other. But why?

If we break L down into the position operator x and the momentum operator p, the question becomes: why do x and S commute, or p and S commute?

[x,S]=0 ?

[p,S]=0 ?

[p_x, S_y] \ne 0 ?

I will prove it later.
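In the usual tensor-product formalism these commutators vanish simply because the orbital operators and the spin operators act on different factors of the product space. Here is a numpy sketch of that structural point; a truncated l = 1 orbital space stands in for x or p, so this only illustrates the bookkeeping, not a proof about any deeper geometry.

```python
import numpy as np

hbar = 1.0

Lz = hbar * np.diag([1.0, 0.0, -1.0])                   # an orbital-space operator (l = 1)
Sy = hbar / 2 * np.array([[0.0, -1.0j], [1.0j, 0.0]])   # a spin-1/2 operator

A = np.kron(Lz, np.eye(2))      # orbital operator, extended to the product space
B = np.kron(np.eye(3), Sy)      # spin operator, extended to the product space

comm = A @ B - B @ A
print(np.allclose(comm, 0))     # True: operators on different factors commute
```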

___________________________________

Another problem is how to evaluate the Poisson bracket (commutator), since L and S do not act on spaces of the same dimension. Maybe we can write the eigenket in vector form:

\begin {pmatrix} \left|x, t \right> \\ \left|s\right> \end {pmatrix}

I am not sure.

 

___________________________________

Any vector operator must satisfy the following equation, due to rotational symmetry:

[V_i, J_j] = i \hbar V_k \quad (i, j, k running cyclically)

where J is the generator of rotations (the total angular momentum), though I am not sure whether this is restricted to real-space rotations. In any case, spin is a vector operator, thus

[S_x, L_y] = i \hbar S_z = - [S_y, L_x]

So L and S do not commute.