## Multi-dimension Linear Regression

In science, collecting data and fitting it with a model is essential. The most common type is 1-dimensional fitting, in which there is only one independent variable. By fitting, we usually mean the least-squares method.

Suppose we want to find the n parameters in a linear function

$f(x_1, x_2,\cdots, x_n) = \sum_{i=1}^{n} a_i x_i$

with $m$ observed experimental data points

$Y_j = f(x_{1j}, x_{2j}, \cdots, x_{nj}) + \epsilon_j = \sum_{i=1}^{n} a_i x_{ij} + \epsilon_j$

Thus, we have a matrix equation

$Y=X \cdot A + \epsilon$

where $Y$ is an $m$-dimensional data column vector, $A$ is an $n$-dimensional parameter column vector, and $X$ is an $m \times n$ non-square matrix with elements $X_{ji} = x_{ij}$.

In order to determine the $n$ parameters, the number of data points must satisfy $m \geq n$. When $m=n$, it is not really a fitting: the degree of freedom is $DF = m-n = 0$, so the fitting error, which is divided by $DF$, cannot be estimated.

In matrix algebra, the least-squares method is a direct calculation. Multiply both sides from the left by the transpose of $X$, then by the inverse of $X^T \cdot X$:

$X^T \cdot Y = (X^T \cdot X) \cdot A + X^T \cdot \epsilon$

$(X^T\cdot X)^{-1} \cdot X^T \cdot Y = A + (X^T \cdot X)^{-1} \cdot X^T \cdot \epsilon$

Since the expectation of $\epsilon$ is zero, the expected parameter is

$A = (X^T \cdot X)^{-1} \cdot X^T \cdot Y$
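The formula above can be tried out numerically. The following is a minimal sketch using NumPy; the function name `fit_linear` and the small data set are made up for illustration. It solves $(X^T \cdot X) \cdot A = X^T \cdot Y$ instead of forming the inverse explicitly, which gives the same $A$.

```python
import numpy as np

def fit_linear(X, Y):
    """Return A = (X^T X)^{-1} X^T Y for an m x n matrix X and m-vector Y."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    # Solving the normal equations avoids computing the inverse explicitly.
    return np.linalg.solve(X.T @ X, X.T @ Y)

# Hypothetical noise-free data generated with a_1 = 2 and a_2 = -1 (m = 4, n = 2).
X = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0],
              [2.0, 1.0]])
Y = X @ np.array([2.0, -1.0])

A = fit_linear(X, Y)
print(A)  # recovers the parameters used to generate Y, up to rounding
```

With noise-free data the fit returns the generating parameters; with noisy data it returns the least-squares estimate.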

The unbiased variance is

$\sigma^2 = (Y - X\cdot A)^T \cdot (Y - X\cdot A) / DF$

where $DF$ is the degree of freedom, which is the number of values that are free to vary. Many people are confused by the “-1” issue. In fact, for the sum of squared residuals (SSR), the degree of freedom is always $m - n$.

The covariance of the estimated parameters is

$Var(A) = \sigma^2 (X^T\cdot X)^{-1}$
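As a sketch of how these three formulas fit together, the snippet below computes $A$, $\sigma^2$ and $Var(A)$ for a straight line with an intercept. The function name `fit_with_errors` and the four data points are hypothetical, chosen only so that the numbers can be checked by hand.

```python
import numpy as np

def fit_with_errors(X, Y):
    """Return (A, sigma2, cov) following the formulas in the text."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    m, n = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    A = XtX_inv @ X.T @ Y                  # A = (X^T X)^{-1} X^T Y
    residual = Y - X @ A
    dof = m - n                            # DF = m - n
    sigma2 = residual @ residual / dof     # unbiased variance
    cov = sigma2 * XtX_inv                 # Var(A) = sigma^2 (X^T X)^{-1}
    return A, sigma2, cov

# Hypothetical data: x = 0, 1, 2, 3 with a column of ones for the intercept.
x = np.array([0.0, 1.0, 2.0, 3.0])
X = np.column_stack([x, np.ones_like(x)])
Y = np.array([0.1, 0.9, 2.1, 2.9])

A, sigma2, cov = fit_with_errors(X, Y)
print("A      =", A)                       # slope and intercept
print("sigma2 =", sigma2)
print("errors =", np.sqrt(np.diag(cov)))   # standard errors of the parameters
```

The square roots of the diagonal of `cov` are the usual quoted uncertainties of the parameters; the off-diagonal element is their covariance.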

This is only a fast-food note on linear regression. It has a geometrical meaning: the column vectors of the matrix $X$ form the basis of a sub-space, and each parameter vector $A$ picks out the point $X \cdot A$ in it. $Y$ lies a bit outside the sub-space. Linear regression is a method to find the point in the sub-space with the shortest distance to $Y$.
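The geometric picture can be checked numerically: at the least-squares solution the residual $Y - X \cdot A$ is orthogonal to every column of $X$, and any other point of the sub-space is farther from $Y$. This is a small sketch with made-up data; the shifted parameter vector `A_other` is an arbitrary choice for comparison.

```python
import numpy as np

# Hypothetical data: a straight line with intercept, m = 4 points, n = 2 parameters.
X = np.array([[0.0, 1.0],
              [1.0, 1.0],
              [2.0, 1.0],
              [3.0, 1.0]])
Y = np.array([0.1, 0.9, 2.1, 2.9])

A = np.linalg.solve(X.T @ X, X.T @ Y)
residual = Y - X @ A

# Orthogonality: X^T (Y - X A) vanishes at the least-squares solution.
overlap = X.T @ residual
print(overlap)

# Any other point X A_other in the sub-space is farther away from Y.
A_other = A + np.array([0.1, -0.05])
dist_fit = np.linalg.norm(residual)
dist_other = np.linalg.norm(Y - X @ A_other)
print(dist_fit, dist_other)
```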

The form of the variance can be understood using a Taylor series (error propagation). It can also be understood using the variance in matrix notation, $Var(A) = E[ (A - E(A)) \cdot (A - E(A))^T ]$.
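A sketch of the matrix-notation argument follows. It assumes that the errors are uncorrelated with a common variance, $E(\epsilon \cdot \epsilon^T) = \sigma^2 I$, which is not stated explicitly above. The symbol $\hat{A}$ is introduced here only to distinguish the estimate from the true parameter $A$. From the equation before taking the expectation,

$\hat{A} = (X^T\cdot X)^{-1} \cdot X^T \cdot Y = A + (X^T \cdot X)^{-1} \cdot X^T \cdot \epsilon$

so that $\hat{A} - E(\hat{A}) = (X^T \cdot X)^{-1} \cdot X^T \cdot \epsilon$, and

$Var(\hat{A}) = E[ (\hat{A} - E(\hat{A})) \cdot (\hat{A} - E(\hat{A}))^T ] = (X^T \cdot X)^{-1} \cdot X^T \cdot E(\epsilon \cdot \epsilon^T) \cdot X \cdot (X^T \cdot X)^{-1} = \sigma^2 (X^T \cdot X)^{-1}$

The last step uses $E(\epsilon \cdot \epsilon^T) = \sigma^2 I$ and the fact that $X^T \cdot X$ is symmetric.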

## Goodness of Fit

We assume each data point is drawn from a distribution with mean $\mu$ and variance $\sigma^2$

$Y\sim D(\mu, \sigma^2)$

in which the mean can be a function of $X$.

For example, we have data $Y_i$ that depend on an independent variable $X_i$. We would like to know the relationship between $Y_i$ and $X_i$, so we fit a function $y = f(x)$.

After the fitting (least-squares method), we will have a residual for each data point

$e_i = y_i - Y_i$

This residual should follow the distribution

$e \sim D(0, \sigma_e^2)$

The goodness of fit is a measure of whether the distribution of the residuals agrees with the experimental error of each point, i.e. $\sigma$.

Thus, we divide each residual by its $\sigma$ and define the chi-squared

$\chi^2 = \sum_i e_i^2 / \sigma_{e_i}^2$.

We can see that the distribution of

$e/\sigma_e \sim D(0, 1)$

and, if $D$ is a normal distribution, the sum of the squares of these normalized residuals follows the chi-squared distribution. It has a mean equal to the degree of freedom $DF$. Note that the mean and the peak of the chi-squared distribution are not the same: the peak is at $DF-2$ (for $DF \geq 2$).
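As a small illustration, the chi-squared of a fit can be computed directly from the residuals and the known errors, then compared with $DF$. The function name `chi_squared` and the numbers below are made up; the fitted values are assumed to come from a two-parameter fit, so $DF = 4 - 2$ here.

```python
import numpy as np

def chi_squared(y_fit, y_data, sigma):
    """Sum of squared residuals, each divided by its variance."""
    e = np.asarray(y_fit, dtype=float) - np.asarray(y_data, dtype=float)  # e_i = y_i - Y_i
    return float(np.sum((e / np.asarray(sigma, dtype=float)) ** 2))

# Hypothetical data, fitted values and experimental errors.
y_data = [0.1, 0.9, 2.1, 2.9]
y_fit = [0.06, 1.02, 1.98, 2.94]
sigma = [0.1, 0.1, 0.1, 0.1]

dof = 2  # assumed: 4 data points minus 2 fitted parameters
chi2 = chi_squared(y_fit, y_data, sigma)
print(chi2, chi2 / dof)  # a value of chi2 / dof near 1 means residuals match the errors
```

A chi-squared much larger than $DF$ suggests that the model is wrong or the errors are underestimated, and one much smaller suggests that the errors are overestimated.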

In the case that we don’t know the error, the sample variance of the residuals is our best estimator of the true variance. The unbiased sample variance is

$\sigma_s^2 = \sum_i e_i^2 / DF$,

where $DF$ is the degree of freedom. In the case of $f(x) = a x + b$ fitted to $m$ data points, $DF = m-2$: one degree of freedom is used by the coefficient $a$ of $x$ and one by the constant $b$, in agreement with the general rule $DF = m - n$ with $n = 2$ parameters.
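One way to check which divisor makes the variance estimate unbiased for a straight-line fit is a small simulation. This is only a sketch: the true line, the noise level, the number of points and the seed are arbitrary choices, and the noise is assumed to be Gaussian.

```python
import numpy as np

rng = np.random.default_rng(0)            # fixed seed so the run is repeatable
m, trials, sigma_true = 10, 20000, 1.0    # arbitrary illustrative values

x = np.arange(m, dtype=float)
X = np.column_stack([x, np.ones(m)])      # columns for a and b in f(x) = a x + b
H = X @ np.linalg.inv(X.T @ X) @ X.T      # projects data onto the fitted line

# Hypothetical true line a = 2, b = 1 plus noise with variance sigma_true**2.
Y = 2.0 * x + 1.0 + rng.normal(0.0, sigma_true, size=(trials, m))
residual = Y - Y @ H                      # H is symmetric, so each row is fitted
ssr = np.sum(residual ** 2, axis=1)       # sum of squared residuals per trial

mean_m_minus_2 = np.mean(ssr / (m - 2))
mean_m_minus_1 = np.mean(ssr / (m - 1))
print(mean_m_minus_2, mean_m_minus_1)
```

In this setup the average of `ssr / (m - 2)` comes out close to the true variance, while dividing by `m - 1` gives a systematically smaller value.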

## On angular momentum addition & rotation operator

the angular momentum has 2 kinds – orbital angular momentum $L$, which is caused by a particle executing orbital motion in 3-dimensional space, and spin $S$, which is an internal degree of freedom that lets the particle “orbit” in place.

thus, a general quantum state for a particle should contain not just the spatial part and the time part but also the spin, since a complete state should contain all degrees of freedom.

$\left| \Psi \right> = \left| x,t \right> \bigotimes \left| s \right>$

when we “add” the orbital angular momentum and the spin together, actually, we are doing:

$J = L \bigotimes 1 + 1 \bigotimes S$

where the 1 with L is the identity of the spin-space and the 1 with S is the identity of the 3-D space.
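As a concrete sketch of this construction, the matrices can be built with `np.kron`. The choice of $l = 1$ for the orbital part and spin 1/2 for the spin part is only an example, and units with $\hbar = 1$ are assumed.

```python
import numpy as np

hbar = 1.0  # assumed units

# Orbital angular momentum for l = 1, in the L_z eigenbasis (3 x 3).
Lx = hbar / np.sqrt(2) * np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=complex)
Ly = hbar / np.sqrt(2) * np.array([[0, -1j, 0], [1j, 0, -1j], [0, 1j, 0]], dtype=complex)
Lz = hbar * np.diag([1.0, 0.0, -1.0]).astype(complex)

# Spin 1/2, in the S_z eigenbasis (2 x 2).
Sx = hbar / 2 * np.array([[0, 1], [1, 0]], dtype=complex)
Sy = hbar / 2 * np.array([[0, -1j], [1j, 0]], dtype=complex)
Sz = hbar / 2 * np.array([[1, 0], [0, -1]], dtype=complex)

I_orb, I_spin = np.eye(3), np.eye(2)

def total(L, S):
    """J = L (x) 1 + 1 (x) S on the 6-dimensional product space."""
    return np.kron(L, I_spin) + np.kron(I_orb, S)

def comm(A, B):
    return A @ B - B @ A

Jx, Jy, Jz = total(Lx, Sx), total(Ly, Sy), total(Lz, Sz)

# J obeys the angular momentum algebra [J_x, J_y] = i hbar J_z.
print(np.allclose(comm(Jx, Jy), 1j * hbar * Jz))

# In this construction L (x) 1 and 1 (x) S act on different factors.
print(np.allclose(comm(np.kron(Lx, I_spin), np.kron(I_orb, Sy)), 0))
```

Both checks print `True`: the sum satisfies the same commutation relation as $L$ and $S$ separately, and the two terms of the sum commute with each other in this product-space representation.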

the above is discussed in J.J. Sakurai’s book.

the mathematics of $L$ and $S$ is exactly the same in the rotation operator.

$R_J (\theta) = Exp( - \frac {i}{\hbar} \theta J)$

where $J$ can be either $L$ or $S$.

$L$ can only act on the spatial state while $S$ can only act on the spin-state, i.e.:

$R_L(\theta) \left| s \right> = \left| s\right>$

$R_S(\theta) \left| x \right> = \left| x\right>$

$L_z$ can only have integral values but $S_z$ can have both half-integral and integral values. a half-integral value of $S_z$ means the spin-state has to rotate 2 full cycles in order to be the same again.
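The 2-cycle statement can be seen from the rotation operator about the z-axis, which is diagonal in the eigenbasis of $J_z$. Below is a minimal sketch; the function name `rotation_z` is hypothetical and the eigenvalues are given in units of $\hbar$.

```python
import numpy as np

def rotation_z(theta, m_values):
    """R(theta) = exp(-i theta J_z / hbar), written in the J_z eigenbasis.

    m_values are the eigenvalues of J_z in units of hbar.
    """
    m = np.asarray(m_values, dtype=float)
    return np.diag(np.exp(-1j * theta * m))

spin_half = [0.5, -0.5]         # S_z eigenvalues for spin 1/2
orbital_one = [1.0, 0.0, -1.0]  # L_z eigenvalues for l = 1

R_spin_2pi = rotation_z(2 * np.pi, spin_half)    # one full cycle
R_spin_4pi = rotation_z(4 * np.pi, spin_half)    # two full cycles
R_orb_2pi = rotation_z(2 * np.pi, orbital_one)   # one full cycle

print(np.allclose(R_spin_2pi, -np.eye(2)))  # spin state picks up a minus sign
print(np.allclose(R_spin_4pi, np.eye(2)))   # and returns after two cycles
print(np.allclose(R_orb_2pi, np.eye(3)))    # integral m returns after one cycle
```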

thus, suppose the difference between $L$ and $S$ is just man-made, and the degree of freedom in the spin-space actually comes from some real geometry in a higher dimension. then the orbital angular momentum could change the spin state:

$L \left| s \right> = \left | s' \right > = c \left| s \right>$

but the effect is so small and

$R_L (\theta) \left| s\right > = Exp( - \frac {i}{\hbar} \theta c )\left| s \right>$

the c is very small, but if we can rotate the state through a very large angle, its effect can be seen by comparing with the rotation by spin.

$\left < R_L(\omega t) + R_S(\omega t) \right> = 2 ( 1+ \cos ( \omega ( c -1 ) t ) )$
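The right-hand side can be reproduced numerically if the bracket is read as the squared modulus of the sum of two phase factors, $Exp(-i \omega c t)$ from $R_L$ and $Exp(-i \omega t)$ from $R_S$. That reading is an assumption made here, and the values of $\omega$ and $c$ below are purely illustrative.

```python
import numpy as np

def interference(omega, c, t):
    """|exp(-i omega c t) + exp(-i omega t)|^2 for a real, hypothetical c."""
    amplitude = np.exp(-1j * omega * c * t) + np.exp(-1j * omega * t)
    return np.abs(amplitude) ** 2

omega = 2 * np.pi   # illustrative angular frequency
c = 0.001           # illustrative small value of c
t = np.linspace(0.0, 10.0, 201)

signal = interference(omega, c, t)
closed_form = 2 * (1 + np.cos(omega * (c - 1) * t))
print(np.allclose(signal, closed_form))
```

The two expressions agree for real $c$, so the pattern oscillates at the frequency $\omega (c - 1)$, slightly shifted from $\omega$ when $c$ is small.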

the experiment can be done as follows. we apply a rotating magnetic field at the Larmor frequency. at a very low temperature, the spin is isolated, and $T_1$ and $T_2$ are effectively $\infty$. the difference due to c will show up in a very long time measurement and exhibit an interference pattern.

if $c$ is a complex number, it will cause a decay, and it will be reflected in the interference pattern.

if we find out this c, then we can reveal the other spatial dimension!

___________________________________

the problem is: how can we act with the orbital angular momentum on the spin without the effect of the spin angular momentum, since L and S are always coupled?

one possibility is to make S zero, as in a system of an electron and a positron in which the total spin is zero.

another possibility is to act with S on the spatial part, and this will change the energy level.

__________________________________

a more fundamental problem is: why do L and S commute? the possibility of writing this

$\left| \Psi \right> = \left| x,t \right> \bigotimes \left| s \right>$

is due to the operators commuting with each other. but why?

if we break down L into the position operator x and the momentum operator p, the question becomes: why do x and S commute, or p and S commute?

$[x,S]=0 ?$

$[p,S]=0 ?$

$[p_x, S_y] \ne 0 ?$

i will prove it later.

___________________________________

another problem is how to evaluate the Poisson bracket, since L and S do not have the same dimension. maybe we can write the eigenket in vector form:

$\begin {pmatrix} \left|x, t \right> \\ \left|s\right> \end {pmatrix}$

i am not sure.

___________________________________

For any vector operator, the following equation must be satisfied, due to rotation symmetry:

$[V_i, J_j] = i \hbar V_k$   with $(i, j, k)$ running cyclically,

where J is the generator of the rotation. but i am not sure whether it is restricted to real-space rotation. anyway, spin is a vector operator, thus

$[S_x, L_y] = i \hbar S_z = - [S_y, L_x]$

so, if this holds, L and S do not commute.