Solving the generalized eigenvalue problem


Given an operator H, we can find its eigenvalues in a given basis set |\phi_i\rangle . Suppose the eigenvector is

|\psi \rangle = \sum_i \alpha_i |\phi_i\rangle

Substituting into the eigenvalue equation,

H|\psi\rangle = \epsilon |\psi\rangle \implies  \sum_i H|\phi_i\rangle \alpha_i = \epsilon \sum_i |\phi_i\rangle \alpha_i

We can act \langle \phi_i| from the left; but in general, the basis is not orthogonal, so

H_{ij} \alpha_j = \epsilon S_{ij} \alpha_j \to H\cdot \alpha = \epsilon S\cdot \alpha

H_{ij} = \langle \phi_i|H|\phi_j\rangle , S_{ij} =  \langle \phi_i|\phi_j\rangle

This is the generalized eigenvalue problem.


One way to solve the problem is to “reconfigure” the basis so that it is orthogonal. However, in computation, a non-orthogonal basis can offer significant advantages. So, the other way is to split the problem: first solve S\cdot v = \sigma v.

The eigensystem of S is

S = P \cdot \sigma \cdot P^T, P\cdot P^T = P^T \cdot P = I

Here, \sigma is the diagonal matrix of eigenvalues. Now, we define a new, non-unitary matrix

\displaystyle R_{ij} = \frac{P_{ij}}{\sqrt{\sigma_j}}

Notice that R^T \neq R^{-1} .

Thus,

R^T \cdot S\cdot R = I \implies S = R^{-T} \cdot R^{-1}


We know that the form R^T \cdot Q \cdot R transforms an operator from one basis to another, i.e.

|\hat{\phi}_i \rangle = R|\phi_i\rangle

|\psi \rangle = \sum_i \alpha_i |\phi_i\rangle = \sum_i \alpha_i R^{-1} |\hat{\phi}_i\rangle = \sum_i \beta_i |\hat{\phi}_i \rangle \implies \alpha = R\cdot \beta

and for any operator,

\hat{Q} = R^T\cdot Q\cdot R


Putting this back into the generalized problem,

H\cdot\alpha = \epsilon S\cdot \alpha  = \epsilon R^{-T} \cdot R^{-1} \alpha

\implies R^T\cdot H \cdot \alpha = \epsilon R^{-1} \alpha

\implies \hat{H}\cdot \beta = R^T\cdot H \cdot R \cdot \beta = \epsilon \beta

Thus, we can solve \hat{H} for its eigensystem, then use \alpha = R\cdot \beta to recover the coefficients in the original basis.


For example,

\displaystyle \phi_1 = (1,0), \phi_2 = \frac{1}{\sqrt{2}}(1,1)

S = \begin{pmatrix} 1 & \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & 1 \end{pmatrix}

The eigensystem is

P =  \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}

\sigma =  \begin{pmatrix} 1 + \frac{1}{\sqrt{2}} & 0 \\ 0 & 1 - \frac{1}{\sqrt{2}} \end{pmatrix}

The matrix R is

R =  \begin{pmatrix} \frac{1}{\sqrt{2+\sqrt{2}}} & -\frac{1}{\sqrt{2-\sqrt{2}}} \\ \frac{1}{\sqrt{2+\sqrt{2}}} & \frac{1}{\sqrt{2-\sqrt{2}}} \end{pmatrix}

We can verify that

R^T \cdot S \cdot R = I

R^{-T} \cdot  R^{-1} = S

R^T \cdot R = \begin{pmatrix} 2-\sqrt{2} & 0 \\ 0 & 2+\sqrt{2} \end{pmatrix}

R \cdot R^T = \begin{pmatrix} 2 & -\sqrt{2} \\ -\sqrt{2} & 2 \end{pmatrix} = S^{-1}

Let the Hamiltonian be

H = \begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix} 

The Hamiltonian in the basis is

[H_S]_{ij} = \phi_i \cdot H\cdot \phi_j \implies H_S = \begin{pmatrix} 2 & \sqrt{2} \\ \sqrt{2} & \frac{3}{2} \end{pmatrix}

Diagonalizing H_S directly in the non-orthogonal frame gives

\displaystyle \sigma_s = \frac{1}{4}(7 \pm \sqrt{33})

which are not the correct eigenvalues.

Now we transform the Hamiltonian into the orthogonal basis,

\displaystyle H_R = R^T \cdot H_S \cdot R = \frac{1}{4}\begin{pmatrix} 6+\sqrt{2} & -\sqrt{2} \\ -\sqrt{2} & 6-\sqrt{2} \end{pmatrix}

The eigenvalues are

\sigma_\pm = 2, 1

which are the correct pair. The un-normalized eigenvectors are

\displaystyle \beta_{\pm} = \left(1, 1 \mp \sqrt{2}\right)

We can verify that

\alpha_{\pm} = R\cdot \beta_{\pm}

H_S \cdot \alpha_\pm = \sigma_\pm S\cdot \alpha_\pm
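
As a numerical check, here is a minimal numpy sketch of the whole procedure applied to this example (numpy's eigh returns the eigenvalues in ascending order):

import numpy as np

# the non-orthogonal basis and the "real space" Hamiltonian of the example
phi = np.array([[1, 0],
                [1/np.sqrt(2), 1/np.sqrt(2)]])   # rows are phi_1, phi_2
H = np.diag([2.0, 1.0])

S = phi @ phi.T         # S_ij = phi_i . phi_j
H_S = phi @ H @ phi.T   # [H_S]_ij = phi_i . H . phi_j

# step 1: eigensystem of S, with S = P . diag(sigma) . P^T
sigma, P = np.linalg.eigh(S)

# step 2: R_ij = P_ij / sqrt(sigma_j), so that R^T . S . R = I
R = P / np.sqrt(sigma)

# step 3: diagonalize H_R = R^T . H_S . R in the orthogonal basis
eps, beta = np.linalg.eigh(R.T @ H_S @ R)
alpha = R @ beta        # coefficients back in the original basis

print(eps)              # [1. 2.], the eigenvalues of H
for k, e in enumerate(eps):
    assert np.allclose(H_S @ alpha[:, k], e * S @ alpha[:, k])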

 

Conjugate Prior


Suppose we have a prior probability P(\theta), and an observation x with likelihood L(\theta|x) = P(x| \theta) . Then, according to Bayesian probability theory, the updated, posterior probability is

\displaystyle P(\theta |x) = \frac{P(x|\theta) P(\theta) }{ P(x) }, \quad P(x) = \int P(x|\theta') P(\theta') d\theta'

Here, P(\theta|x), P(\theta) are called conjugate distributions. P(\theta) is called the conjugate prior to the likelihood P(x|\theta) .

Suppose we have no knowledge of the prior probability; it is a fair assumption that P(\theta) = 1, i.e. uniform. Then the posterior probability is proportional to the likelihood function, i.e.

 P(\theta | x ) \propto L(\theta | x) = P(x|\theta) .

Now, suppose we know the prior probability. After updating with new information, if the prior probability is “stable”, the new (posterior) probability should have the same functional form as the old (prior) probability!

When the likelihood function is a binomial distribution, it turns out that the beta distribution is the “eigen” distribution that keeps its functional form after the update,

\displaystyle P(\theta | a, b) = \frac{\theta^{(a-1)} (1-\theta)^{(b-1)}}{Beta(a,b) }

where Beta(a,b) is the beta function, which serves as the normalization factor.

After s successful trials and r failed trials, the posterior is

\displaystyle P(\theta | s, r) = \frac{\theta^{(s+a-1)} (1-\theta)^{(r+b-1)}}{Beta(s+a,r+b) }

When a = b = 1 \implies Beta(1,1) = 1, the prior is uniform and the posterior reduces to the normalized likelihood function.
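
A minimal numerical sketch of this update using scipy's beta distribution (the counts s and r here are made up for illustration):

import numpy as np
from scipy.stats import beta

# conjugate update: Beta(a, b) prior + s successes and r failures
# -> Beta(a + s, b + r) posterior
a, b = 1, 1     # uniform prior, Beta(1, 1)
s, r = 7, 3     # hypothetical data: 7 successes, 3 failures

posterior = beta(a + s, b + r)

print(posterior.mean())            # (a+s)/(a+b+s+r) = 8/12
print(posterior.interval(0.95))    # a 95% credible interval for theta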

It is interesting to write the Bayesian equation in a “complete” form

\displaystyle P(\theta| s + a , r + b) \propto P(s, r | \theta) P(\theta | a, b)

Unfortunately, the beta distribution is undefined for a = 0 or b = 0; therefore, when no “prior” trials were taken, there is no “eigen” probability.

Remark: this topic is strongly related to the Laplace rule of succession.


Update: the reason for keeping the same functional form for the prior and the posterior is that the inference of the mean and variance is more “consistent”.

 

Probability Limit of a null experiment


Suppose we want to measure the probability of tails for a coin, and we find that after 100 tosses, no tail appears. What is the upper limit of the probability of tails?


Suppose we have a coin. We toss it once and it shows a head; obviously, we cannot say whether the coin is biased or not, because a single result carries little information. So we toss it 2 times and it shows a head again; and if we toss it 10 times, it shows 10 heads. Now we are quite confident that the coin is biased toward heads, but how biased is it?

Or, we ask: what is the likelihood for the probability of heads?


The probability of having r heads in n tosses, given that the probability of a head is p, is

P( r \textrm{ heads} | p ) = C^n_r p^r (1-p)^{n-r}

Now, we want to find out the “inverse”

P( p = x | r \textrm{ heads} )

This will generate a likelihood function.

L(x | r \textrm{ heads}) = P( r \textrm{ heads} | x ) = C^n_r x^r (1-x)^{n-r}

[Figure: the likelihood function L(x | r \textrm{ heads}) for several values of n and r.]


The peak position of the likelihood function for the binomial distribution is

\displaystyle x_{\max}  = \frac{r}{n}

Since the distribution has a width, we can estimate the width at a given confidence level.

From the graph, the more tosses, the smaller the width, and the narrower the uncertainty of the probability.

The binomial distribution approaches a normal distribution when n \to \infty ,

\displaystyle C^n_r p^r(1-p)^{n-r} \to \frac{1}{\sqrt{2\pi} \sigma} \exp\left( - \frac{(r- \mu)^2}{2\sigma^2} \right)

where \mu = n p and \sigma^2 = n p (1-p) .


When r = 0, i.e. no head is shown in n tosses, the likelihood function is

L(x | n, 0) =  (1 - x )^n

[Figure: the likelihood function L(x | n, 0) = (1-x)^n for several values of n.]

The half-maximum point is

\displaystyle x_{\textrm{HM}} = 1- \left(\frac{1}{2}\right)^{\frac{1}{n}}

We can take this as a “rough” upper limit for the probability of heads.
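
For the opening question (100 tosses, no tails), a one-line numerical estimate of this rough limit:

# half-maximum point of L(x | n, 0) = (1 - x)^n as a rough upper limit
def upper_limit(n: int) -> float:
    return 1 - 0.5 ** (1 / n)

print(upper_limit(100))   # ~0.0069, after 100 tosses with no tails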

Covariant and Contravariant


These concepts are very confusing. I made a diagram to help.

[Diagram: direct and inverse transforms of the basis, reciprocal basis, and coordinates.]

e, \hat{e} are two different bases (or coordinate frames) for a vector. The hat is just an indicator of the difference. Many people use e', but I find it confusing when the prime enters the transformation elements \alpha.

The Direct transform of a basis is

\hat{e}_k = \alpha_{k}^{i} e_i,   \hat{e} = T\cdot e

The Inverse transform is

e = T^{-1} \cdot \hat{e} \implies \alpha_{i}^k \alpha_{k}^{j} = \delta_{i}^{j}

Notice that the reciprocal basis transform can be different, \beta \neq \alpha .

\beta^l_j = g^{lk} \alpha_k^i g_{ij}

But in some cases they are equal; for example, if the basis is rotated, the reciprocal basis is also rotated by the same rotation.


A vector

\vec{v} = v^i e_i = v_i e^i = v \cdot e

In the change of basis,

\vec{v} = v \cdot T^{-1} \cdot T \cdot e

We can see that the coordinates always transform oppositely to the basis: when the basis is under the direct transform, the coordinates are under the inverse transform; therefore, they are contravariant.
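
A small numpy sketch of this opposition (the specific T, a rotation, and the coordinates v are arbitrary choices for illustration; the rows of e hold the basis vectors):

import numpy as np

# a concrete direct transform T of the basis: a rotation by 30 degrees
t = np.pi / 6
T = np.array([[np.cos(t), -np.sin(t)],
              [np.sin(t),  np.cos(t)]])

e = np.eye(2)              # rows are the basis vectors e_i
v = np.array([2.0, 1.0])   # contravariant coordinates v^i

e_hat = T @ e                  # basis: direct transform
v_hat = v @ np.linalg.inv(T)   # coordinates: inverse transform

# the vector itself is unchanged: v^i e_i = v-hat^i e-hat_i
assert np.allclose(v @ e, v_hat @ e_hat)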

on general quantum propagator


A general quantum propagator takes the exponential form,

D(\epsilon) = \exp(i \epsilon P),

where \epsilon has the same unit as q , a generalized coordinate, and P is the Hermitian operator of the canonical momentum p, such that

\displaystyle P |q \rangle = -i \frac{d}{dq} |q \rangle ,

\displaystyle p = \frac{\partial L}{\partial \dot{q}},

where L is the system Lagrangian.


I discussed this topic in a previous post. I found that there is no strong reason that a unitary operator has to be exponential. I mean, it can be, but that does not mean it must be.

I ask further,

Why does a general quantum propagator take the exponential form?


Consider P as a GENERATOR of translation, so that

\displaystyle P |q \rangle = -i \frac{d}{dq} |q \rangle ,

The above relation means that the magnitude of the first-order change of q is i P .

Thus, to first order, the propagator is

D^{(1)}(\epsilon) |q \rangle = |q + \epsilon \rangle = (1 + i\epsilon P )|q \rangle

For the higher-order changes, using the Taylor series of |q + \epsilon \rangle around q ,

\displaystyle |q + \epsilon \rangle = \sum_n \frac{\epsilon^n}{n!} \frac{d^n}{dq^n} |q \rangle = \sum_n \frac{(i\epsilon P)^n}{n!} |q \rangle = \exp(i \epsilon P) |q \rangle ,

we see that the complete form of the propagator must be exponential.
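
A quick numerical illustration that exponentiating the generator d/dq really produces a finite translation (the grid, the Gaussian test function, and the finite-difference derivative are arbitrary choices for this sketch):

import numpy as np
from scipy.linalg import expm

# check numerically that exp(eps * d/dx) f(x) = f(x + eps)
N, L = 256, 20.0
x = np.linspace(-L/2, L/2, N, endpoint=False)
dx = x[1] - x[0]

# central-difference d/dx with periodic boundary conditions
D = (np.roll(np.eye(N), 1, axis=1) - np.roll(np.eye(N), -1, axis=1)) / (2*dx)

f = np.exp(-x**2)       # a Gaussian test function
eps = 0.5
shifted = expm(eps * D) @ f

# agrees with f(x + eps) up to the finite-difference error
print(np.max(np.abs(shifted - np.exp(-(x + eps)**2))))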


This approach to understanding it is kind of the reverse of the previous post.

The answer to that question is:

Since the canonical momentum generates the first-order change of the canonical position, by the Taylor series the canonical translation operator must take an exponential form.

Does the existence of Planck constant indicate that we are living in a computer simulation?


Every time I come across an approximation, I feel kind of reluctant and uncomfortable, especially in theory. For example, the rotation operator is usually introduced as

D(\delta\omega) = 1 - i \delta\omega J_z

Any higher-order terms like \delta\omega^n, n>1, will be neglected, with the claim that those terms are too small to consider.

What if the real world had infinite digits to store information, such that those higher-order terms were not NEGLIGIBLE at all?


We know that there is a Planck scale, or Planck unit, for almost all physical quantities: the Planck length, Planck time, Planck energy, and so on. Everything smaller than those units is “undefined” or “not known”. Therefore, the higher-order terms can safely be neglected, as the \delta is already “infinitesimal”.

This situation is similar to a computer: we have finite digits to store a number, and everything smaller than the smallest digit is neglected. That causes floating-point error. Therefore, in high-precision calculations, the memory has to be very large.


These connections make me wonder:

Does the existence of Planck constant indicate that we are living in a computer simulation?

Implementation of Rotation


A vector \vec{r} in space has to be expressed using a basis \hat{e} as a reference and coordinates \vec{v}, such that

\displaystyle \vec{r} = \hat{e} \cdot \vec{v} = \sum_{i} \hat{e}_i v_i

A transform R(\vec{r}) of a vector is done by

\vec{r'} = R(\vec{r}) = \hat{e} \cdot R \cdot \vec{v}

This transform can be viewed as

(\hat{e}\cdot R) \cdot \vec{v} = \hat{e} \cdot ( R \cdot \vec{v})

The first changes the basis and keeps the coordinates; the latter changes the coordinates and keeps the basis.


The Euler angles are defined by

\vec{r'} = \hat{e} \cdot R_{z}(\alpha) \cdot R_{y}(\beta) \cdot R_{z}(\gamma) \cdot \vec{v}

Since the rotation matrices are fixed and not tied to any basis, this transform can be understood in different ways.

The implementation of the rotation is usually like this:

\vec{v'} = R_{z}(\alpha) \cdot R_{y}(\beta) \cdot R_{z}(\gamma) \cdot \vec{v}

The rotation order is \alpha \to \beta \to \gamma. In this order, the rotation uses the body (or current, intrinsic) axes.

If the rotation order is \gamma \to \beta \to \alpha, the rotation uses the fixed (or global, external, laboratory) axes.

To understand this, we have to add back the basis \hat{e}. When using the body axes, the left-most matrix actually acts on the basis (operationally):

\hat{e}\cdot R_{z}(\alpha) \cdot R_{y}(\beta) \cdot \vec{v} \to \hat{e'} \cdot R_y(\beta) \cdot \vec{v}

When using the fixed axes,

\hat{e}\cdot R_{z}(\alpha) \cdot R_{y}(\beta) \cdot \vec{v} \to \hat{e} \cdot R_z(\alpha) \cdot \vec{v'} \to \hat{e} \cdot \vec{v''}


Thus, if we use the Euler angles \alpha \to \beta \to \gamma in the fixed frame, the correct matrix is

R_{z}(\gamma) R_y(\beta) R_z(\alpha)
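
Both readings can be checked with scipy, whose Rotation.from_euler uses upper-case axis letters for intrinsic (body) rotations and lower-case for extrinsic (fixed) rotations; the angles here are arbitrary test values:

import numpy as np
from scipy.spatial.transform import Rotation

def Rz(t):
    return np.array([[np.cos(t), -np.sin(t), 0],
                     [np.sin(t),  np.cos(t), 0],
                     [0,          0,         1]])

def Ry(t):
    return np.array([[ np.cos(t), 0, np.sin(t)],
                     [ 0,         1, 0        ],
                     [-np.sin(t), 0, np.cos(t)]])

a, b, g = 0.3, 0.7, 1.1    # alpha, beta, gamma

M = Rz(a) @ Ry(b) @ Rz(g)

# intrinsic (body-axis) z-y-z, order alpha -> beta -> gamma
R_body = Rotation.from_euler('ZYZ', [a, b, g]).as_matrix()
# extrinsic (fixed-axis) z-y-z, order gamma -> beta -> alpha
R_fixed = Rotation.from_euler('zyz', [g, b, a]).as_matrix()

assert np.allclose(M, R_body) and np.allclose(M, R_fixed)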


In the following, we use the fixed frame unless otherwise specified.


When using the Euler angles \alpha \to \beta \to \gamma in the changing frame, the rotation \gamma is about the new axis \hat{n}(\theta = \beta , \phi = \alpha) . However, the matrix

R(\hat{n}, \omega) = R_z(\phi) \cdot R_y(\theta)\cdot R_z(\omega)

is NOT a rotation about that axis by that angle. As we can see, for a zero rotation \omega = 0 , any vector is still transformed by R_z(\phi) \cdot R_y(\theta) .


Now, suppose we have a rotation axis \hat{n} and a rotation angle \omega ; the rotation matrix can be constructed as follows.

Notice that the rotation matrix R is a unitary transform; collecting its eigenvectors as the columns of a matrix P,

\displaystyle R\cdot P = P \cdot \begin{pmatrix} 1 & 0 & 0 \\ 0 & a + i b & 0 \\ 0 & 0 & a-i b \end{pmatrix}

a^2 + b^2 = 1

|\textrm{Arg}(a + i b)| = \omega

Using \omega, solve a = \cos(\omega) , b = \sin(\omega).

Since the eigen vector of R_z(\alpha) are

\displaystyle p_0 = \hat{e_z}, p_{\pm} = \frac{1}{2}(\hat{e_x}\pm i \hat{e_y})

Since p_0 \to \hat{n}(\theta, \phi), we have

\displaystyle p_{\pm} \to \frac{1}{2}( \hat{e_x'} \pm i \hat{e_y'} )

such that

\hat{e_i'} = R_z(\phi)\cdot R_y(\theta) \cdot \hat{e_i}

Notice that the eigenvector corresponding to the eigenvalue a \pm i b is p_{\mp}.

Then the rotation matrix is (where D is the diagonal matrix of eigenvalues above)

R = P\cdot D\cdot P^{-1}
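
A sketch of this construction in numpy (the axis angles and \omega are arbitrary; the ordering of the columns of P is a choice, matched to D):

import numpy as np
from scipy.spatial.transform import Rotation

def Rz(t): return Rotation.from_euler('z', t).as_matrix()
def Ry(t): return Rotation.from_euler('y', t).as_matrix()

theta, phi, w = 0.6, 1.2, 0.8      # arbitrary axis angles and rotation angle
U = Rz(phi) @ Ry(theta)            # carries e_z to n-hat(theta, phi)

ex, ey, ez = np.eye(3)
p0 = U @ ez                        # image of p_0: the axis n-hat
pp = U @ (ex + 1j*ey) / 2          # image of p_plus,  eigenvalue a - i b
pm = U @ (ex - 1j*ey) / 2          # image of p_minus, eigenvalue a + i b

P = np.column_stack([p0, pm, pp])
D = np.diag([1, np.exp(1j*w), np.exp(-1j*w)])   # 1, a+ib, a-ib
R = (P @ D @ np.linalg.inv(P)).real

# the same matrix as conjugating R_z(w) into the rotated frame
assert np.allclose(R, U @ Rz(w) @ np.linalg.inv(U))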


Another way to look at this problem is that an operator can be “decomposed” into its eigenvectors and eigenvalues; using Dirac notation,

\displaystyle R(\omega) = \sum_{i} |v_i \rangle \omega_i \langle v_i |

The rotation about the axis can be obtained by transporting R_z(\omega) from (0,0) \to (\theta, \phi) ; thus, the eigenvectors have to be transformed, so the operator is

\displaystyle R(\hat{n}, \omega) = \sum_{i} R_z(\phi) R_y(\theta) |v_i \rangle \omega_i \langle v_i | R_y(-\theta) R_z(-\phi)

R(\hat{n}, \omega) = R_z(\phi) \cdot R_y(\theta) \cdot  R_z(\omega) \cdot R_y(-\theta) \cdot R_z(-\phi)

This is a relation between a fixed-frame rotation and a body rotation! In particular, \hat{n} can be the body z-axis, y-axis, or x-axis.

For example, let \hat{n} be the rotated y-axis, such that

\displaystyle \hat{n} = R_z(\alpha) \cdot \hat{y} , (\theta, \phi) = (\frac{\pi}{2}, \alpha+\frac{\pi}{2})

\displaystyle R(\hat{y'}, \omega) = R_z(\alpha+\frac{\pi}{2}) \cdot R_y(\frac{\pi}{2}) \cdot R_z(\omega) \cdot R_y(-\frac{\pi}{2}) \cdot R_z( -\alpha - \frac{\pi}{2})

Notice that

R_y(\frac{\pi}{2}) \cdot R_z(\omega) \cdot R_y(-\frac{\pi}{2}) = R_x(\omega)

R_z(\frac{\pi}{2}) \cdot R_x(\omega) \cdot R_z(-\frac{\pi}{2}) = R_y(\omega)

The above formulas are not difficult to imagine: decompose the middle operator into its eigensystem and use Dirac notation. For example, rotating the z-axis around the y-axis by 90 degrees gives the x-axis. Thus,

\displaystyle R(\hat{y'}, \omega) =R_{yB}(\omega) = R_z(\alpha) \cdot R_y(\omega) \cdot R_z(-\alpha)

We recovered the lab-frame to body-frame relation for the y-axis!


Also, a vector \vec{r} rotated by \omega around a unit vector \hat{n} to \vec{r'} is given by

\vec{r'} = \vec{r} \cos(\omega) + \hat{n} (\hat{n}\cdot \vec{r}) (1-\cos(\omega)) + \hat{n} \times \vec{r} \sin(\omega)
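
As a cross-check, a short sketch verifying that the conjugation form R(\hat{n}, \omega) agrees with this Rodrigues formula (the axis, angle, and test vector are arbitrary):

import numpy as np
from scipy.spatial.transform import Rotation

def Rz(t): return Rotation.from_euler('z', t).as_matrix()
def Ry(t): return Rotation.from_euler('y', t).as_matrix()

theta, phi, w = 0.6, 1.2, 0.8
n = np.array([np.sin(theta)*np.cos(phi),
              np.sin(theta)*np.sin(phi),
              np.cos(theta)])            # n-hat(theta, phi)

# conjugation form: R(n, w) = Rz(phi) Ry(theta) Rz(w) Ry(-theta) Rz(-phi)
R = Rz(phi) @ Ry(theta) @ Rz(w) @ Ry(-theta) @ Rz(-phi)

# Rodrigues' rotation formula applied to a test vector
r = np.array([1.0, 2.0, 3.0])
r_rot = r*np.cos(w) + n*np.dot(n, r)*(1 - np.cos(w)) + np.cross(n, r)*np.sin(w)

assert np.allclose(R @ r, r_rot)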
