4.5 HW 5

  4.5.1 Problems listing
  4.5.2 Problem 1 a (9.1.6)
  4.5.3 Problem 1 b (9.2.1 (ii))
  4.5.4 Problem 1 c (9.2.3)
  4.5.5 Problem 2
  4.5.6 Problem 9.2.5
  4.5.7 Problem 9.3.5
  4.5.8 Problem 9.5.6
  4.5.9 Problem 9.5.10
  4.5.10 key solution for HW 5

4.5.1 Problems listing


4.5.2 Problem 1 a (9.1.6)

   4.5.2.1 Part 1
   4.5.2.2 Part 2

Show that the following row vectors are linearly dependent. \(\begin {pmatrix} 1 & 1 & 0 \end {pmatrix} ,\begin {pmatrix} 1 & 0 & 1 \end {pmatrix} ,\begin {pmatrix} 3 & 2 & 1 \end {pmatrix} \). Show the opposite for \(\begin {pmatrix} 1 & 1 & 0 \end {pmatrix} ,\begin {pmatrix} 1 & 0 & 1 \end {pmatrix} ,\begin {pmatrix} 0 & 1 & 1 \end {pmatrix} \).

Solution

4.5.2.1 Part 1

Vectors \(\vec {V}_{1},\vec {V}_{2},\vec {V}_{3}\) are linearly dependent if we can find \(a,b,c\) not all zero, such that \[ a\vec {V}_{1}+b\vec {V}_{2}+c\vec {V}_{3}=\vec {0}\] Applying the above to the vectors we are given gives\begin {align} a\begin {pmatrix} 1\\ 1\\ 0 \end {pmatrix} +b\begin {pmatrix} 1\\ 0\\ 1 \end {pmatrix} +c\begin {pmatrix} 3\\ 2\\ 1 \end {pmatrix} & =\begin {pmatrix} 0\\ 0\\ 0 \end {pmatrix} \nonumber \\\begin {pmatrix} 1 & 1 & 3\\ 1 & 0 & 2\\ 0 & 1 & 1 \end {pmatrix}\begin {pmatrix} a\\ b\\ c \end {pmatrix} & =\begin {pmatrix} 0\\ 0\\ 0 \end {pmatrix} \nonumber \\ Ax & =0\tag {1} \end {align}

One way is to find \(\det \relax (A) \). If \(\det \relax (A) =0\) then a non-trivial solution \(x\) exists, which means the vectors are linearly dependent; otherwise they are linearly independent.

\[ \det \relax (A) =1\begin {vmatrix} 0 & 2\\ 1 & 1 \end {vmatrix} -1\begin {vmatrix} 1 & 2\\ 0 & 1 \end {vmatrix} +3\begin {vmatrix} 1 & 0\\ 0 & 1 \end {vmatrix} =-2-1+3=0 \]

Since \(\det \relax (A) =0\), the vectors are linearly dependent.

Another method is to actually solve for \(\begin {pmatrix} a\\ b\\ c \end {pmatrix} \) to see whether a nonzero solution can be obtained. Using Gaussian elimination

\(R_{2}=R_{2}-R_{1}\)\[\begin {pmatrix} 1 & 1 & 3\\ 0 & -1 & -1\\ 0 & 1 & 1 \end {pmatrix} \] \(R_{3}=R_{3}+R_{2}\)\[\begin {pmatrix} 1 & 1 & 3\\ 0 & -1 & -1\\ 0 & 0 & 0 \end {pmatrix} \] Hence the system becomes\[\begin {pmatrix} 1 & 1 & 3\\ 0 & -1 & -1\\ 0 & 0 & 0 \end {pmatrix}\begin {pmatrix} a\\ b\\ c \end {pmatrix} =\begin {pmatrix} 0\\ 0\\ 0 \end {pmatrix} \] Last row show that \(c\) is free variable. Hence it can be any value. Second row gives \(-b-c=0\) or \(b=-c\). First row gives \(a+b+3c=0\) or \(a=-b-3c=c-3c=-2c\). Therefore the solution is \begin {align*} \begin {pmatrix} a\\ b\\ c \end {pmatrix} & =\begin {pmatrix} -2c\\ -c\\ c \end {pmatrix} \\ & =c\begin {pmatrix} -2\\ -1\\ 1 \end {pmatrix} \end {align*}

There are an infinite number of solutions. Let \(c=1\). Hence one solution is \[\begin {pmatrix} a\\ b\\ c \end {pmatrix} =\begin {pmatrix} -2\\ -1\\ 1 \end {pmatrix} \] Since we found \(a,b,c\), not all zero, such that \(a\vec {V}_{1}+b\vec {V}_{2}+c\vec {V}_{3}=\vec {0}\), the vectors are linearly dependent.

4.5.2.2 Part 2

Vectors \(\vec {V}_{1},\vec {V}_{2},\vec {V}_{3}\) are linearly independent if the only solution to \[ a\vec {V}_{1}+b\vec {V}_{2}+c\vec {V}_{3}=\vec {0}\] is \(a=b=c=0\). As in part 1, we set up the \(Ax=0\) system and solve it to find out. \begin {align} a\begin {pmatrix} 1\\ 1\\ 0 \end {pmatrix} +b\begin {pmatrix} 1\\ 0\\ 1 \end {pmatrix} +c\begin {pmatrix} 0\\ 1\\ 1 \end {pmatrix} & =\begin {pmatrix} 0\\ 0\\ 0 \end {pmatrix} \nonumber \\\begin {pmatrix} 1 & 1 & 0\\ 1 & 0 & 1\\ 0 & 1 & 1 \end {pmatrix}\begin {pmatrix} a\\ b\\ c \end {pmatrix} & =\begin {pmatrix} 0\\ 0\\ 0 \end {pmatrix} \nonumber \\ Ax & =0\tag {2} \end {align}

One way is to find \(\det \relax (A) \). If \(\det \relax (A) =0\) then a non-trivial solution \(x\) exists, which means the vectors are linearly dependent; otherwise they are linearly independent.

\[ \det \relax (A) =1\begin {vmatrix} 0 & 1\\ 1 & 1 \end {vmatrix} -1\begin {vmatrix} 1 & 1\\ 0 & 1 \end {vmatrix} =-1-1=-2 \]

Since \(\det \relax (A) \neq 0\), the vectors are linearly independent.

Another method is to solve (2) directly. Using Gaussian elimination gives

\(R_{2}=R_{2}-R_{1}\)\[\begin {pmatrix} 1 & 1 & 0\\ 0 & -1 & 1\\ 0 & 1 & 1 \end {pmatrix} \] \(R_{3}=R_{3}+R_{2}\)\[\begin {pmatrix} 1 & 1 & 0\\ 0 & -1 & 1\\ 0 & 0 & 2 \end {pmatrix} \] Hence the system becomes\[\begin {pmatrix} 1 & 1 & 0\\ 0 & -1 & 1\\ 0 & 0 & 2 \end {pmatrix}\begin {pmatrix} a\\ b\\ c \end {pmatrix} =\begin {pmatrix} 0\\ 0\\ 0 \end {pmatrix} \] The last row gives \(c=0\). The second row gives \(-b+c=0\) or \(b=0\). The first row gives \(a+b=0\) or \(a=0\). Hence the solution is \[\begin {pmatrix} a\\ b\\ c \end {pmatrix} =\begin {pmatrix} 0\\ 0\\ 0 \end {pmatrix} \] Therefore \(a\vec {V}_{1}+b\vec {V}_{2}+c\vec {V}_{3}=\vec {0}\) implies \(a=b=c=0\), and therefore the vectors are linearly independent.
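As a quick cross-check (my own addition, not part of the assignment), both conclusions can be verified numerically. The sketch below assumes Python with numpy and uses the same matrices \(A\) as in Eqs. (1) and (2).

```python
# Quick numerical cross-check of parts 1 and 2 (assumes numpy is available).
import numpy as np

A_dep = np.array([[1, 1, 3],
                  [1, 0, 2],
                  [0, 1, 1]], dtype=float)   # part 1: columns are V1, V2, V3
A_ind = np.array([[1, 1, 0],
                  [1, 0, 1],
                  [0, 1, 1]], dtype=float)   # part 2: columns are V1, V2, V3

print(np.isclose(np.linalg.det(A_dep), 0), np.linalg.matrix_rank(A_dep))  # True, 2 -> dependent
print(np.isclose(np.linalg.det(A_ind), 0), np.linalg.matrix_rank(A_ind))  # False, 3 -> independent

# the hand-computed null vector (a, b, c) = (-2, -1, 1) indeed satisfies A x = 0
print(A_dep @ np.array([-2.0, -1.0, 1.0]))   # ~[0. 0. 0.]
```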

4.5.3 Problem 1 b (9.2.1 (ii))

Repeat the above calculation of expanding the vector in Eqn (9.2.32) but in the following basis, after first demonstrating its orthonormality. At the end check that the norm squared of the vector comes out to be \(6\).\begin {align*} |I\rangle & =\begin {pmatrix} \frac {1+i\sqrt {3}}{4}\\ -\frac {\sqrt {3}\left (1+i\right ) }{\sqrt {8}}\end {pmatrix} \\ |II\rangle & =\begin {pmatrix} \frac {\sqrt {3}\left (1+i\right ) }{\sqrt {8}}\\ \frac {\sqrt {3}+i}{4}\end {pmatrix} \end {align*}

The vector is \begin {equation} |V\rangle =\begin {pmatrix} 1+i\\ \sqrt {3}+i \end {pmatrix} \tag {9.2.32} \end {equation} Solution

First we need to check that the given basis vectors are orthogonal to each other and that each has norm \(1\). To check orthogonality\begin {align*} \langle I|II\rangle & =\begin {pmatrix} \frac {1+i\sqrt {3}}{4} & -\frac {\sqrt {3}\left (1+i\right ) }{\sqrt {8}}\end {pmatrix} ^{\ast }\begin {pmatrix} \frac {\sqrt {3}\left (1+i\right ) }{\sqrt {8}}\\ \frac {\sqrt {3}+i}{4}\end {pmatrix} \\ & =\begin {pmatrix} \frac {1-i\sqrt {3}}{4} & -\frac {\sqrt {3}\left (1-i\right ) }{\sqrt {8}}\end {pmatrix}\begin {pmatrix} \frac {\sqrt {3}\left (1+i\right ) }{\sqrt {8}}\\ \frac {\sqrt {3}+i}{4}\end {pmatrix} \\ & =\left (\frac {\left (1-i\sqrt {3}\right ) }{4}\right ) \left (\frac {\sqrt {3}\left (1+i\right ) }{\sqrt {8}}\right ) +\left (-\frac {\sqrt {3}\left (1-i\right ) }{\sqrt {8}}\right ) \left (\frac {\sqrt {3}+i}{4}\right ) \\ & =\frac {\left (1-i\sqrt {3}\right ) \left (\sqrt {3}+i\sqrt {3}\right ) }{4\sqrt {8}}-\frac {\left (\sqrt {3}-\sqrt {3}i\right ) \left (\sqrt {3}+i\right ) }{4\sqrt {8}}\\ & =\frac {\sqrt {3}+i\sqrt {3}-3i+3}{4\sqrt {8}}-\frac {3+\sqrt {3}i-3i+\sqrt {3}}{4\sqrt {8}}\\ & =0 \end {align*}

Since dot product is zero, then they are orthogonal to each others. To check the norm\begin {align*} \langle I|I\rangle & =\begin {pmatrix} \frac {1+i\sqrt {3}}{4} & -\frac {\sqrt {3}\left (1+i\right ) }{\sqrt {8}}\end {pmatrix} ^{\ast }\begin {pmatrix} \frac {1+i\sqrt {3}}{4}\\ -\frac {\sqrt {3}\left (1+i\right ) }{\sqrt {8}}\end {pmatrix} \\ & =\begin {pmatrix} \frac {1-i\sqrt {3}}{4} & -\frac {\sqrt {3}\left (1-i\right ) }{\sqrt {8}}\end {pmatrix}\begin {pmatrix} \frac {1+i\sqrt {3}}{4}\\ -\frac {\sqrt {3}\left (1+i\right ) }{\sqrt {8}}\end {pmatrix} \\ & =\left (\frac {\left (1-i\sqrt {3}\right ) }{4}\right ) \left ( \frac {\left (1+i\sqrt {3}\right ) }{4}\right ) +\left (-\frac {\sqrt {3}\left ( 1-i\right ) }{\sqrt {8}}\right ) \left (-\frac {\sqrt {3}\left (1+i\right ) }{\sqrt {8}}\right ) \\ & =\frac {\left (1-i\sqrt {3}\right ) \left (1+i\sqrt {3}\right ) }{16}+\frac {\left (\sqrt {3}-i\sqrt {3}\right ) \left (\sqrt {3}+i\sqrt {3}\right ) }{8}\\ & =\frac {1+3}{16}+\frac {3+3}{8}\\ & =\frac {4}{16}+\frac {6}{8}\\ & =1 \end {align*}

Since \(\langle I|I\rangle =\left \Vert I\right \Vert ^{2}\) then \(\left \Vert I\right \Vert ^{2}=1\) which means \(\left \Vert I\right \Vert =1\). Now we do the same for the second basis\begin {align*} \langle II|II\rangle & =\begin {pmatrix} \frac {\sqrt {3}\left (1+i\right ) }{\sqrt {8}} & \frac {\sqrt {3}+i}{4}\end {pmatrix} ^{\ast }\begin {pmatrix} \frac {\sqrt {3}\left (1+i\right ) }{\sqrt {8}}\\ \frac {\sqrt {3}+i}{4}\end {pmatrix} \\ & =\begin {pmatrix} \frac {\sqrt {3}-i\sqrt {3}}{\sqrt {8}} & \frac {\sqrt {3}-i}{4}\end {pmatrix}\begin {pmatrix} \frac {\sqrt {3}\left (1+i\right ) }{\sqrt {8}}\\ \frac {\sqrt {3}+i}{4}\end {pmatrix} \\ & =\left (\frac {\sqrt {3}-i\sqrt {3}}{\sqrt {8}}\right ) \left (\frac {\sqrt {3}\left (1+i\right ) }{\sqrt {8}}\right ) +\left (\frac {\sqrt {3}-i}{4}\right ) \left (\frac {\sqrt {3}+i}{4}\right ) \\ & =\frac {\left (\sqrt {3}-i\sqrt {3}\right ) \left (\sqrt {3}+i\sqrt {3}\right ) }{8}+\frac {\left (\sqrt {3}-i\right ) \left (\sqrt {3}+i\right ) }{16}\\ & =\frac {3+3}{8}+\frac {3+1}{16}\\ & =\frac {6}{8}+\frac {4}{16}\\ & =1 \end {align*}

which means \(\left \Vert II\right \Vert =1\). We finished showing the basis are orthonormal. Now we express the vector \(|V\rangle =\begin {pmatrix} 1+i\\ \sqrt {3}+i \end {pmatrix} \) in these basis. Let\[ |V\rangle =v_{1}|I\rangle +v_{2}|II\rangle \] To find \(v_{1}\), we take dot product of both sides w.r.t \(|I\rangle \). This gives\[ \langle I|V\rangle =v_{1}\langle I|I\rangle \] But \(\langle I|I\rangle =1\). Hence \begin {align*} v_{1} & =\langle I|V\rangle \\ & =\begin {pmatrix} \frac {1+i\sqrt {3}}{4} & -\frac {\sqrt {3}\left (1+i\right ) }{\sqrt {8}}\end {pmatrix} ^{\ast }\begin {pmatrix} 1+i\\ \sqrt {3}+i \end {pmatrix} \\ & =\begin {pmatrix} \frac {1-i\sqrt {3}}{4} & \frac {-\sqrt {3}+i\sqrt {3}}{\sqrt {8}}\end {pmatrix}\begin {pmatrix} 1+i\\ \sqrt {3}+i \end {pmatrix} \\ & =\frac {\left (1-i\sqrt {3}\right ) \left (1+i\right ) }{4}+\left ( \frac {-\sqrt {3}+i\sqrt {3}}{\sqrt {8}}\right ) \left (\sqrt {3}+i\right ) \\ & =\frac {1+i-i\sqrt {3}+\sqrt {3}}{4}+\frac {-3-\sqrt {3}i+3i-\sqrt {3}}{\sqrt {8}}\\ & =\frac {\sqrt {8}\left (1+i-i\sqrt {3}+\sqrt {3}\right ) +4\left (-3-\sqrt {3}i+3i-\sqrt {3}\right ) }{4\sqrt {8}}\\ & =\frac {\sqrt {8}+\sqrt {8}i-i\sqrt {24}+\sqrt {24}-12-4\sqrt {3}i+12i-4\sqrt {3}}{4\sqrt {8}}\\ & =\frac {\sqrt {8}+\sqrt {24}-12-4\sqrt {3}}{4\sqrt {8}}+i\frac {\sqrt {8}-\sqrt {24}-4\sqrt {3}+12}{4\sqrt {8}}\\ & =\frac {1}{4}\left (1+\sqrt {3}-\frac {12}{\sqrt {8}}-\frac {4\sqrt {3}}{\sqrt {8}}\right ) +i\frac {1}{4}\left (1-\sqrt {3}-\frac {4\sqrt {3}}{\sqrt {8}}+\frac {12}{\sqrt {8}}\right ) \\ & =\frac {1}{4}\left (1+\sqrt {3}-\frac {12}{2\sqrt {2}}-\frac {4\sqrt {3}}{2\sqrt {2}}\right ) +i\frac {1}{4}\left (1-\sqrt {3}-\frac {4\sqrt {3}}{2\sqrt {2}}+\frac {12}{2\sqrt {2}}\right ) \\ & =\frac {1}{4}\left (1+\sqrt {3}-\frac {6}{\sqrt {2}}-\frac {2\sqrt {3}}{\sqrt {2}}\right ) +i\frac {1}{4}\left (1-\sqrt {3}-\frac {2\sqrt {3}}{\sqrt {2}}+\frac {6}{\sqrt {2}}\right ) \\ & =\frac {1}{4}\left (1+\sqrt {3}-3\sqrt {2}-\sqrt {6}\right ) +i\frac {1}{4}\left (1-\sqrt {3}-\sqrt {6}+3\sqrt {2}\right ) \end {align*}

And\begin {align*} v_{2} & =\langle II|V\rangle \\ & =\begin {pmatrix} \frac {\sqrt {3}\left (1+i\right ) }{\sqrt {8}} & \frac {\sqrt {3}+i}{4}\end {pmatrix} ^{\ast }\begin {pmatrix} 1+i\\ \sqrt {3}+i \end {pmatrix} \\ & =\begin {pmatrix} \frac {\sqrt {3}\left (1-i\right ) }{\sqrt {8}} & \frac {\sqrt {3}-i}{4}\end {pmatrix}\begin {pmatrix} 1+i\\ \sqrt {3}+i \end {pmatrix} \\ & =\frac {\left (\sqrt {3}-i\sqrt {3}\right ) \left (1+i\right ) }{\sqrt {8}}+\left (\frac {\sqrt {3}-i}{4}\right ) \left (\sqrt {3}+i\right ) \\ & =\frac {\sqrt {3}+i\sqrt {3}-i\sqrt {3}+\sqrt {3}}{\sqrt {8}}+\frac {3+\sqrt {3}i-i\sqrt {3}+1}{4}\\ & =\frac {2\sqrt {3}}{\sqrt {8}}+\frac {3+1}{4}\\ & =\frac {2\sqrt {3}}{\sqrt {8}}+1\\ & =\frac {2\sqrt {3}}{2\sqrt {2}}+1\\ & =1+\sqrt {\frac {3}{2}} \end {align*}

Hence \begin {align*} |V\rangle & =\left (\frac {1}{4}\left (1+\sqrt {3}-3\sqrt {2}-\sqrt {6}\right ) +i\frac {1}{4}\left (1-\sqrt {3}-\sqrt {6}+3\sqrt {2}\right ) \right ) |I\rangle +\left (1+\sqrt {\frac {3}{2}}\right ) |II\rangle \\ & \equiv \begin {pmatrix} \frac {1}{4}\left (1+\sqrt {3}-3\sqrt {2}-\sqrt {6}\right ) +i\frac {1}{4}\left ( 1-\sqrt {3}-\sqrt {6}+3\sqrt {2}\right ) \\ 1+\sqrt {\frac {3}{2}}\end {pmatrix} \end {align*}

Now we check the square of the norm of \(|V\rangle \)\begin {align*} \left \Vert V\right \Vert ^{2} & =\left \Vert \frac {1}{4}\left (1+\sqrt {3}-3\sqrt {2}-\sqrt {6}\right ) +i\frac {1}{4}\left (1-\sqrt {3}-\sqrt {6}+3\sqrt {2}\right ) \right \Vert ^{2}+\left \Vert 1+\sqrt {\frac {3}{2}}\right \Vert ^{2}\\ & =\left (\frac {1}{4}\left (1+\sqrt {3}-3\sqrt {2}-\sqrt {6}\right ) \right ) ^{2}+\left (\frac {1}{4}\left (1-\sqrt {3}-\sqrt {6}+3\sqrt {2}\right ) \right ) ^{2}+\left (1+\sqrt {\frac {3}{2}}\right ) ^{2}\\ & =6 \end {align*}

Verified.
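As an optional numerical check (my own addition), the orthonormality, the components \(v_{1},v_{2}\), and the norm squared can be reproduced with a few lines of numpy; note that np.vdot conjugates its first argument, matching \(\langle \cdot |\cdot \rangle \).

```python
# Hedged numerical verification of the expansion above (assumes numpy).
import numpy as np

s3, s8 = np.sqrt(3), np.sqrt(8)
ket_I  = np.array([(1 + 1j*s3)/4, -s3*(1 + 1j)/s8])
ket_II = np.array([s3*(1 + 1j)/s8, (s3 + 1j)/4])
V      = np.array([1 + 1j, s3 + 1j])

# orthonormality: <I|I> = <II|II> = 1 and <I|II> = 0
print(np.vdot(ket_I, ket_I), np.vdot(ket_II, ket_II), np.vdot(ket_I, ket_II))

# components v1 = <I|V>, v2 = <II|V>, and the norm check |v1|^2 + |v2|^2 = 6
v1, v2 = np.vdot(ket_I, V), np.vdot(ket_II, V)
print(v1, v2, abs(v1)**2 + abs(v2)**2)   # last number is 6.0 (up to round-off)
```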

4.5.4 Problem 1 c (9.2.3)

Show how to go from the basis \[ |I\rangle =\begin {bmatrix} 3\\ 0\\ 0 \end {bmatrix} \qquad |II\rangle =\begin {bmatrix} 0\\ 1\\ 2 \end {bmatrix} \qquad |III\rangle =\begin {bmatrix} 0\\ 2\\ 5 \end {bmatrix} \] To the orthonormal basis\[ |1\rangle =\begin {bmatrix} 1\\ 0\\ 0 \end {bmatrix} \qquad |2\rangle =\begin {bmatrix} 0\\ \frac {1}{\sqrt {5}}\\ \frac {2}{\sqrt {5}}\end {bmatrix} \qquad |3\rangle =\begin {bmatrix} 0\\ \frac {-2}{\sqrt {5}}\\ \frac {1}{\sqrt {5}}\end {bmatrix} \] Solution

Using Gram-Schmidt method, let \(|1\rangle =\frac {|I\rangle }{\left \Vert I\right \Vert }=\begin {bmatrix} 3\\ 0\\ 0 \end {bmatrix} \frac {1}{3}=\begin {bmatrix} 1\\ 0\\ 0 \end {bmatrix} \). Now \begin {align*} |2^{\prime }\rangle & =|II\rangle -|1\rangle \langle 1|II\rangle \\ & =\begin {bmatrix} 0\\ 1\\ 2 \end {bmatrix} -\begin {bmatrix} 1\\ 0\\ 0 \end {bmatrix} \left ( \begin {bmatrix} 1 & 0 & 0 \end {bmatrix} ^{\ast }\begin {bmatrix} 0\\ 1\\ 2 \end {bmatrix} \right ) \\ & =\begin {bmatrix} 0\\ 1\\ 2 \end {bmatrix} -\begin {bmatrix} 1\\ 0\\ 0 \end {bmatrix} \relax (0) \\ & =\begin {bmatrix} 0\\ 1\\ 2 \end {bmatrix} \end {align*}

Hence \[ |2\rangle =\frac {|2^{\prime }\rangle }{\left \Vert 2^{\prime }\right \Vert }=\begin {bmatrix} 0\\ 1\\ 2 \end {bmatrix} \frac {1}{\sqrt {1+4}}=\begin {bmatrix} 0\\ \frac {1}{\sqrt {5}}\\ \frac {2}{\sqrt {5}}\end {bmatrix} \] And\begin {align*} |3^{\prime }\rangle & =|III\rangle -\left (|1\rangle \langle 1|III\rangle +|2\rangle \langle 2|III\rangle \right ) \\ & =\begin {bmatrix} 0\\ 2\\ 5 \end {bmatrix} -\left ( \begin {bmatrix} 1\\ 0\\ 0 \end {bmatrix} \left ( \begin {bmatrix} 1 & 0 & 0 \end {bmatrix} ^{\ast }\begin {bmatrix} 0\\ 2\\ 5 \end {bmatrix} \right ) +\begin {bmatrix} 0\\ \frac {1}{\sqrt {5}}\\ \frac {2}{\sqrt {5}}\end {bmatrix} \left ( \begin {bmatrix} 0 & \frac {1}{\sqrt {5}} & \frac {2}{\sqrt {5}}\end {bmatrix} ^{\ast }\begin {bmatrix} 0\\ 2\\ 5 \end {bmatrix} \right ) \right ) \\ & =\begin {bmatrix} 0\\ 2\\ 5 \end {bmatrix} -\left ( \begin {bmatrix} 1\\ 0\\ 0 \end {bmatrix} \relax (0) +\begin {bmatrix} 0\\ \frac {1}{\sqrt {5}}\\ \frac {2}{\sqrt {5}}\end {bmatrix} \frac {12}{\sqrt {5}}\right ) \\ & =\begin {bmatrix} 0\\ 2\\ 5 \end {bmatrix} -\begin {bmatrix} 0\\ \frac {12}{5}\\ \frac {24}{5}\end {bmatrix} \\ & =\begin {bmatrix} 0\\ 2-\frac {12}{5}\\ 5-\frac {24}{5}\end {bmatrix} \\ & =\begin {bmatrix} 0\\ -\frac {2}{5}\\ \frac {1}{5}\end {bmatrix} \end {align*}

Hence\begin {align*} |3\rangle & =\frac {|3^{\prime }\rangle }{\left \Vert 3^{\prime }\right \Vert }\\ & =\begin {bmatrix} 0\\ -\frac {2}{5}\\ \frac {1}{5}\end {bmatrix} \frac {1}{\sqrt {\frac {4}{25}+\frac {1}{25}}}\\ & =\begin {bmatrix} 0\\ -\frac {2\sqrt {5}}{5}\\ \frac {\sqrt {5}}{5}\end {bmatrix} \\ & =\begin {bmatrix} 0\\ -\frac {2}{\sqrt {5}}\\ \frac {1}{\sqrt {5}}\end {bmatrix} \end {align*}

Therefore the orthonormal basis are \[ |1\rangle =\begin {bmatrix} 1\\ 0\\ 0 \end {bmatrix} \qquad |2\rangle =\begin {bmatrix} 0\\ \frac {1}{\sqrt {5}}\\ \frac {2}{\sqrt {5}}\end {bmatrix} \qquad |3\rangle =\begin {bmatrix} 0\\ -\frac {2}{\sqrt {5}}\\ \frac {1}{\sqrt {5}}\end {bmatrix} \]
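A short Gram-Schmidt sketch (my own addition, assuming numpy) reproduces the orthonormal basis found above; the helper name gram_schmidt is mine, not from the text.

```python
import numpy as np

def gram_schmidt(vectors):
    """Orthonormalize the given vectors in order (classical Gram-Schmidt)."""
    basis = []
    for v in vectors:
        w = v - sum(np.vdot(b, v) * b for b in basis)  # remove projections onto earlier basis vectors
        basis.append(w / np.linalg.norm(w))            # normalize what is left
    return basis

kets = [np.array([3., 0., 0.]), np.array([0., 1., 2.]), np.array([0., 2., 5.])]
for b in gram_schmidt(kets):
    print(b)   # [1,0,0], [0, 1/sqrt(5), 2/sqrt(5)], [0, -2/sqrt(5), 1/sqrt(5)]
```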

4.5.5 Problem 2

Use \(\operatorname {Tr}\sigma _{i}=0,\sigma _{i}^{2}=I\) and \(\sigma _{i}\sigma _{j}=i\sum _{k}\epsilon _{ijk}\sigma _{k}\) to obtain the components of a general \(2\times 2\) matrix in the basis of \(\left \{ \sigma _{1},\sigma _{2},\sigma _{3},I\right \} \), where \(\sigma _{i}\) are the Pauli matrices and \(I\) is the identity matrix.

Solution

The Pauli matrices are\[ \sigma _{1}=\begin {bmatrix} 0 & 1\\ 1 & 0 \end {bmatrix} \qquad \sigma _{2}=\begin {bmatrix} 0 & -i\\ i & 0 \end {bmatrix} \qquad \sigma _{3}=\begin {bmatrix} 1 & 0\\ 0 & -1 \end {bmatrix} \] And\[ \sigma _{i}\sigma _{j}=\left \{ \begin {array} [c]{ccc}i\sum _{k}\epsilon _{ijk}\sigma _{k} & & i\neq j\\ I & & i=j \end {array} \right . \] We are given the basis \(\left \{ \sigma _{1},\sigma _{2},\sigma _{3},I\right \} \) in which to express a general \(2\times 2\) matrix. This means we want\begin {align} \begin {bmatrix} A_{11} & A_{12}\\ A_{21} & A_{22}\end {bmatrix} & =c_{1}\sigma _{1}+c_{2}\sigma _{2}+c_{3}\sigma _{3}+c_{4}I\nonumber \\ & =c_{1}\begin {bmatrix} 0 & 1\\ 1 & 0 \end {bmatrix} +c_{2}\begin {bmatrix} 0 & -i\\ i & 0 \end {bmatrix} +c_{3}\begin {bmatrix} 1 & 0\\ 0 & -1 \end {bmatrix} +c_{4}\begin {bmatrix} 1 & 0\\ 0 & 1 \end {bmatrix} \tag {1} \end {align}

Where \(c_{i}\) are weights to be found and \(\begin {bmatrix} A_{11} & A_{12}\\ A_{21} & A_{22}\end {bmatrix} \) is any general matrix.

Taking the trace of the LHS and RHS of (1) gives\begin {align*} \operatorname {Tr}\begin {bmatrix} A_{11} & A_{12}\\ A_{21} & A_{22}\end {bmatrix} & =\operatorname {Tr}\left (c_{1}\sigma _{1}\right ) +\operatorname {Tr}\left ( c_{2}\sigma _{2}\right ) +\operatorname {Tr}\left (c_{3}\sigma _{3}\right ) +\operatorname {Tr}\left (c_{4}I\right ) \\ A_{11}+A_{22} & =c_{1}\operatorname {Tr}\left (\sigma _{1}\right ) +c_{2}\operatorname {Tr}\left (\sigma _{2}\right ) +c_{3}\operatorname {Tr}\left (\sigma _{3}\right ) +c_{4}\operatorname {Tr}\relax (I) \end {align*}

But \(\operatorname {Tr}\left (\sigma _{i}\right ) =0,i=1,2,3\) and \(\operatorname {Tr}\relax (I) =2\). The above becomes\begin {align} A_{11}+A_{22} & =2c_{4}\nonumber \\ c_{4} & =\frac {A_{11}+A_{22}}{2} \tag {2} \end {align}

We have found one of the weights. Now we need to find the remaining.

Pre multiplying both sides of (1) by \(\sigma _{1}\) gives\[ \sigma _{1}\begin {bmatrix} A_{11} & A_{12}\\ A_{21} & A_{22}\end {bmatrix} =c_{1}\sigma _{1}^{2}+c_{2}\sigma _{1}\sigma _{2}+c_{3}\sigma _{1}\sigma _{3}+c_{4}\sigma _{1}I \] But from properties of Pauli matrix, \(\sigma _{1}^{2}=I\) and \(\sigma _{1}\sigma _{2}=i\sum _{k}\epsilon _{12k}\sigma _{k}=i\left ( \overset {0}{\overbrace {\epsilon _{121}}}\sigma _{1}+\overset {0}{\overbrace {\epsilon _{122}}}\sigma _{2}+\overset {+1}{\overbrace {\epsilon _{123}}}\sigma _{3}\right ) =i\sigma _{3}\) and \(\sigma _{1}\sigma _{3}=i\sum _{k}\epsilon _{13k}\sigma _{k}=i\left ( \overset {0}{\overbrace {\epsilon _{131}}}\sigma _{1}+\overset {-1}{\overbrace {\epsilon _{132}}}\sigma _{2}+\overset {0}{\overbrace {\epsilon _{133}}}\sigma _{3}\right ) =-i\sigma _{2}\) and \(\sigma _{1}I=\sigma _{1}\), Hence the above becomes\begin {align} \begin {bmatrix} 0 & 1\\ 1 & 0 \end {bmatrix}\begin {bmatrix} A_{11} & A_{12}\\ A_{21} & A_{22}\end {bmatrix} & =c_{1}I+ic_{2}\sigma _{3}-ic_{3}\sigma _{2}+c_{4}\sigma _{1}\nonumber \\\begin {bmatrix} A_{21} & A_{22}\\ A_{11} & A_{12}\end {bmatrix} & =\begin {bmatrix} c_{1} & 0\\ 0 & c_{1}\end {bmatrix} +ic_{2}\begin {bmatrix} 1 & 0\\ 0 & -1 \end {bmatrix} -ic_{3}\begin {bmatrix} 0 & -i\\ i & 0 \end {bmatrix} +c_{4}\begin {bmatrix} 0 & 1\\ 1 & 0 \end {bmatrix} \nonumber \end {align}

Taking the trace again of both sides gives\begin {align} A_{21}+A_{12} & =2c_{1}\nonumber \\ c_{1} & =\frac {A_{21}+A_{12}}{2} \tag {3} \end {align}

We now repeat the above process.

Pre-multiplying both sides of (1) by \(\sigma _{2}\) gives\[ \sigma _{2}\begin {bmatrix} A_{11} & A_{12}\\ A_{21} & A_{22}\end {bmatrix} =c_{1}\sigma _{2}\sigma _{1}+c_{2}\sigma _{2}^{2}+c_{3}\sigma _{2}\sigma _{3}+c_{4}\sigma _{2}I \] But from the properties of the Pauli matrices, \(\sigma _{2}^{2}=I\) and \(\sigma _{2}\sigma _{1}=i\sum _{k}\epsilon _{21k}\sigma _{k}=i\left ( \overset {0}{\overbrace {\epsilon _{211}}}\sigma _{1}+\overset {0}{\overbrace {\epsilon _{212}}}\sigma _{2}+\overset {-1}{\overbrace {\epsilon _{213}}}\sigma _{3}\right ) =-i\sigma _{3}\) and \(\sigma _{2}\sigma _{3}=i\sum _{k}\epsilon _{23k}\sigma _{k}=i\left ( \overset {+1}{\overbrace {\epsilon _{231}}}\sigma _{1}+\overset {0}{\overbrace {\epsilon _{232}}}\sigma _{2}+\overset {0}{\overbrace {\epsilon _{233}}}\sigma _{3}\right ) =i\sigma _{1}\) and \(\sigma _{2}I=\sigma _{2}\). Hence the above becomes\begin {align*} \begin {bmatrix} 0 & -i\\ i & 0 \end {bmatrix}\begin {bmatrix} A_{11} & A_{12}\\ A_{21} & A_{22}\end {bmatrix} & =-ic_{1}\sigma _{3}+c_{2}I+ic_{3}\sigma _{1}+c_{4}\sigma _{2}\\\begin {bmatrix} -iA_{21} & -iA_{22}\\ iA_{11} & iA_{12}\end {bmatrix} & =-ic_{1}\begin {bmatrix} 1 & 0\\ 0 & -1 \end {bmatrix} +c_{2}\begin {bmatrix} 1 & 0\\ 0 & 1 \end {bmatrix} +ic_{3}\begin {bmatrix} 0 & 1\\ 1 & 0 \end {bmatrix} +c_{4}\begin {bmatrix} 0 & -i\\ i & 0 \end {bmatrix} \end {align*}

Taking the trace of both sides of the above gives\begin {align} -iA_{21}+iA_{12} & =2c_{2}\nonumber \\ c_{2} & =i\left (\frac {A_{12}-A_{21}}{2}\right ) \tag {4} \end {align}

And finally, we repeat one more time to find final coefficient \(c_{3}\).

Pre multiplying both sides of (1) by \(\sigma _{3}\) gives

\[ \sigma _{3}\begin {bmatrix} A_{11} & A_{12}\\ A_{21} & A_{22}\end {bmatrix} =c_{1}\sigma _{3}\sigma _{1}+c_{2}\sigma _{3}\sigma _{2}+c_{3}\sigma _{3}^{2}+c_{4}\sigma _{3}I \] But from the properties of the Pauli matrices, \(\sigma _{3}^{2}=I\) and \(\sigma _{3}\sigma _{1}=i\sum _{k}\epsilon _{31k}\sigma _{k}=i\left ( \overset {0}{\overbrace {\epsilon _{311}}}\sigma _{1}+\overset {+1}{\overbrace {\epsilon _{312}}}\sigma _{2}+\overset {0}{\overbrace {\epsilon _{313}}}\sigma _{3}\right ) =i\sigma _{2}\) and \(\sigma _{3}\sigma _{2}=i\sum _{k}\epsilon _{32k}\sigma _{k}=i\left ( \overset {-1}{\overbrace {\epsilon _{321}}}\sigma _{1}+\overset {0}{\overbrace {\epsilon _{322}}}\sigma _{2}+\overset {0}{\overbrace {\epsilon _{323}}}\sigma _{3}\right ) =-i\sigma _{1}\) and \(\sigma _{3}I=\sigma _{3}\). Hence the above becomes\begin {align*} \begin {bmatrix} 1 & 0\\ 0 & -1 \end {bmatrix}\begin {bmatrix} A_{11} & A_{12}\\ A_{21} & A_{22}\end {bmatrix} & =ic_{1}\sigma _{2}-ic_{2}\sigma _{1}+c_{3}I+c_{4}\sigma _{3}\\\begin {bmatrix} A_{11} & A_{12}\\ -A_{21} & -A_{22}\end {bmatrix} & =ic_{1}\begin {bmatrix} 0 & -i\\ i & 0 \end {bmatrix} -ic_{2}\begin {bmatrix} 0 & 1\\ 1 & 0 \end {bmatrix} +c_{3}\begin {bmatrix} 1 & 0\\ 0 & 1 \end {bmatrix} +c_{4}\begin {bmatrix} 1 & 0\\ 0 & -1 \end {bmatrix} \end {align*}

Taking the trace of both sides of the above gives\begin {align} A_{11}-A_{22} & =2c_{3}\nonumber \\ c_{3} & =\frac {A_{11}-A_{22}}{2} \tag {5} \end {align}

Hence the weights are from Eq. (2,3,4,5) are\begin {align*} c_{1} & =\frac {A_{21}+A_{12}}{2}\\ c_{2} & =\frac {i}{2}\left (A_{12}-A_{21}\right ) \\ c_{3} & =\frac {A_{11}-A_{22}}{2}\\ c_{4} & =\frac {A_{11}+A_{22}}{2} \end {align*}

Therefore we can now write any \(A\) matrix as\begin {align} \begin {bmatrix} A_{11} & A_{12}\\ A_{21} & A_{22}\end {bmatrix} & =c_{1}\sigma _{1}+c_{2}\sigma _{2}+c_{3}\sigma _{3}+c_{4}I\nonumber \\ & =c_{1}\begin {bmatrix} 0 & 1\\ 1 & 0 \end {bmatrix} +c_{2}\begin {bmatrix} 0 & -i\\ i & 0 \end {bmatrix} +c_{3}\begin {bmatrix} 1 & 0\\ 0 & -1 \end {bmatrix} +c_{4}\begin {bmatrix} 1 & 0\\ 0 & 1 \end {bmatrix} \nonumber \\ & =\frac {A_{21}+A_{12}}{2}\begin {bmatrix} 0 & 1\\ 1 & 0 \end {bmatrix} +\frac {i}{2}\left (A_{12}-A_{21}\right ) \begin {bmatrix} 0 & -i\\ i & 0 \end {bmatrix} +\frac {A_{11}-A_{22}}{2}\begin {bmatrix} 1 & 0\\ 0 & -1 \end {bmatrix} +\frac {A_{11}+A_{22}}{2}\begin {bmatrix} 1 & 0\\ 0 & 1 \end {bmatrix} \tag {8} \end {align}

Verification

As an example, let us try the above on a random matrix \(A\), say\[ A=\begin {bmatrix} 1 & 2i\\ 2 & 99 \end {bmatrix} \] Using (8) gives\[ A=\frac {A_{21}+A_{12}}{2}\begin {bmatrix} 0 & 1\\ 1 & 0 \end {bmatrix} +\frac {i}{2}\left (A_{12}-A_{21}\right ) \begin {bmatrix} 0 & -i\\ i & 0 \end {bmatrix} +\frac {A_{11}-A_{22}}{2}\begin {bmatrix} 1 & 0\\ 0 & -1 \end {bmatrix} +\frac {A_{11}+A_{22}}{2}\begin {bmatrix} 1 & 0\\ 0 & 1 \end {bmatrix} \] But \(A_{11}=1,A_{12}=2i,A_{21}=2,A_{22}=99\). Hence the above becomes\begin {align*} A & =\frac {2+2i}{2}\begin {bmatrix} 0 & 1\\ 1 & 0 \end {bmatrix} +\frac {i}{2}\left (2i-2\right ) \begin {bmatrix} 0 & -i\\ i & 0 \end {bmatrix} +\frac {1-99}{2}\begin {bmatrix} 1 & 0\\ 0 & -1 \end {bmatrix} +\frac {1+99}{2}\begin {bmatrix} 1 & 0\\ 0 & 1 \end {bmatrix} \\ & =\begin {bmatrix} 0 & \frac {2+2i}{2}\\ \frac {2+2i}{2} & 0 \end {bmatrix} +\begin {bmatrix} 0 & -i\left (\frac {i}{2}\left (2i-2\right ) \right ) \\ i\left (\frac {i}{2}\left (2i-2\right ) \right ) & 0 \end {bmatrix} +\begin {bmatrix} \frac {-98}{2} & 0\\ 0 & \frac {98}{2}\end {bmatrix} +\begin {bmatrix} 50 & 0\\ 0 & 50 \end {bmatrix} \\ & =\begin {bmatrix} \frac {-98}{2}+50 & \frac {2+2i}{2}-i\left (\frac {i}{2}\left (2i-2\right ) \right ) \\ \frac {2+2i}{2}+i\left (\frac {i}{2}\left (2i-2\right ) \right ) & \frac {98}{2}+50 \end {bmatrix} \\ & =\begin {bmatrix} 1 & 2i\\ 2 & 99 \end {bmatrix} \end {align*}

Which is the correct \(A\) matrix.
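The same decomposition can also be checked numerically. The sketch below (my own addition, assuming numpy) uses the equivalent closed form \(c_{i}=\frac {1}{2}\operatorname {Tr}\left (\sigma _{i}A\right ) \) and \(c_{4}=\frac {1}{2}\operatorname {Tr}\left (A\right ) \), which is exactly what the trace manipulations above produced; the helper name pauli_components is mine.

```python
import numpy as np

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]])
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

def pauli_components(A):
    # c_i = Tr(sigma_i A)/2 for i = 1, 2, 3 and c_4 = Tr(A)/2
    return [np.trace(s @ A) / 2 for s in (s1, s2, s3, I2)]

A = np.array([[1, 2j], [2, 99]], dtype=complex)        # the example matrix used above
c1, c2, c3, c4 = pauli_components(A)
print(c1, c2, c3, c4)                                  # (2+2i)/2, i(2i-2)/2, -49, 50
print(np.allclose(c1*s1 + c2*s2 + c3*s3 + c4*I2, A))   # True
```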

4.5.6 Problem 9.2.5

Prove the triangle inequality starting with \(\left \Vert V+W\right \Vert ^{2}\). You must use \(\operatorname {Re}\langle V|W\rangle \leq |\ \langle V|W\rangle \ |\) and the Schwarz inequality. Show that the final inequality becomes an equality only if \(|V\rangle =a|W\rangle \) where \(a\) is real positive scalar.

Solution

Note: I am using \(\left \Vert V\right \Vert \) to mean the norm (magnitude) of a vector and \(|a|\) for the absolute value of a scalar.

The Schwarz inequality is given in 9.2.44 as\begin {equation} |\ \langle V|W\rangle \ |\ \leq \left \Vert V\right \Vert \left \Vert W\right \Vert \tag {9.2.44} \end {equation} The triangle inequality we need to prove is given in (9.2.45)\begin {equation} \left \Vert V+W\right \Vert \leq \left \Vert V\right \Vert +\left \Vert W\right \Vert \tag {9.2.45} \end {equation} Starting with \begin {align*} \left \Vert V+W\right \Vert ^{2} & =\langle \left (V+W\right ) |\left ( V+W\right ) \rangle \\ & =\langle V|V\rangle +\langle V|W\rangle +\langle W|V\rangle +\langle W|W\rangle \\ & =\langle V|V\rangle +\langle V|W\rangle +\langle V|W\rangle ^{\ast }+\langle W|W\rangle \\ & =\left \Vert V\right \Vert ^{2}+2\operatorname {Re}\langle V|W\rangle +\left \Vert W\right \Vert ^{2} \end {align*}

Since \(\operatorname {Re}\langle V|W\rangle \leq |\ \langle V|W\rangle \ |\) and, by the Schwarz inequality, \(|\ \langle V|W\rangle \ |\ \leq \left \Vert V\right \Vert \left \Vert W\right \Vert \), the above gives\[ \left \Vert V+W\right \Vert ^{2}\leq \left \Vert V\right \Vert ^{2}+2\left \Vert V\right \Vert \left \Vert W\right \Vert +\left \Vert W\right \Vert ^{2}=\left (\left \Vert V\right \Vert +\left \Vert W\right \Vert \right ) ^{2}\] Since both sides are non-negative, taking square roots gives\[ \left \Vert V+W\right \Vert \leq \left \Vert V\right \Vert +\left \Vert W\right \Vert \] which is the triangle inequality. For equality we need both \(\operatorname {Re}\langle V|W\rangle =|\ \langle V|W\rangle \ |\) and \(|\ \langle V|W\rangle \ |=\left \Vert V\right \Vert \left \Vert W\right \Vert \). The Schwarz inequality becomes an equality only when \(|V\rangle =a|W\rangle \) for some scalar \(a\); then \(\langle V|W\rangle =a^{\ast }\left \Vert W\right \Vert ^{2}\), and \(\operatorname {Re}\langle V|W\rangle =|\ \langle V|W\rangle \ |\) forces \(a\) to be real and positive. Hence the final inequality is an equality only if \(|V\rangle =a|W\rangle \) with \(a\) a real positive scalar.
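A numerical sanity check (my own addition, assuming numpy): the inequality holds for random complex vectors, and becomes an equality when \(|V\rangle =a|W\rangle \) with \(a\) real and positive.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=3) + 1j*rng.normal(size=3)
V = rng.normal(size=3) + 1j*rng.normal(size=3)
print(np.linalg.norm(V + W) <= np.linalg.norm(V) + np.linalg.norm(W))   # True

V = 2.5 * W   # equality case: V = a W with a real and positive
print(np.isclose(np.linalg.norm(V + W), np.linalg.norm(V) + np.linalg.norm(W)))  # True
```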

4.5.7 Problem 9.3.5

You have seen above the matrix \(R_{z}\) (9.3.19) that rotates by \(\frac {\pi }{2}\) about the \(z\) axis. Construct a matrix that rotates by an arbitrary angle about the \(z\) axis. Repeat for a rotation around the \(x\) axis by some other angle. Verify that each matrix is orthogonal. Take their products and verify that it is also orthogonal. Show in general that the product of two orthogonal matrices is orthogonal. (Remember the rule for the transpose of a product).

Solution

Equation 9.3.19 is\[ R_{z}\left (\frac {\pi }{2}\right ) =\begin {bmatrix} 0 & -1 & 0\\ 1 & 0 & 0\\ 0 & 0 & 1 \end {bmatrix} \] To construct rotation matrix \(\Omega \), we follow this guideline. \[ \Omega _{z}\left (\theta \right ) =\begin {bmatrix} \Omega _{11} & \Omega _{12} & \Omega _{13}\\ \Omega _{21} & \Omega _{22} & \Omega _{23}\\ \Omega _{31} & \Omega _{32} & \Omega _{33}\end {bmatrix} \] The first column of \(\Omega \) is the representation (components) of \(|1^{\prime }\rangle \) in terms of the original basis vectors \(|1\rangle ,\) \(|2\rangle ,\) \(|3\rangle \) before rotation.

Using normal notation, this is the same as saying first column gives the components of \(e_{x}^{\prime }\) in terms of unit original basis \(e_{x},e_{y},e_{z}\). The second column of \(\Omega \) is the components of \(|2^{\prime }\rangle \) in terms of the original basis vectors \(|1\rangle ,\) \(|2\rangle ,\) \(|3\rangle \) and third column is components of \(|3^{\prime }\rangle \) in terms of the original basis vectors \(|1\rangle ,\) \(|2\rangle ,\) \(|3\rangle \).

The representation is found using dot product. For example, first column of \(\Omega \) is\begin {align*} \Omega _{11} & =\langle 1|1^{\prime }\rangle \\ \Omega _{21} & =\langle 2|1^{\prime }\rangle \\ \Omega _{31} & =\langle 3|1^{\prime }\rangle \end {align*}

And so on for the rest of the columns. For an angle \(\theta \), a diagram helps to see the representation, since the dot product is the projection of \(|1^{\prime }\rangle \) onto the original basis. In other words, \(\langle 1|1^{\prime }\rangle \) is the projection of \(|1^{\prime }\rangle \) on \(|1\rangle \), \(\langle 2|1^{\prime }\rangle \) is the projection of \(|1^{\prime }\rangle \) on \(|2\rangle \), and so on. So we can read the components directly from the diagram.

Figure 4.11: Rotation around \(z\) by arbitrary angle \(\theta \)

We see from the diagram that\begin {align*} \langle 1|1^{\prime }\rangle & =\left \Vert 1\right \Vert \left \Vert 1^{\prime }\right \Vert \cos \theta \\ & =\cos \theta \end {align*}

The second equality follows because the basis vectors have norm \(1\). Similarly \[ \langle 2|1^{\prime }\rangle =\left \Vert 2\right \Vert \left \Vert 1^{\prime }\right \Vert \sin \theta =\sin \theta \] and \(\langle 3|1^{\prime }\rangle =0\), since the projection of \(|1^{\prime }\rangle \) on \(|3\rangle \) is zero: the rotation is around the \(z\) axis, so vectors in the \(xy\) plane remain in the \(xy\) plane. The above gives us the first column of \(\Omega \). So now we have\[ \Omega _{z}\left (\theta \right ) =\begin {bmatrix} \cos \theta & \Omega _{12} & \Omega _{13}\\ \sin \theta & \Omega _{22} & \Omega _{23}\\ 0 & \Omega _{32} & \Omega _{33}\end {bmatrix} \] The second column of \(\Omega \) contains the projections of \(|2^{\prime }\rangle \) on \(|1\rangle ,|2\rangle ,|3\rangle \), which are\begin {align*} \langle 1|2^{\prime }\rangle & =\left \Vert 1\right \Vert \left \Vert 2^{\prime }\right \Vert \sin \theta \\ & =\sin \theta \end {align*}

But this projection is along the negative \(|1\rangle \) direction, so we need to add a negative sign. Hence \(\langle 1|2^{\prime }\rangle =-\sin \theta \). Also\begin {align*} \langle 2|2^{\prime }\rangle & =\left \Vert 2\right \Vert \left \Vert 2^{\prime }\right \Vert \cos \theta \\ & =\cos \theta \end {align*}

and \(\langle 3|2^{\prime }\rangle =0\) since the rotation is only in the \(xy\) plane. For the third column, we see that \(|3^{\prime }\rangle \) remains the same as the original \(|3\rangle \), hence no change here. Therefore \[ \Omega _{z}\left (\theta \right ) =\begin {bmatrix} \cos \theta & -\sin \theta & 0\\ \sin \theta & \cos \theta & 0\\ 0 & 0 & 1 \end {bmatrix} \] We now do the rotation around the \(x\) axis to find \(\Omega _{x}\left (\phi \right ) \).

Figure 4.12: Rotation around \(x\) by arbitrary angle \(\phi \)

We see from the diagram that\[ |1^{\prime }\rangle =|1\rangle \] And\[ |2^{\prime }\rangle =\left (\cos \phi \right ) |2\rangle +\left (\sin \phi \right ) |3\rangle \] And\[ |3^{\prime }\rangle =-\left (\sin \phi \right ) |2\rangle +\left (\cos \phi \right ) |3\rangle \] Therefore\[ \Omega _{x}\left (\phi \right ) =\begin {bmatrix} 1 & 0 & 0\\ 0 & \cos \phi & -\sin \phi \\ 0 & \sin \phi & \cos \phi \end {bmatrix} \] Where the first column of the above matrix is the components of \(|1^{\prime }\rangle \) expressed in terms of \(|1\rangle ,|2\rangle ,|3\rangle \), the second column is the components of \(|2^{\prime }\rangle \) expressed in terms of \(|1\rangle ,|2\rangle ,|3\rangle \), and the third column is the components of \(|3^{\prime }\rangle \) expressed in terms of \(|1\rangle ,|2\rangle ,|3\rangle \).

Now we need to verify that \(\Omega _{z}\left (\theta \right ) \) and \(\Omega _{x}\left (\phi \right ) \) are orthogonal. What this means is that each column of the matrix is orthogonal to every other column in the same matrix. One way to do that is to multiply the matrix by its transpose: if the result is the identity matrix, then the matrix is orthogonal.

Verify \(\Omega _{z}\left (\theta \right ) \) is orthogonal\begin {align*} \Omega _{z}\left (\theta \right ) \Omega _{z}^{T}\left (\theta \right ) & =\begin {bmatrix} \cos \theta & -\sin \theta & 0\\ \sin \theta & \cos \theta & 0\\ 0 & 0 & 1 \end {bmatrix}\begin {bmatrix} \cos \theta & -\sin \theta & 0\\ \sin \theta & \cos \theta & 0\\ 0 & 0 & 1 \end {bmatrix} ^{T}\\ & =\begin {bmatrix} \cos \theta & -\sin \theta & 0\\ \sin \theta & \cos \theta & 0\\ 0 & 0 & 1 \end {bmatrix}\begin {bmatrix} \cos \theta & \sin \theta & 0\\ -\sin \theta & \cos \theta & 0\\ 0 & 0 & 1 \end {bmatrix} \\ & =\begin {bmatrix} \cos ^{2}\theta +\sin ^{2}\theta & \cos \theta \sin \theta -\sin \theta \cos \theta & 0\\ \sin \theta \cos \theta -\cos \theta \sin \theta & \sin ^{2}\theta +\cos ^{2}\theta & 0\\ 0 & 0 & 1 \end {bmatrix} \\ & =\begin {bmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end {bmatrix} \end {align*}

Verified.

Verify \(\Omega _{x}\left (\phi \right ) \) is orthogonal\begin {align*} \Omega _{x}\left (\phi \right ) \Omega _{x}^{T}\left (\phi \right ) & =\begin {bmatrix} 1 & 0 & 0\\ 0 & \cos \phi & -\sin \phi \\ 0 & \sin \phi & \cos \phi \end {bmatrix}\begin {bmatrix} 1 & 0 & 0\\ 0 & \cos \phi & -\sin \phi \\ 0 & \sin \phi & \cos \phi \end {bmatrix} ^{T}\\ & =\begin {bmatrix} 1 & 0 & 0\\ 0 & \cos \phi & -\sin \phi \\ 0 & \sin \phi & \cos \phi \end {bmatrix}\begin {bmatrix} 1 & 0 & 0\\ 0 & \cos \phi & \sin \phi \\ 0 & -\sin \phi & \cos \phi \end {bmatrix} \\ & =\begin {bmatrix} 1 & 0 & 0\\ 0 & \cos ^{2}\phi +\sin ^{2}\phi & \cos \phi \sin \phi -\sin \phi \cos \phi \\ 0 & \sin \phi \cos \phi -\cos \phi \sin \phi & \sin ^{2}\phi +\cos ^{2}\phi \end {bmatrix} \\ & =\begin {bmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end {bmatrix} \end {align*}

Verified.

The product is\begin {align*} \Omega _{x}\left (\phi \right ) \Omega _{z}\left (\theta \right ) & =\begin {bmatrix} 1 & 0 & 0\\ 0 & \cos \phi & -\sin \phi \\ 0 & \sin \phi & \cos \phi \end {bmatrix}\begin {bmatrix} \cos \theta & -\sin \theta & 0\\ \sin \theta & \cos \theta & 0\\ 0 & 0 & 1 \end {bmatrix} \\ & =\begin {bmatrix} \cos \theta & -\sin \theta & 0\\ \cos \phi \sin \theta & \cos \theta \cos \phi & -\sin \phi \\ \sin \phi \sin \theta & \cos \theta \sin \phi & \cos \phi \end {bmatrix} \end {align*}

To show that this product is also orthogonal, we compute \(\Delta =\left (\Omega _{x}\left (\phi \right ) \Omega _{z}\left (\theta \right ) \right ) \left ( \Omega _{x}\left (\phi \right ) \Omega _{z}\left (\theta \right ) \right ) ^{T}\):

\begin {align*} \Delta & =\begin {bmatrix} \cos \theta & -\sin \theta & 0\\ \cos \phi \sin \theta & \cos \theta \cos \phi & -\sin \phi \\ \sin \phi \sin \theta & \cos \theta \sin \phi & \cos \phi \end {bmatrix}\begin {bmatrix} \cos \theta & -\sin \theta & 0\\ \cos \phi \sin \theta & \cos \theta \cos \phi & -\sin \phi \\ \sin \phi \sin \theta & \cos \theta \sin \phi & \cos \phi \end {bmatrix} ^{T}\\ & =\begin {bmatrix} \cos \theta & -\sin \theta & 0\\ \cos \phi \sin \theta & \cos \theta \cos \phi & -\sin \phi \\ \sin \phi \sin \theta & \cos \theta \sin \phi & \cos \phi \end {bmatrix}\begin {bmatrix} \cos \theta & \cos \phi \sin \theta & \sin \phi \sin \theta \\ -\sin \theta & \cos \theta \cos \phi & \cos \theta \sin \phi \\ 0 & -\sin \phi & \cos \phi \end {bmatrix} \end {align*}

Expanding gives

\[ \Delta =\begin {bmatrix} \cos ^{2}\theta +\sin ^{2}\theta & \cos \theta \cos \phi \sin \theta -\sin \theta \cos \theta \cos \phi & \cos \theta \sin \phi \sin \theta -\sin \theta \cos \theta \sin \phi \\ \cos \phi \sin \theta \cos \theta -\sin \theta \cos \theta \cos \phi & \cos ^{2}\phi \sin ^{2}\theta +\cos ^{2}\theta \cos ^{2}\phi +\sin ^{2}\phi & \cos \phi \sin ^{2}\theta \sin \phi +\cos ^{2}\theta \cos \phi \sin \phi -\sin \phi \cos \phi \\ \sin \phi \sin \theta \cos \theta -\sin \theta \cos \theta \sin \phi & \sin \phi \sin ^{2}\theta \cos \phi +\cos ^{2}\theta \sin \phi \cos \phi -\cos \phi \sin \phi & \sin ^{2}\phi \sin ^{2}\theta +\cos ^{2}\theta \sin ^{2}\phi +\cos ^{2}\phi \end {bmatrix} \]

Simplifying

\begin {align*} \Delta & =\begin {bmatrix} 1 & 0 & 0\\ 0 & \cos ^{2}\phi \left (\sin ^{2}\theta +\cos ^{2}\theta \right ) +\sin ^{2}\phi & \cos \phi \sin ^{2}\theta \sin \phi +\cos ^{2}\theta \cos \phi \sin \phi -\sin \phi \cos \phi \\ 0 & \sin \phi \sin ^{2}\theta \cos \phi +\cos ^{2}\theta \sin \phi \cos \phi -\cos \phi \sin \phi & \sin ^{2}\phi \left (\sin ^{2}\theta +\cos ^{2}\theta \right ) +\cos ^{2}\phi \end {bmatrix} \\ & =\begin {bmatrix} 1 & 0 & 0\\ 0 & \cos ^{2}\phi +\sin ^{2}\phi & \cos \phi \sin \phi \left (\sin ^{2}\theta +\cos ^{2}\theta \right ) -\sin \phi \cos \phi \\ 0 & \sin \phi \cos \phi \left (\sin ^{2}\theta +\cos ^{2}\theta \right ) -\cos \phi \sin \phi & \sin ^{2}\phi \left (\sin ^{2}\theta +\cos ^{2}\theta \right ) +\cos ^{2}\phi \end {bmatrix} \\ & =\begin {bmatrix} 1 & 0 & 0\\ 0 & \cos ^{2}\phi +\sin ^{2}\phi & \cos \phi \sin \phi -\sin \phi \cos \phi \\ 0 & \sin \phi \cos \phi -\cos \phi \sin \phi & \sin ^{2}\phi +\cos ^{2}\phi \end {bmatrix} \\ & =\begin {bmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end {bmatrix} \end {align*}

Since the result is the identity matrix, the product \(\Omega _{x}\left ( \phi \right ) \Omega _{z}\left (\theta \right ) \) is an orthogonal matrix.

Now we need to show in general that the product of two orthogonal matrices is orthogonal. Let \(A,B\) both be orthogonal, so \(AA^{T}=I\) and \(BB^{T}=I\). Then\begin {align*} \left (AB\right ) \left (AB\right ) ^{T} & =\left (AB\right ) \left ( B^{T}A^{T}\right ) \\ & =ABB^{T}A^{T} \end {align*}

But \(BB^{T}=I\). Therefore\begin {align*} \left (AB\right ) \left (AB\right ) ^{T} & =AIA^{T}\\ & =AA^{T} \end {align*}

But also \(AA^{T}=I\). Therefore\[ \left (AB\right ) \left (AB\right ) ^{T}=I \] Therefore \(AB\) is orthogonal. QED.
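The orthogonality claims can also be spot-checked numerically for arbitrary angles. The sketch below (my own addition, assuming numpy) builds \(\Omega _{z}\left (\theta \right ) \), \(\Omega _{x}\left (\phi \right ) \) and their product and tests \(\Omega \Omega ^{T}=I\).

```python
import numpy as np

def omega_z(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

def omega_x(phi):
    c, s = np.cos(phi), np.sin(phi)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

theta, phi = 0.7, 1.9                        # arbitrary test angles
for R in (omega_z(theta), omega_x(phi), omega_x(phi) @ omega_z(theta)):
    print(np.allclose(R @ R.T, np.eye(3)))   # True, True, True
```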

4.5.8 Problem 9.5.6

   4.5.8.1 Appendix

The Cayley-Hamilton theorem states that every matrix obeys its characteristic equation. In other words, if \(P\left (\omega \right ) \) is the characteristic polynomial for the matrix \(\Omega \), then \(P\left (\Omega \right ) \) vanishes as a matrix. This means that it will annihilate any vector. First prove the theorem for a Hermitian \(\Omega \) with nondegenerate eigenvectors by starting with the action of \(P\left (\Omega \right ) \) on the eigenvectors.

(Verified from the instructor that the above is the only part required to prove).

Solution

A Hermitian matrix \(\Omega \) with nondegenerate eigenvalues is diagonalizable: for its \(n\) distinct eigenvalues it is possible to find \(n\) orthonormal eigenvectors. What this means is that we can write \[ \Omega =RDR^{-1}\] Where \(R\) is the \(n\times n\) matrix whose columns are the \(n\) eigenvectors of \(\Omega \) and \(D\) is a diagonal matrix which has the corresponding eigenvalues \(\omega _{1},\omega _{2},\cdots ,\omega _{n}\) on its diagonal. Since \(P\left (\Omega \right ) \) is a polynomial in \(\Omega \), we can write\begin {align} P\left (\Omega \right ) & =\sum _{k=0}^{n}a_{k}\Omega ^{k}\nonumber \\ & =\sum _{k=0}^{n}a_{k}\left (RDR^{-1}\right ) ^{k} \tag {1} \end {align}

But \[ \left (RDR^{-1}\right ) ^{k}=RD^{k}R^{-1}\] To show the above, consider \(\left (RDR^{-1}\right ) ^{2}=\left ( RDR^{-1}\right ) \left (RDR^{-1}\right ) =RD\overset {I}{\overbrace {R^{-1}R}}DR^{-1}=RD^{2}R^{-1}\) and similarly for any higher powers. Eq. (1) now becomes\begin {align*} P\left (\Omega \right ) & =\sum _{k=0}^{n}a_{k}RD^{k}R^{-1}\\ & =R\left (\sum _{k=0}^{n}a_{k}D^{k}\right ) R^{-1} \end {align*}

But \(\sum _{k=0}^{n}a_{k}D^{k}=P\relax (D) \), which means applying operator on \(D\) only. Hence the above becomes\begin {equation} P\left (\Omega \right ) =R\ P\relax (D) \ R^{-1} \tag {2} \end {equation} But since \(D\) is a diagonal matrix, having the structure \(D=\begin {bmatrix} \omega _{1} & 0 & 0 & 0\\ 0 & \omega _{2} & 0 & 0\\ 0 & 0 & \ddots & 0\\ 0 & 0 & 0 & \omega _{n}\end {bmatrix} \), then \(P\relax (D) =\begin {bmatrix} P\left (\omega _{1}\right ) & 0 & 0 & 0\\ 0 & p\left (\omega _{2}\right ) & 0 & 0\\ 0 & 0 & \ddots & 0\\ 0 & 0 & 0 & p\left (\omega _{n}\right ) \end {bmatrix} \). Eq (2) now becomes\[ P\left (\Omega \right ) =R\begin {bmatrix} P\left (\omega _{1}\right ) & 0 & 0 & 0\\ 0 & p\left (\omega _{2}\right ) & 0 & 0\\ 0 & 0 & \ddots & 0\\ 0 & 0 & 0 & p\left (\omega _{n}\right ) \end {bmatrix} R^{-1}\] But \(P\left (\omega _{1}\right ) =p\left (\omega _{2}\right ) =\cdots =p\left ( \omega _{n}\right ) =0\), since each \(\omega _{i}\) is a root of the characteristic polynomial of matrix \(\Omega \). Therefore the above reduces to \begin {align*} P\left (\Omega \right ) & =R\begin {bmatrix} 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0\\ 0 & 0 & \ddots & 0\\ 0 & 0 & 0 & 0 \end {bmatrix} R^{-1}\\ & =\begin {bmatrix} 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0\\ 0 & 0 & \ddots & 0\\ 0 & 0 & 0 & 0 \end {bmatrix} \end {align*}

This proves the Cayley-Hamilton for the case of \(\Omega \) with nondegenerate eigenvectors, which is what we are asked to show.
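As an optional numerical illustration (my own addition, assuming numpy), a randomly generated Hermitian matrix annihilates its own characteristic polynomial, as the theorem states.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(4, 4)) + 1j*rng.normal(size=(4, 4))
H = (X + X.conj().T) / 2                       # Hermitian test matrix

coeffs = np.poly(H)                            # characteristic polynomial, highest power first
P_of_H = sum(c * np.linalg.matrix_power(H, k)
             for k, c in enumerate(coeffs[::-1]))
print(np.allclose(P_of_H, np.zeros((4, 4))))   # True (up to round-off)
```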

4.5.8.1 Appendix

(We were not asked to do this matrix-inverse part; I did it only for practice, not for grading.)

Show that \(\begin {bmatrix} 1 & 3 & 1\\ 0 & 2 & 0\\ 0 & 1 & 4 \end {bmatrix} ^{-1}=\begin {bmatrix} 1 & -\frac {11}{8} & -\frac {1}{4}\\ 0 & \frac {1}{2} & 0\\ 0 & -\frac {1}{8} & \frac {1}{4}\end {bmatrix} \) by using Cayley-Hamilton theorem. Also show that \(\begin {bmatrix} 1 & 3 & 1\\ 0 & 2 & 0\\ 0 & 4 & 1 \end {bmatrix} ^{-1}=\begin {bmatrix} 1 & \frac {1}{2} & -1\\ 0 & \frac {1}{2} & 0\\ 0 & -2 & 1 \end {bmatrix} \).

Solution

The Cayley-Hamilton theorem says that a matrix \(\Omega \) obeys its characteristic equation. In other words\begin {align*} P\left (\Omega \right ) & =0\\ a_{n}\Omega ^{n}+a_{n-1}\Omega ^{n-1}+\cdots +a_{1}\Omega +a_{0}I & =0 \end {align*}

Multiplying both sides of the above by the inverse \(\Omega ^{-1}\) (assuming \(a_{0}\neq 0\), which holds when \(\Omega \) is invertible) gives\begin {align} a_{n}\Omega ^{n-1}+a_{n-1}\Omega ^{n-2}+\cdots +a_{1}I+a_{0}\Omega ^{-1} & =0\nonumber \\ \Omega ^{-1} & =-\frac {a_{n}\Omega ^{n-1}+a_{n-1}\Omega ^{n-2}+\cdots +a_{1}I}{a_{0}} \tag {1} \end {align}

We now apply the above to the first matrix. For \(\Omega =\begin {bmatrix} 1 & 3 & 1\\ 0 & 2 & 0\\ 0 & 1 & 4 \end {bmatrix} \), we first need to find the characteristic equation.\begin {align*} \det \left ( \begin {bmatrix} 1 & 3 & 1\\ 0 & 2 & 0\\ 0 & 1 & 4 \end {bmatrix} -\lambda \begin {bmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end {bmatrix} \right ) & =0\\\begin {vmatrix} 1-\lambda & 3 & 1\\ 0 & 2-\lambda & 0\\ 0 & 1 & 4-\lambda \end {vmatrix} & =0\\ \left (1-\lambda \right ) \begin {vmatrix} 2-\lambda & 0\\ 1 & 4-\lambda \end {vmatrix} -3\begin {vmatrix} 0 & 0\\ 0 & 4-\lambda \end {vmatrix} +\begin {vmatrix} 0 & 2-\lambda \\ 0 & 1 \end {vmatrix} & =0\\ \left (1-\lambda \right ) \left (\left (2-\lambda \right ) \left ( 4-\lambda \right ) \right ) & =0\\ -\lambda ^{3}+7\lambda ^{2}-14\lambda +8 & =0\\ \lambda ^{3}-7\lambda ^{2}+14\lambda -8 & =0 \end {align*}

Therefore, using Cayley-Hamilton, the above becomes\[ \Omega ^{3}-7\Omega ^{2}+14\Omega -8I=0 \] where now \(\Omega \) is the matrix itself. Multiplying both sides by \(\Omega ^{-1}\) gives\begin {align} \Omega ^{2}-7\Omega +14I-8\Omega ^{-1} & =0\nonumber \\ -8\Omega ^{-1} & =-\Omega ^{2}+7\Omega -14I\nonumber \\ -\Omega ^{-1} & =\frac {1}{8}\left (-\Omega ^{2}+7\Omega -14I\right ) \nonumber \\ \Omega ^{-1} & =\frac {1}{8}\left (\Omega ^{2}-7\Omega +14I\right ) \tag {2} \end {align}

So to find matrix inverse \(\Omega ^{-1}\) we just need to calculate \(\Omega ^{2}\) and then simplify the result. But \begin {align*} \Omega ^{2} & =\begin {bmatrix} 1 & 3 & 1\\ 0 & 2 & 0\\ 0 & 1 & 4 \end {bmatrix}\begin {bmatrix} 1 & 3 & 1\\ 0 & 2 & 0\\ 0 & 1 & 4 \end {bmatrix} \\ & =\begin {bmatrix} 1 & 10 & 5\\ 0 & 4 & 0\\ 0 & 6 & 16 \end {bmatrix} \end {align*}

Substituting the above in Eq. (2) gives\begin {align*} \Omega ^{-1} & =\frac {1}{8}\left ( \begin {bmatrix} 1 & 10 & 5\\ 0 & 4 & 0\\ 0 & 6 & 16 \end {bmatrix} -7\begin {bmatrix} 1 & 3 & 1\\ 0 & 2 & 0\\ 0 & 1 & 4 \end {bmatrix} +14\begin {bmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end {bmatrix} \right ) \\ & =\frac {1}{8}\left ( \begin {bmatrix} 1 & 10 & 5\\ 0 & 4 & 0\\ 0 & 6 & 16 \end {bmatrix} -7\begin {bmatrix} 1 & 3 & 1\\ 0 & 2 & 0\\ 0 & 1 & 4 \end {bmatrix} +14\begin {bmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end {bmatrix} \right ) \\ & =\frac {1}{8}\begin {bmatrix} 8 & -11 & -2\\ 0 & 4 & 0\\ 0 & -1 & 2 \end {bmatrix} \\ & =\begin {bmatrix} 1 & -\frac {11}{8} & -\frac {1}{4}\\ 0 & \frac {1}{2} & 0\\ 0 & -\frac {1}{8} & \frac {1}{4}\end {bmatrix} \end {align*}
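A quick numerical confirmation of this result (my own addition, assuming numpy) compares the Cayley-Hamilton formula with numpy's built-in inverse.

```python
import numpy as np

Omega = np.array([[1., 3., 1.],
                  [0., 2., 0.],
                  [0., 1., 4.]])

# inverse from Cayley-Hamilton: Omega^{-1} = (Omega^2 - 7*Omega + 14*I)/8
inv_ch = (Omega @ Omega - 7*Omega + 14*np.eye(3)) / 8
print(inv_ch)
print(np.allclose(inv_ch, np.linalg.inv(Omega)))   # True
```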

4.5.9 Problem 9.5.10

Show that the following matrices commute and find a common eigenbasis\[ M=\begin {bmatrix} 1 & 0 & 1\\ 0 & 0 & 0\\ 1 & 0 & 1 \end {bmatrix} \qquad N=\begin {bmatrix} 2 & 1 & 1\\ 1 & 0 & -1\\ 1 & -1 & 2 \end {bmatrix} \] Solution

The matrices commute if \(MN=NM\). But\begin {align*} MN & =\begin {bmatrix} 1 & 0 & 1\\ 0 & 0 & 0\\ 1 & 0 & 1 \end {bmatrix}\begin {bmatrix} 2 & 1 & 1\\ 1 & 0 & -1\\ 1 & -1 & 2 \end {bmatrix} \\ & =\begin {bmatrix} 3 & 0 & 3\\ 0 & 0 & 0\\ 3 & 0 & 3 \end {bmatrix} \end {align*}

And\begin {align*} NM & =\begin {bmatrix} 2 & 1 & 1\\ 1 & 0 & -1\\ 1 & -1 & 2 \end {bmatrix}\begin {bmatrix} 1 & 0 & 1\\ 0 & 0 & 0\\ 1 & 0 & 1 \end {bmatrix} \\ & =\begin {bmatrix} 3 & 0 & 3\\ 0 & 0 & 0\\ 3 & 0 & 3 \end {bmatrix} \end {align*}

We see that \(MN=NM\) therefore they commute.

Now we need to find the common eigenbasis. To do this, the eigenvalues and corresponding eigenvectors for \(M\) and \(N\) are now found.

We start with matrix \(M\).

To find eigenvalues for \(M\), we solve the equation\[ \det \left (M-\lambda I\right ) =0 \] Where \(\lambda \) represent the eigenvalues. The above becomes\begin {align*} \det \left ( \begin {bmatrix} 1 & 0 & 1\\ 0 & 0 & 0\\ 1 & 0 & 1 \end {bmatrix} -\lambda \begin {bmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end {bmatrix} \right ) & =0\\\begin {vmatrix} 1-\lambda & 0 & 1\\ 0 & -\lambda & 0\\ 1 & 0 & 1-\lambda \end {vmatrix} & =0\\ \left (1-\lambda \right ) \begin {vmatrix} -\lambda & 0\\ 0 & 1-\lambda \end {vmatrix} +\begin {vmatrix} 0 & -\lambda \\ 1 & 0 \end {vmatrix} & =0\\ \left (1-\lambda \right ) \left (-\lambda \left (1-\lambda \right ) \right ) +\lambda & =0\\ 2\lambda ^{2}-\lambda ^{3} & =0\\ \lambda ^{2}\left (2-\lambda \right ) & =0 \end {align*}

Hence the roots (eigenvalues) are \(\lambda =0\) with multiplicity \(2\) and \(\lambda =2\). For each \(\lambda _{i}\) now we find the corresponding eigenvector \(|v_{i}\rangle \).

\(\lambda =2\)

We now need to solve  \(Mv=\lambda v\) for \(v\). This implies\begin {align*} \left (M-\lambda I\right ) v & =0\\\begin {bmatrix} 1-\lambda & 0 & 1\\ 0 & -\lambda & 0\\ 1 & 0 & 1-\lambda \end {bmatrix}\begin {bmatrix} v_{1}\\ v_{2}\\ v_{3}\end {bmatrix} & =\begin {bmatrix} 0\\ 0\\ 0 \end {bmatrix} \end {align*}

But \(\lambda =2\) and the above becomes\[\begin {bmatrix} -1 & 0 & 1\\ 0 & -2 & 0\\ 1 & 0 & -1 \end {bmatrix}\begin {bmatrix} v_{1}\\ v_{2}\\ v_{3}\end {bmatrix} =\begin {bmatrix} 0\\ 0\\ 0 \end {bmatrix} \] \(R_{3}=R_{3}+R_{1}\)\[\begin {bmatrix} -1 & 0 & 1\\ 0 & -2 & 0\\ 0 & 0 & 0 \end {bmatrix} \] The system becomes\[\begin {bmatrix} -1 & 0 & 1\\ 0 & -2 & 0\\ 0 & 0 & 0 \end {bmatrix}\begin {bmatrix} v_{1}\\ v_{2}\\ v_{3}\end {bmatrix} =\begin {bmatrix} 0\\ 0\\ 0 \end {bmatrix} \] Since last row is zero, then we have one free variable \(v_{3}\) and two leading variables \(v_{1},v_{2}\). Let \(v_{3}=s\). Second row gives \(v_{2}=0\) and first row gives \(-v_{1}+s=0\) or \(v_{1}=s\). Hence the solution is \begin {align*} \begin {bmatrix} v_{1}\\ v_{2}\\ v_{3}\end {bmatrix} & =\begin {bmatrix} s\\ 0\\ s \end {bmatrix} \\ & =s\begin {bmatrix} 1\\ 0\\ 1 \end {bmatrix} \end {align*}

Since \(s\) is free variable, we can pick any non-zero value for it. Let \(s=1\) and the above becomes\[\begin {bmatrix} v_{1}\\ v_{2}\\ v_{3}\end {bmatrix} =\begin {bmatrix} 1\\ 0\\ 1 \end {bmatrix} \] The above is the eigenvector that corresponds to \(\lambda =2\). Now we find the eigenvectors that correspond to \(\lambda =0\). Hopefully we will be able to find two of them.

\(\lambda =0\)

We now need to solve  \(Mv=\lambda v\) for \(v\). This implies\begin {align*} \left (M-\lambda I\right ) v & =0\\\begin {bmatrix} 1-\lambda & 0 & 1\\ 0 & -\lambda & 0\\ 1 & 0 & 1-\lambda \end {bmatrix}\begin {bmatrix} v_{1}\\ v_{2}\\ v_{3}\end {bmatrix} & =\begin {bmatrix} 0\\ 0\\ 0 \end {bmatrix} \end {align*}

But \(\lambda =0\) and the above becomes\[\begin {bmatrix} 1 & 0 & 1\\ 0 & 0 & 0\\ 1 & 0 & 1 \end {bmatrix}\begin {bmatrix} v_{1}\\ v_{2}\\ v_{3}\end {bmatrix} =\begin {bmatrix} 0\\ 0\\ 0 \end {bmatrix} \] \(R_{3}=R_{3}-R_{1}\) gives\[\begin {bmatrix} 1 & 0 & 1\\ 0 & 0 & 0\\ 0 & 0 & 0 \end {bmatrix} \] Hence the system becomes\[\begin {bmatrix} 1 & 0 & 1\\ 0 & 0 & 0\\ 0 & 0 & 0 \end {bmatrix}\begin {bmatrix} v_{1}\\ v_{2}\\ v_{3}\end {bmatrix} =\begin {bmatrix} 0\\ 0\\ 0 \end {bmatrix} \] We see that \(v_{3},v_{2}\) are free variables and \(v_{1}\) is leading variables. Let \(v_{3}=s,v_{2}=t\). From first row, \(v_{1}+s=0\) or \(v_{1}=-s\). Therefore the solution is\begin {align*} \begin {bmatrix} v_{1}\\ v_{2}\\ v_{3}\end {bmatrix} & =\begin {bmatrix} -s\\ t\\ s \end {bmatrix} \\ & =s\begin {bmatrix} -1\\ 0\\ 1 \end {bmatrix} +t\begin {bmatrix} 0\\ 1\\ 0 \end {bmatrix} \end {align*}

Picking \(s=1,t=0\) gives one eigenvector as\[\begin {bmatrix} v_{1}\\ v_{2}\\ v_{3}\end {bmatrix} =\begin {bmatrix} -1\\ 0\\ 1 \end {bmatrix} \] Picking \(s=0,t=1\) gives second eigenvector as\[\begin {bmatrix} v_{1}\\ v_{2}\\ v_{3}\end {bmatrix} =\begin {bmatrix} 0\\ 1\\ 0 \end {bmatrix} \] So we were able to find two eigenvectors from one eigenvalue \(\lambda =0\), which is good. This table summarizes the result we have found so far for the matrix \(M\)




eigenvalue | multiplicity | corresponding eigenvector(s)
\(\lambda =2\) | \(1\) | \(\begin {bmatrix} 1\\ 0\\ 1 \end {bmatrix} \)
\(\lambda =0\) | \(2\) | \(\begin {bmatrix} -1\\ 0\\ 1 \end {bmatrix} ,\begin {bmatrix} 0\\ 1\\ 0 \end {bmatrix} \)

Now we normalize them. This gives




eigenvalue | multiplicity | corresponding normalized eigenvector(s)
\(\lambda =2\) | \(1\) | \(\frac {1}{\sqrt {2}}\begin {bmatrix} 1\\ 0\\ 1 \end {bmatrix} \)
\(\lambda =0\) | \(2\) | \(\frac {1}{\sqrt {2}}\begin {bmatrix} -1\\ 0\\ 1 \end {bmatrix} ,\begin {bmatrix} 0\\ 1\\ 0 \end {bmatrix} \)

For the matrix \(N\)

To find the eigenvalues of \(N\), we solve the equation\[ \det \left (N-\lambda I\right ) =0 \] where \(\lambda \) represents the eigenvalues. The above becomes\begin {align*} \det \left ( \begin {bmatrix} 2 & 1 & 1\\ 1 & 0 & -1\\ 1 & -1 & 2 \end {bmatrix} -\lambda \begin {bmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end {bmatrix} \right ) & =0\\\begin {vmatrix} 2-\lambda & 1 & 1\\ 1 & -\lambda & -1\\ 1 & -1 & 2-\lambda \end {vmatrix} & =0\\ \left (2-\lambda \right ) \begin {vmatrix} -\lambda & -1\\ -1 & 2-\lambda \end {vmatrix} -\begin {vmatrix} 1 & -1\\ 1 & 2-\lambda \end {vmatrix} +\begin {vmatrix} 1 & -\lambda \\ 1 & -1 \end {vmatrix} & =0\\ \left (2-\lambda \right ) \left (-\lambda \left (2-\lambda \right ) -1\right ) -\left (2-\lambda +1\right ) +\left (-1+\lambda \right ) & =0\\ -\lambda ^{3}+4\lambda ^{2}-\lambda -6 & =0\\ \lambda ^{3}-4\lambda ^{2}+\lambda +6 & =0 \end {align*}

Let us guess that \(\lambda =-1\) is a root. Substituting gives \(-1-4-1+6=0\), so \(\left (\lambda +1\right ) \) is a factor. Doing long division, \[ \frac {\lambda ^{3}-4\lambda ^{2}+\lambda +6}{\left (\lambda +1\right ) }=\lambda ^{2}-5\lambda +6 \] Therefore the characteristic polynomial factors as\begin {align*} \left (\lambda ^{2}-5\lambda +6\right ) \left (\lambda +1\right ) & =0\\ \left (\lambda -2\right ) \left (\lambda -3\right ) \left (\lambda +1\right ) & =0 \end {align*}

Hence the roots (eigenvalues) are \(\lambda =2,\lambda =3,\lambda =-1\). For each \(\lambda _{i}\) now we find the corresponding eigenvector \(|v_{i}\rangle \).

\(\lambda =2\)

We now need to solve  \(Nv=\lambda v\) for \(v\). This implies\begin {align*} \left (N-\lambda I\right ) v & =0\\\begin {bmatrix} 2-\lambda & 1 & 1\\ 1 & -\lambda & -1\\ 1 & -1 & 2-\lambda \end {bmatrix}\begin {bmatrix} v_{1}\\ v_{2}\\ v_{3}\end {bmatrix} & =\begin {bmatrix} 0\\ 0\\ 0 \end {bmatrix} \end {align*}

But \(\lambda =2\) and the above becomes\[\begin {bmatrix} 0 & 1 & 1\\ 1 & -2 & -1\\ 1 & -1 & 0 \end {bmatrix}\begin {bmatrix} v_{1}\\ v_{2}\\ v_{3}\end {bmatrix} =\begin {bmatrix} 0\\ 0\\ 0 \end {bmatrix} \] Swapping \(R_{1}\) with \(R_{3}\) so that pivot is not zero gives\[\begin {bmatrix} 1 & -1 & 0\\ 1 & -2 & -1\\ 0 & 1 & 1 \end {bmatrix} \] \(R_{2}=R_{2}-R_{1}\)\[\begin {bmatrix} 1 & -1 & 0\\ 0 & -1 & -1\\ 0 & 1 & 1 \end {bmatrix} \] \(R_{3}=R_{3}+R_{2}\)\[\begin {bmatrix} 1 & -1 & 0\\ 0 & -1 & -1\\ 0 & 0 & 0 \end {bmatrix} \] Hence system becomes\[\begin {bmatrix} 1 & -1 & 0\\ 0 & -1 & -1\\ 0 & 0 & 0 \end {bmatrix}\begin {bmatrix} v_{1}\\ v_{2}\\ v_{3}\end {bmatrix} =\begin {bmatrix} 0\\ 0\\ 0 \end {bmatrix} \] Free variable is \(v_{3}\) and leading variables are \(v_{1},v_{2}\). Let \(v_{3}=s\). Second row gives \(-v_{2}-s=0\) or \(v_{2}=-s\). First row gives \(v_{1}-v_{2}=0\) or \(v_{1}=v_{2}=-s\). Hence solution is\begin {align*} \begin {bmatrix} v_{1}\\ v_{2}\\ v_{3}\end {bmatrix} & =\begin {bmatrix} -s\\ -s\\ s \end {bmatrix} \\ & =s\begin {bmatrix} -1\\ -1\\ 1 \end {bmatrix} \end {align*}

Let \(s=1\) therefore\[\begin {bmatrix} v_{1}\\ v_{2}\\ v_{3}\end {bmatrix} =\begin {bmatrix} -1\\ -1\\ 1 \end {bmatrix} \] \(\lambda =3\)

We now need to solve  \(Nv=\lambda v\) for \(v\). This implies\begin {align*} \left (N-\lambda I\right ) v & =0\\\begin {bmatrix} 2-\lambda & 1 & 1\\ 1 & -\lambda & -1\\ 1 & -1 & 2-\lambda \end {bmatrix}\begin {bmatrix} v_{1}\\ v_{2}\\ v_{3}\end {bmatrix} & =\begin {bmatrix} 0\\ 0\\ 0 \end {bmatrix} \end {align*}

But \(\lambda =3\) and the above becomes\[\begin {bmatrix} -1 & 1 & 1\\ 1 & -3 & -1\\ 1 & -1 & -1 \end {bmatrix}\begin {bmatrix} v_{1}\\ v_{2}\\ v_{3}\end {bmatrix} =\begin {bmatrix} 0\\ 0\\ 0 \end {bmatrix} \] \(R_{2}=R_{2}+R_{1}\)\[\begin {bmatrix} -1 & 1 & 1\\ 0 & -2 & 0\\ 1 & -1 & -1 \end {bmatrix} \] \(R_{3}=R_{3}+R_{1}\)\[\begin {bmatrix} -1 & 1 & 1\\ 0 & -2 & 0\\ 0 & 0 & 0 \end {bmatrix} \] Hence system becomes\[\begin {bmatrix} -1 & 1 & 1\\ 0 & -2 & 0\\ 0 & 0 & 0 \end {bmatrix}\begin {bmatrix} v_{1}\\ v_{2}\\ v_{3}\end {bmatrix} =\begin {bmatrix} 0\\ 0\\ 0 \end {bmatrix} \] \(v_{3}\) is free variable and \(v_{1},v_{2}\) are leading variables. Let \(v_{3}=s\). Second row gives \(-2v_{2}=0\) or \(v_{2}=0\). First row gives \(-v_{1}+s=0\) or \(v_{1}=s\). Solution is \begin {align*} \begin {bmatrix} v_{1}\\ v_{2}\\ v_{3}\end {bmatrix} & =\begin {bmatrix} s\\ 0\\ s \end {bmatrix} \\ & =s\begin {bmatrix} 1\\ 0\\ 1 \end {bmatrix} \end {align*}

Let \(s=1\). The solution becomes\[\begin {bmatrix} v_{1}\\ v_{2}\\ v_{3}\end {bmatrix} =\begin {bmatrix} 1\\ 0\\ 1 \end {bmatrix} \] \(\lambda =-1\)

We now need to solve  \(Nv=\lambda v\) for \(v\). This implies\begin {align*} \left (N-\lambda I\right ) v & =0\\\begin {bmatrix} 2-\lambda & 1 & 1\\ 1 & -\lambda & -1\\ 1 & -1 & 2-\lambda \end {bmatrix}\begin {bmatrix} v_{1}\\ v_{2}\\ v_{3}\end {bmatrix} & =\begin {bmatrix} 0\\ 0\\ 0 \end {bmatrix} \end {align*}

But \(\lambda =-1\) and the above becomes\[\begin {bmatrix} 3 & 1 & 1\\ 1 & 1 & -1\\ 1 & -1 & 3 \end {bmatrix}\begin {bmatrix} v_{1}\\ v_{2}\\ v_{3}\end {bmatrix} =\begin {bmatrix} 0\\ 0\\ 0 \end {bmatrix} \] Swapping \(R_{2}\) and \(R_{1}\) to keep pivot \(1\) gives\[\begin {bmatrix} 1 & 1 & -1\\ 3 & 1 & 1\\ 1 & -1 & 3 \end {bmatrix} \] \(R_{2}=R_{2}-3R_{1}\)\[\begin {bmatrix} 1 & 1 & -1\\ 0 & -2 & 4\\ 1 & -1 & 3 \end {bmatrix} \] \(R_{3}=R_{3}-R_{1}\)\[\begin {bmatrix} 1 & 1 & -1\\ 0 & -2 & 4\\ 0 & -2 & 4 \end {bmatrix} \] \(R_{3}=R_{3}-R_{2}\)\[\begin {bmatrix} 1 & 1 & -1\\ 0 & -2 & 4\\ 0 & 0 & 0 \end {bmatrix} \] Hence system becomes\[\begin {bmatrix} 1 & 1 & -1\\ 0 & -2 & 4\\ 0 & 0 & 0 \end {bmatrix}\begin {bmatrix} v_{1}\\ v_{2}\\ v_{3}\end {bmatrix} =\begin {bmatrix} 0\\ 0\\ 0 \end {bmatrix} \] \(v_{3}\) is free variable and \(v_{1},v_{2}\) are leading variables. Let \(v_{3}=s\). Second row gives \(-2v_{2}+4s=0\) or \(v_{2}=2s\). First row gives \(v_{1}+v_{2}-s=0\) or \(v_{1}=-v_{2}+s=-2s+s=-s\). Solution is \begin {align*} \begin {bmatrix} v_{1}\\ v_{2}\\ v_{3}\end {bmatrix} & =\begin {bmatrix} -s\\ 2s\\ s \end {bmatrix} \\ & =s\begin {bmatrix} -1\\ 2\\ 1 \end {bmatrix} \end {align*}

Let \(s=1\), the solution becomes\[\begin {bmatrix} v_{1}\\ v_{2}\\ v_{3}\end {bmatrix} =\begin {bmatrix} -1\\ 2\\ 1 \end {bmatrix} \] This table summarizes the result we have found so far for the matrix \(N\)




eigenvalue | multiplicity | corresponding eigenvector(s)
\(\lambda =2\) | \(1\) | \(\begin {bmatrix} -1\\ -1\\ 1 \end {bmatrix} \)
\(\lambda =3\) | \(1\) | \(\begin {bmatrix} 1\\ 0\\ 1 \end {bmatrix} \)
\(\lambda =-1\) | \(1\) | \(\begin {bmatrix} -1\\ 2\\ 1 \end {bmatrix} \)

Now we normalize them. This gives




eigenvalue | multiplicity | corresponding normalized eigenvector(s)
\(\lambda =2\) | \(1\) | \(\frac {1}{\sqrt {3}}\begin {bmatrix} -1\\ -1\\ 1 \end {bmatrix} \)
\(\lambda =3\) | \(1\) | \(\frac {1}{\sqrt {2}}\begin {bmatrix} 1\\ 0\\ 1 \end {bmatrix} \)
\(\lambda =-1\) | \(1\) | \(\frac {1}{\sqrt {6}}\begin {bmatrix} -1\\ 2\\ 1 \end {bmatrix} \)

Now we compare the eigenbases for \(M\) and \(N\). This table shows the final result.




Operator | eigenvalues | eigenbasis
\(M=\begin {bmatrix} 1 & 0 & 1\\ 0 & 0 & 0\\ 1 & 0 & 1 \end {bmatrix} \) | \(2,0,0\) | \(\left \{ \frac {1}{\sqrt {2}}\begin {bmatrix} 1\\ 0\\ 1 \end {bmatrix} ,\frac {1}{\sqrt {2}}\begin {bmatrix} -1\\ 0\\ 1 \end {bmatrix} ,\begin {bmatrix} 0\\ 1\\ 0 \end {bmatrix} \right \} \)
\(N=\begin {bmatrix} 2 & 1 & 1\\ 1 & 0 & -1\\ 1 & -1 & 2 \end {bmatrix} \) | \(2,3,-1\) | \(\left \{ \frac {1}{\sqrt {3}}\begin {bmatrix} -1\\ -1\\ 1 \end {bmatrix} ,\frac {1}{\sqrt {2}}\begin {bmatrix} 1\\ 0\\ 1 \end {bmatrix} ,\frac {1}{\sqrt {6}}\begin {bmatrix} -1\\ 2\\ 1 \end {bmatrix} \right \} \)

Comparing the two tables: the vector \(\frac {1}{\sqrt {2}}\begin {bmatrix} 1\\ 0\\ 1 \end {bmatrix} \) appears in both. The remaining eigenvectors of \(N\) lie inside the degenerate \(\lambda =0\) eigenspace of \(M\), since \(\begin {bmatrix} -1\\ -1\\ 1 \end {bmatrix} =\begin {bmatrix} -1\\ 0\\ 1 \end {bmatrix} -\begin {bmatrix} 0\\ 1\\ 0 \end {bmatrix} \) and \(\begin {bmatrix} -1\\ 2\\ 1 \end {bmatrix} =\begin {bmatrix} -1\\ 0\\ 1 \end {bmatrix} +2\begin {bmatrix} 0\\ 1\\ 0 \end {bmatrix} \), so they are also eigenvectors of \(M\) with eigenvalue \(0\). Therefore a common eigenbasis is \(\left \{ \frac {1}{\sqrt {2}}\begin {bmatrix} 1\\ 0\\ 1 \end {bmatrix} ,\frac {1}{\sqrt {3}}\begin {bmatrix} -1\\ -1\\ 1 \end {bmatrix} ,\frac {1}{\sqrt {6}}\begin {bmatrix} -1\\ 2\\ 1 \end {bmatrix} \right \} \), with \(M\) eigenvalues \(2,0,0\) and \(N\) eigenvalues \(3,2,-1\) respectively.
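A final numerical cross-check (my own addition, assuming numpy): \(M\) and \(N\) commute, and the nondegenerate eigenbasis of \(N\) simultaneously diagonalizes \(M\).

```python
import numpy as np

M = np.array([[1., 0., 1.], [0., 0., 0.], [1., 0., 1.]])
N = np.array([[2., 1., 1.], [1., 0., -1.], [1., -1., 2.]])
print(np.allclose(M @ N, N @ M))          # True: the matrices commute

vals, vecs = np.linalg.eigh(N)            # N is real symmetric; columns of vecs are orthonormal
print(vals)                               # [-1.  2.  3.]
print(np.round(vecs.T @ M @ vecs, 10))    # diagonal (0, 0, 2): the same basis diagonalizes M
```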

4.5.10 key solution for HW 5
