2.2 Lecture Monday April 17 2007

  2.2.1 Section 5.4
  2.2.2 Homework Solution for section 5.4
  2.2.3 code


2.2.1 Section 5.4

   2.2.1.1 Theorem 1: Singular value decomposition
   2.2.1.2 Pseudo Inverse
   2.2.1.3 Theorem 2: Minimal solution to \(A\vec {x}=\vec {b}\) and the pseudo inverse \(A^{+}\)
   2.2.1.4 Theorem 3: Penrose properties for any matrix \(A\)
   2.2.1.5 Theorem 4: \(A^{+}\) satisfies Penrose properties hence unique
   2.2.1.6 Theorem 5: On SVD properties
   2.2.1.7 Theorem 6: Reduced (or economical) SVD
   2.2.1.8 Theorem 7: On orthonormal Bases

2.2.1.1 Theorem 1: Singular value decomposition

Any arbitrary matrix \(A_{m\times n}\) can be factored into three matrices as follows: \(A_{m\times n}=P_{m\times m}D_{m\times n}Q_{n\times n}\), where \(P\) is a unitary matrix (\(P^{H}=P^{-1}\), or \(P^{H}P=I\)), \(Q\) is also a unitary matrix, and \(D\) is a (generally rectangular) diagonal matrix holding the singular values.

These are the steps to compute the SVD

  1. Find the rank of \(A\), say \(r\)

  2. Let \(B=A_{n\times m}^{H}A_{m\times n}\), hence \(B_{n\times n}\) is a square matrix, and it is positive semi-definite, hence its eigenvalues will all be \(\geq 0\). Find the eigenvalues of \(B\), call these \(\sigma _{i}^{2}\). There will be \(n\) such eigenvalues since \(B\) is of order \(n\times n\). But only \(r\) of these will be positive, and \(n-r\) will be zero. Arrange these eigenvalues such that the first \(r\) non-zero eigenvalues come first, followed by the zero eigenvalues: \(\overset {n\ eigenvalues}{\overbrace {\sigma _{1}^{2},\sigma _{2}^{2},\cdots ,\sigma _{r}^{2},0,0,\cdots ,0}}\)

  3. Initialize matrix \(D_{m\times n}\) to be all zeros. Take the first \(r\) eigenvalues from above (the non-zero ones), take the square root of each to get the \(\overset {r\ \text {singular values}}{\overbrace {\sigma _{1},\sigma _{2},\cdots ,\sigma _{r}}}\), and write these down the diagonal of \(D\) starting at \(D\left (1,1\right ) \), i.e. \(D\left (1,1\right ) =\sigma _{1},D\left (2,2\right ) =\sigma _{2},\cdots ,D\left (r,r\right ) =\sigma _{r}\). Notice that the matrix \(D\) need not be a square matrix. Hence we can obtain an arrangement such as the following for \(r=2\): \(\begin {pmatrix} \sigma _{1} & 0 & 0 & 0\\ 0 & \sigma _{2} & 0 & 0\\ 0 & 0 & 0 & 0 \end {pmatrix} \), where the matrix \(A\) was \(3\times 4\) for example.

  4. For each eigenvalue of \(A^{H}A\) found above, find the corresponding eigenvector. Call these eigenvectors \(\vec {u}_{1},\vec {u}_{2},\cdots ,\vec {u}_{n}\).

  5. Normalize the eigenvectors found in the above step. Now \(\vec {u}_{1},\vec {u}_{2},\cdots ,\vec {u}_{n}\) will be an orthonormal set of vectors. Take the Hermitian of each eigenvector, \(\vec {u}_{1}^{H},\vec {u}_{2}^{H},\cdots ,\vec {u}_{n}^{H}\), and make each of these vectors (now in row format instead of column format) a row of the matrix \(Q\), i.e. the first row of \(Q\) will be \(\vec {u}_{1}^{H}\), the second row of \(Q\) will be \(\vec {u}_{2}^{H}\), etc. \(\left (A^{H}A\right ) _{n\times n}\overset {\text {find eigenvectors and normalize}}{\rightarrow }\left \{ \vec {u}_{1},\vec {u}_{2},\cdots ,\vec {u}_{r},\overset {\text {ortho basis (n-r) for N(A)}}{\overbrace {\vec {u}_{r+1},\cdots ,\vec {u}_{n}}}\right \} \rightarrow Q_{n\times n}=\begin {bmatrix} \vec {u}_{1}^{H}\\ \vec {u}_{2}^{H}\\ \vdots \\ \vec {u}_{r}^{H}\\ \vec {u}_{r+1}^{H}\\ \vdots \\ \vec {u}_{n}^{H}\end {bmatrix} \)

  6. Now we need to find a set of \(m\) orthonormal vectors, these will be the columns of the matrix \(P\). There are 2 ways to do this. First the textbook way, and then another way which I think is simpler.

    1. The textbook method is as follows: find a set of \(r\) orthonormal vectors \(\vec {v}_{i}=\frac {1}{\sigma _{i}}A\vec {u}_{i}\), for \(i=1\cdots r\). Notice that here we only use the first \(r\) vectors found in step 5. Take each one of these \(\vec {v}_{i}\) vectors and make them into columns of \(P\). But since we need \(m\) columns in \(P\), not just \(r\), we need to come up with \(m-r\) more vectors such that all \(m\) vectors together form an orthonormal basis for \(\mathbb {C} ^{m}\). If doing this by hand, it is easy to find these \(m-r\) vectors by inspection. In a program, we could use the same process we used with Gram-Schmidt, where we learned how to find a new vector which is orthogonal to an existing set of vectors.

    2. An easier approach is: find \(AA^{H}\) (not to be confused with step 4, where we used \(A^{H}A\)). This will be an \(m\times m\) matrix. Now find each eigenvalue, and for each eigenvalue find the corresponding eigenvector. Normalize these eigenvectors; they will now be orthonormal eigenvectors \(\vec {v}_{1},\vec {v}_{2},\cdots ,\vec {v}_{m}\). As in the above step, these vectors become the columns of the matrix \(P\). The difference in this approach is that we do not need to use Gram-Schmidt to find the \(m-r\) extra vectors, since we obtain all \(m\) eigenvectors right away.   \(\left (AA^{H}\right ) _{m\times m}\overset {\text {find eigenvectors and normalize}}{\rightarrow }\left \{ \overset {\text {r ortho basis for range A}}{\overbrace {\vec {v}_{1},\vec {v}_{2},\cdots ,\vec {v}_{r}}},\vec {v}_{r+1},\cdots ,\vec {v}_{m}\right \} \rightarrow P_{m\times m}=\begin {bmatrix} \vec {v}_{1} & \vec {v}_{2} & \cdots & \vec {v}_{r} & \vec {v}_{r+1} & \cdots & \vec {v}_{m}\end {bmatrix} \)

  7. This completes the factorization, now we have \(A=PDQ\)

In Matlab, the SVD is computed with the command \([P,D,V]=svd(A)\). Note that Matlab returns \(V\) such that \(A=PDV^{H}\), so the \(Q\) of the notation above is \(V^{H}\).
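
For example, a quick check of this (a minimal sketch; the \(3\times 4\) test matrix below is made up for illustration):

  % Relate Matlab's svd output to the P, D, Q notation used in these notes
  A = [1 0 1 0;
       0 1 0 1;
       1 1 1 1];        % a 3x4 matrix of rank 2

  [U,S,V] = svd(A);     % Matlab convention: A = U*S*V'
  P = U;                % P in the notes (3x3 unitary)
  D = S;                % D in the notes (3x4, singular values on the diagonal)
  Q = V';               % Q in the notes, so that A = P*D*Q

  norm(A - P*D*Q)       % should be ~ 0 (round-off only)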

The following diagram helps illustrate the SVD process described above.

Figure 2.1: SVD

Remarks:

  1. \(\sigma _{1},\sigma _{2},\cdots ,\sigma _{n}\) are called the singular values of \(A\).

  2. The SVD factorization is not unique, as \(\sigma _{1},\sigma _{2},\cdots ,\sigma _{r}\) can be ordered differently. Also the choice of the \(m-r\) extra orthonormal basis vectors for the \(P\) matrix is arbitrary.

2.2.1.2 Pseudo Inverse

Given an arbitrary matrix \(A\), to find its pseudo inverse, called \(A^{+}\), we start by first finding the SVD of \(A\), then write

\begin {equation} A^{+}=Q^{H}D^{+}P^{H} \tag {1} \end {equation}

Where, since \(D\) is diagonal \(\begin {pmatrix} \sigma _{1} & 0 & 0\\ 0 & \sigma _{2} & 0\\ 0 & 0 & \ddots \end {pmatrix} \), \(D^{+}\) is obtained by transposing \(D\) and taking the reciprocal of each non-zero diagonal entry, \(D^{+}=\begin {pmatrix} \frac {1}{\sigma _{1}} & 0 & 0\\ 0 & \frac {1}{\sigma _{2}} & 0\\ 0 & 0 & \ddots \end {pmatrix} \), and so finding the pseudo inverse of a matrix is straightforward once we have the SVD factorization of the matrix.  To show equation (1): since \[ A=PDQ \] pre-multiply both sides by \(P^{H}\) to obtain

\[ P^{H}A=P^{H}PDQ \]

But \(P^{H}=P^{-1}\) since \(P\) is a unitary matrix, so the above becomes

\[ P^{H}A=DQ \]

Now post-multiply both sides by \(Q^{H}\)

\[ P^{H}AQ^{H}=DQQ^{H}\]

But \(Q^{H}=Q^{-1}\) since \(Q\) is a unitary matrix, so the above becomes

\[ P^{H}AQ^{H}=D \]

Hence

\[ D^{+}=\left (P^{H}AQ^{H}\right ) ^{+}\]

To motivate a formula for \(A^{+}\) (assuming for the moment that \(A^{H}A\) is invertible), start from \(Ax=b\)

\begin {align*} Ax & =b\\ A^{H}Ax & =A^{H}b\\ x & =\left (A^{H}A\right ) ^{-1}A^{H}b \end {align*}

Hence \begin {align*} A^{+} & =\left (A^{H}A\right ) ^{-1}A^{H}\\ & =\left (\left (PDQ\right ) ^{H}\left (PDQ\right ) \right ) ^{-1}\left ( PDQ\right ) ^{H}\\ & =\left (\left (Q^{H}D^{H}P^{H}\right ) \left (PDQ\right ) \right ) ^{-1}\left (Q^{H}D^{H}P^{H}\right ) \\ & =\left (\left (PDQ\right ) ^{-1}\left (Q^{H}D^{H}P^{H}\right ) ^{-1}\right ) \left (Q^{H}D^{H}P^{H}\right ) \\ & =\left (\left (Q^{-1}D^{-1}P^{-1}\right ) \left (P^{-H}D^{-H}Q^{-H}\right ) \right ) \left (Q^{H}D^{H}P^{H}\right ) \\ & =\left (Q^{-1}D^{-1}P^{-1}P^{-H}D^{-H}Q^{-H}\right ) \left (Q^{H}D^{H}P^{H}\right ) \\ & =\left (Q^{-1}D^{-1}D^{-H}Q^{-H}\right ) \left (Q^{H}D^{H}P^{H}\right ) \\ & =Q^{-1}D^{-1}D^{-H}Q^{-H}Q^{H}D^{H}P^{H}\\ & =Q^{-1}D^{-1}D^{-H}D^{H}P^{H}\\ & =Q^{-1}D^{-1}P^{H}\\ & =Q^{H}D^{-1}P^{H} \end {align*}

But \(D^{-1}=D^{+}\) when \(D\) is invertible, hence the above becomes

\[ A^{+}=Q^{H}D^{+}P^{H}\]

which is equation (1). (The derivation above assumed \(A^{H}A\) is invertible, but Theorem 2 below shows that this formula is the right one for any \(A\).)

Note: \(A^{+}A=Q^{H}D^{+}P^{H}PDQ=Q^{H}\left (D^{+}D\right ) Q=Q^{H}I^{+}Q\), where \(I^{+}\) has the form \(\begin {pmatrix} 1 & 0 & 0 & 0 & 0\\ 0 & \ddots & 0 & 0 & 0\\ 0 & 0 & 1 & 0 & 0\\ 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 \end {pmatrix} \). In other words, \(A^{+}A\) is not the full identity: expressed in the \(Q\) basis it has \(1\) on only the first \(r\) diagonal elements, not on all of them as the normal \(I\) would.
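
A small numerical sketch of the above (using Matlab's svd and pinv; the test matrix is made up, and \(D^{+}\) is built by transposing \(D\) and inverting only its non-zero entries, as described in this section):

  A = [1 0 1 0;
       0 1 0 1;
       1 1 1 1];                     % 3x4, rank 2

  [U,S,V] = svd(A);                  % A = U*S*V', so P = U and Q = V'
  r = rank(A);

  Dplus = zeros(size(A'));           % D^+ is n x m
  Dplus(1:r,1:r) = diag(1./diag(S(1:r,1:r)));

  Aplus = V*Dplus*U';                % A^+ = Q^H * D^+ * P^H
  norm(Aplus - pinv(A))              % ~ 0: agrees with Matlab's pinv

  Q = V';
  Q*(Aplus*A)*Q'                     % ~ diag(1,1,0,0): the I^+ form in the Q basis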

2.2.1.3 Theorem 2: Minimal solution to \(A\vec {x}=\vec {b}\) and the pseudo inverse \(A^{+}\)

If \(\hat {x}=A^{+}\vec {b}\) then \(\hat {x}\) is the minimal solution of \(A\vec {x}=\vec {b}\)

proof:

Let \(c=P^{H}b\) and \(y=Qx\).

\begin {align*} \rho & =\inf _{x}\left \Vert Ax-b\right \Vert _{2}\\ & =\inf _{x}\left \Vert PDQx-b\right \Vert _{2}\\ & =\inf _{x}\left \Vert P^{H}PDQx-P^{H}b\right \Vert _{2}\\ & =\inf _{x}\left \Vert DQx-P^{H}b\right \Vert _{2}\\ & =\inf _{x}\left \Vert Dy-c\right \Vert _{2} \end {align*}

Since \(D=\begin {pmatrix} \sigma _{1} & & & \\ & \ddots & & \\ & & \sigma _{r} & \\ & & & 0 \end {pmatrix} \) then \[ \left \Vert Dy-c\right \Vert _{2}^{2}=\sum _{i=1}^{r}\left (\sigma _{i}y_{i}-c_{i}\right ) ^{2}+\sum _{i=r+1}^{m}c_{i}^{2}\]

This is minimum when \(y_{i}=\frac {c_{i}}{\sigma _{i}}\), for \(i=1\cdots r\), since this choice makes the first term in the RHS above go to zero. Hence

\[ \rho =\sqrt {\sum _{i=r+1}^{m}c_{i}^{2}}\]

The minimal norm vector \(\vec {y}\) which achieves this minimum has \(y_{r+1}=y_{r+2}=\cdots =y_{n}=0\), hence \[ y=D^{+}c \]

But since \(y=Qx\), we have \(D^{+}c=Qx\), or \(x=Q^{H}D^{+}c\); and since \(c=P^{H}b\), \begin {align*} x & =Q^{H}D^{+}P^{H}b\\ & =A^{+}b \end {align*}

Hence \[ A^{+}=Q^{H}D^{+}P^{H}\]
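
As a numerical illustration of the theorem (a sketch; \(A\) and \(b\) below are made-up test data, and the backslash solution is used only as a second, non-minimal solution to compare against):

  A = [1 1 0;
       0 1 1];              % 2x3: A*x = b has infinitely many solutions
  b = [1; 2];

  x_min   = pinv(A)*b;      % minimal 2-norm solution, x = A^+ * b
  x_other = A\b;            % another particular solution (basic solution)

  norm(A*x_min - b)             % ~ 0: solves the system
  norm(A*x_other - b)           % ~ 0: also solves the system
  [norm(x_min) norm(x_other)]   % the pinv solution has the smaller norm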

2.2.1.4 Theorem 3: Penrose properties for any matrix \(A\)

For any \(m\times n\) matrix \(A\) there is at least one matrix \(X\) with the following 4 properties

  1. \(AXA=A\)

  2. \(XAX=X\)

  3. \(\left (AX\right ) ^{H}=AX\)

  4. \(\left (XA\right ) ^{H}=XA\)

We need to show that the matrix \(X\) is unique (with respect to \(A\)). The proof is by contradiction: assume that \(A\) has associated with it two matrices \(X\) and \(Y\) with the above properties; we will show that \(X=Y\), hence \(X\) is unique for \(A\).

The proof starts by writing \(X=XAX\) and, through a number of substitution steps using the above four properties, ends with \(Y\), as follows

\begin {align*} X & =X\overset {AYA}{\overbrace {A}}X\\ & =XAYAX\\ & =X\overset {AYA}{\overbrace {A}}Y\overset {AYA}{\overbrace {A}}X\\ & =XAYAYAYAX\\ & =\left (XA\right ) ^{H}\left (YA\right ) ^{H}Y\left (AY\right ) ^{H}\left (AX\right ) ^{H}\ \ \ \ \text {using properties 3, 4}\\ & =A^{H}X^{H}A^{H}Y^{H}YY^{H}A^{H}X^{H}A^{H}\\ & =\left (AXA\right ) ^{H}Y^{H}YY^{H}\left (AXA\right ) ^{H}\\ & =A^{H}Y^{H}YY^{H}A^{H}\ \ \ \ \text {property 1}\\ & =\left (YA\right ) ^{H}Y\left (AY\right ) ^{H}\\ & =YAYAY\ \ \ \text {properties 4 and 3}\\ & =YAY\ \ \ \text {property 2}\\ & =Y\ \ \ \text {property 2} \end {align*}

Hence \(X=Y\), so \(X\) is unique.

2.2.1.5 Theorem 4: \(A^{+}\) satisfies Penrose properties hence unique

Theorem: The pseudo inverse of a matrix satisfies the four Penrose properties; hence each matrix has a unique pseudo inverse \(A^{+}\)

proof: Let \(A\) be any matrix, its SVD is \(A=PDQ\), and we showed before that \(A^{+}=Q^{H}D^{+}P^{H}\)

note: \(DD^{+}D=D\) and \(D^{+}DD^{+}=D^{+}\) (both follow directly from the diagonal forms of \(D\) and \(D^{+}\))

Let us now show that each property  of Penrose is satisfied. In this case \(X\) is our \(A^{+}\). Hence for property 1 we need to show that \(AA^{+}A=A.\)

\begin {align*} AA^{+}A & =PDQ\overset {A^{+}}{\overbrace {Q^{H}D^{+}P^{H}}}PDQ\\ & =PDD^{+}P^{H}PDQ\\ & =PDD^{+}DQ\\ & =PDQ\\ & =A \end {align*}

To show property 2, we need to show that \(A^{+}AA^{+}=A^{+}\)

\begin {align*} A^{+}AA^{+} & =\left (Q^{H}D^{+}P^{H}\right ) A\left (Q^{H}D^{+}P^{H}\right ) \\ & =\left (Q^{H}D^{+}P^{H}\right ) PDQ\left (Q^{H}D^{+}P^{H}\right ) \\ & =\left (Q^{H}D^{+}\right ) D\left (D^{+}P^{H}\right ) \\ & =Q^{H}D^{+}DD^{+}P^{H}\\ & =Q^{H}I^{+}D^{+}P^{H}\\ & =Q^{H}D^{+}P^{H}\\ & =A^{+} \end {align*}

To show property 3, need to show \(\left (AA^{+}\right ) ^{H}=AA^{+}\)

\begin {align*} \left (AA^{+}\right ) ^{H} & =\left (\left (PDQ\right ) \left ( Q^{H}D^{+}P^{H}\right ) \right ) ^{H}\\ & =\left (PDD^{+}P^{H}\right ) ^{H}\\ & =P\left (DD^{+}\right ) ^{H}P^{H}\\ & =PDD^{+}P^{H}\ \ \ \ \text {since }DD^{+}\text { is real and symmetric (it has the }I^{+}\text { form)}\\ & =\left (PDQ\right ) \left (Q^{H}D^{+}P^{H}\right ) \\ & =AA^{+} \end {align*}

For property (4) need to show \(\left (A^{+}A\right ) ^{H}=A^{+}A\)

\begin {align*} \left (A^{+}A\right ) ^{H} & =\left (\left (Q^{H}D^{+}P^{H}\right ) \left (PDQ\right ) \right ) ^{H}\\ & =\left (Q^{H}D^{+}DQ\right ) ^{H}\\ & =Q^{H}\left (D^{+}D\right ) ^{H}Q\\ & =Q^{H}D^{+}DQ\ \ \ \ \text {since }D^{+}D\text { is real and symmetric}\\ & =\left (Q^{H}D^{+}P^{H}\right ) \left (PDQ\right ) \\ & =A^{+}A \end {align*}

Hence the pseudo inverse \(A^{+}\) satisfies the Penrose properties, and is therefore unique.
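
The four Penrose properties are easy to check numerically (a sketch; the rank-deficient test matrix is made up, and \(X\) is taken to be Matlab's pinv(A)):

  A = randn(3,5);  A(3,:) = A(1,:) + A(2,:);   % a rank-deficient 3x5 matrix
  X = pinv(A);                                 % candidate for A^+

  norm(A*X*A - A)           % property 1: A X A = A      (~ 0)
  norm(X*A*X - X)           % property 2: X A X = X      (~ 0)
  norm((A*X)' - A*X)        % property 3: (A X)^H = A X  (~ 0)
  norm((X*A)' - X*A)        % property 4: (X A)^H = X A  (~ 0)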

2.2.1.6 Theorem 5: On SVD properties

Let \(A\) have SVD \(A=PDQ\); then the following are true

  1. rank of \(A\) is \(r\)

  2. \(\left \{ \vec {u}_{r+1},\vec {u}_{r+2},\cdots ,\vec {u}_{n}\right \} \) is an orthonormal basis for the null space of \(A\)

  3. \(\left \{ \vec {v}_{1},\vec {v}_{2},\cdots ,\vec {v}_{r}\right \} \) is an orthonormal basis for the range of \(A\)

  4. \(\left \Vert A\right \Vert _{2}=\max _{1\leq i\leq n}\left \vert \sigma _{i}\right \vert \)

proof:

  1. Since \(A=PDQ\) and since \(P,Q\) are invertible matrices (by construction they are unitary), the rank of \(A\) equals the rank of \(D\). But \(D\) is all zeros except for the values \(\sigma _{i}\), each on a separate column and a separate row. Since there are \(r\) non-zero values of those, \(D\) has \(r\) linearly independent columns. Hence the rank of \(A\) is \(r\).

  2. Since \(A\vec {u}_{i}=\vec {0}\) for each vector \(i=\left (r+1\right ) \cdots n\), and these \(\vec {u}_{i}\) vectors form an orthonormal set of size \(n-r\), and since the dimension of the null-space of \(A\) is \(n-r\), they span the whole null-space of \(A\). Hence they can be used as a basis for the null space of \(A\) (recall, the null-space of \(A\) is the space in which \(A\vec {x}=\vec {0}\))

  3. Since \(\vec {v}_{i}=\frac {A\vec {u}_{i}}{\sigma _{i}}\neq \vec {0}\), for \(i=1\cdots r\), and these \(\vec {v}_{i}\) vectors form an orthonormal set, and since the dimension of the range of \(A\) is \(r\), they span the whole range of \(A\). Hence they can be used as a basis for the range of \(A\). Another proof of this is to use the construction of the \(\vec {v}_{i}\) vectors from \(AA^{H}\), but we did not use this method, so I will leave this out for now.

  4. \begin {align*} \left \Vert A\right \Vert _{2} & =\sup _{\left \Vert x\right \Vert _{2}=1}\left \{ \left \Vert A\vec {x}\right \Vert _{2}\right \} \\ & =\sup _{\left \Vert x\right \Vert _{2}=1}\left \Vert PDQ\vec {x}\right \Vert _{2}\\ & =\sup _{\left \Vert y\right \Vert _{2}=1}\left \Vert D\vec {y}\right \Vert _{2}\\ & =\sup _{\left \Vert y\right \Vert _{2}=1}\sqrt {\sum _{i=1}^{r}\left ( \sigma _{i}y_{i}\right ) ^{2}}\\ & =\max _{1\leq i\leq r}\ \left \vert \sigma _{i}\right \vert \end {align*}
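
These properties can also be checked numerically (a sketch with a made-up \(4\times 3\) matrix of rank 2):

  A = [2 0 0;
       0 0 0;
       0 0 3;
       0 0 0];                    % 4x3, rank 2

  [U,S,V] = svd(A);               % A = U*S*V', so Q = V'
  s = diag(S);                    % singular values

  rank(A)                         % property 1: rank = number of non-zero sigma_i = 2
  norm(A*V(:,3))                  % property 2: last column of V (last row of Q) is in null(A), ~ 0
  norm(A) - max(s)                % property 4: ||A||_2 = largest singular value, ~ 0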

2.2.1.7 Theorem 6: Reduced (or economical) SVD

Recall that in the above standard SVD, when we build the \(P\) matrix we used the first \(r\) vectors \(\vec {u}_{i}\) to generate \(\vec {v}_{i}\) by computing \(\vec {v}_{i}=\frac {A\vec {u}_{i}}{\sigma _{i}}\); then we used these \(\vec {v}_{i}\) as the columns of \(P\). But since \(P\) was of size \(m\times m\), we had to come up with \(m-r\) more orthogonal vectors to fill in the \(P\) matrix. In the economical SVD, we stop when we have filled in \(r\) vectors in \(P\), hence \(P\) will be of size \(m\times r\). Similarly, for \(D\) we stop at the \(r\)-th singular value, hence \(D\) will be a diagonal matrix of size \(r\times r\). And for the \(Q\) matrix we also keep only the first \(r\) orthonormal vectors \(\vec {u}_{i}\)

Hence the economical \(SVD\) follows the same steps as the normal SVD, except that we stop when we obtain \(r\) basis vectors. The matrices will therefore be smaller, hence the name economical. The factorization will then be \(A_{m\times n}=P_{m\times r}D_{r\times r}Q_{r\times n}\) (for the case when \(m\geq n\geq r\))

I am not sure in what applications one would need to generate the full SVD vs. the economical SVD. But in Matlab there is an option to select either factorization: type svd(A,'econ').
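
A quick comparison of the two forms in Matlab (a sketch; note that Matlab's 'econ' option keeps \(\min (m,n)\) columns rather than stopping exactly at the rank \(r\), but the idea is the same):

  A = randn(6,3);                  % a tall test matrix, m > n

  [U1,S1,V1] = svd(A);             % full SVD:    U1 is 6x6, S1 is 6x3
  [U2,S2,V2] = svd(A,'econ');      % economy SVD: U2 is 6x3, S2 is 3x3

  size(U1), size(S1)
  size(U2), size(S2)
  norm(A - U2*S2*V2')              % still reproduces A, ~ 0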

2.2.1.8 Theorem 7: On orthonormal Bases

Let \(L\) be a linear transformation from \(\mathbb {C} ^{m}\) to \(\mathbb {C} ^{n}\); then there are orthonormal bases \(\vec {u}_{1},\vec {u}_{2},\cdots ,\vec {u}_{m}\) for \(\mathbb {C} ^{m}\) and \(\vec {v}_{1},\vec {v}_{2},\cdots ,\vec {v}_{n}\) for \(\mathbb {C} ^{n}\) s.t. \(L\vec {u}_{i}=\left \{ \begin {array} [c]{ccc}\sigma _{i}\vec {v}_{i} & & 1\leq i\leq \min \left (m,n\right ) \\ 0 & & \min \left (m,n\right ) <i\leq m \end {array} \right . \)

proof: The proof as given in the textbook for this theorem is clear enough as outlined. Please see page 295.

2.2.2 Homework Solution for section 5.4

   2.2.2.1 Problem section 5.4, 2(a)
   2.2.2.2 Problem 5.4, 10
   2.2.2.3 Problem 5.4, 23
   2.2.2.4 Problem 5.4, 34
   2.2.2.5 Problem 5.4, 39

2.2.2.1 Problem section 5.4, 2(a)

question: Find the minimal solution of \(x_{1}+x_{2}=b\)

answer:

First write the problem as \[\begin {bmatrix} 1 & 1 \end {bmatrix}\begin {bmatrix} x_{1}\\ x_{2}\end {bmatrix} =\left [ b\right ] \]

Minimal solution is \(\vec {x}=A^{+}b\), so we need to find \(A^{+}\). Find \(A=PDQ\), then \(A^{+}=Q^{H}D^{+}P^{H}\)

First find the set of \(\vec {u}_{i}\) vectors to go to the \(Q\) matrix. I will use the economical SVD method.

\begin {align*} A^{H}A & =\begin {bmatrix} 1\\ 1 \end {bmatrix}\begin {bmatrix} 1 & 1 \end {bmatrix} \\ & =\begin {bmatrix} 1 & 1\\ 1 & 1 \end {bmatrix} \end {align*}

Hence \(r=1\)

Hence \(\left \vert A^{H}A-\lambda I\right \vert =\begin {vmatrix} 1-\lambda & 1\\ 1 & 1-\lambda \end {vmatrix} =0\rightarrow \left (1-\lambda \right ) ^{2}-1=0\rightarrow 1+\lambda ^{2}-2\lambda -1=0\rightarrow \lambda \left (\lambda -2\right ) =0\)

Hence \(\lambda _{1}=2,\lambda _{2}=0\rightarrow \sigma _{1}=\sqrt {2},\sigma _{2}=0\)

Find eigenvectors \(\vec {u}_{1},\vec {u}_{2}\).

For \(\lambda _{1}=2\rightarrow \begin {bmatrix} 1-2 & 1\\ 1 & 1-2 \end {bmatrix}\begin {bmatrix} y_{1}\\ y_{2}\end {bmatrix} =\begin {bmatrix} 0\\ 0 \end {bmatrix} \rightarrow \begin {bmatrix} -1 & 1\\ 1 & -1 \end {bmatrix}\begin {bmatrix} y_{1}\\ y_{2}\end {bmatrix} =\begin {bmatrix} 0\\ 0 \end {bmatrix} \)

\(\rightarrow \begin {bmatrix} -1 & 1\\ 0 & 0 \end {bmatrix}\begin {bmatrix} y_{1}\\ y_{2}\end {bmatrix} =\begin {bmatrix} 0\\ 0 \end {bmatrix} \Rightarrow -y_{1}+y_{2}=0\Rightarrow \vec {u}_{1}=\begin {bmatrix} 1\\ 1 \end {bmatrix} \overset {\text {normalize\ norm 2}}{\rightarrow }\begin {bmatrix} \frac {1}{\sqrt {2}}\\ \frac {1}{\sqrt {2}}\end {bmatrix} \)

For \(\lambda _{2}=0\rightarrow \begin {bmatrix} 1-0 & 1\\ 1 & 1-0 \end {bmatrix}\begin {bmatrix} y_{1}\\ y_{2}\end {bmatrix} =\begin {bmatrix} 0\\ 0 \end {bmatrix} \rightarrow \begin {bmatrix} 1 & 1\\ 1 & 1 \end {bmatrix}\begin {bmatrix} y_{1}\\ y_{2}\end {bmatrix} =\begin {bmatrix} 0\\ 0 \end {bmatrix} \)

\(\rightarrow \begin {bmatrix} 1 & 1\\ 0 & 0 \end {bmatrix}\begin {bmatrix} y_{1}\\ y_{2}\end {bmatrix} =\begin {bmatrix} 0\\ 0 \end {bmatrix} \Rightarrow y_{1}+y_{2}=0\Rightarrow \vec {u}_{2}=\begin {bmatrix} -1\\ 1 \end {bmatrix} \overset {\text {normalize\ norm 2}}{\rightarrow }\begin {bmatrix} \frac {-1}{\sqrt {2}}\\ \frac {1}{\sqrt {2}}\end {bmatrix} \)

Hence \(Q=\begin {bmatrix} \vec {u}_{1}^{T}\\ \vec {u}_{2}^{T}\end {bmatrix} =\begin {bmatrix} 1 & 1\\ -1 & 1 \end {bmatrix} \frac {1}{\sqrt {2}}\)

Now to find the \(P\) matrix. \(AA^{H}=\begin {bmatrix} 1 & 1 \end {bmatrix}\begin {bmatrix} 1\\ 1 \end {bmatrix} =\left [ 2\right ] \)

The eigenvalue is \(2-\lambda =0\rightarrow \lambda =2\). The eigenvector satisfies \(\left (2-2\right ) y_{1}=0\rightarrow y_{1}=\) anything, so after normalizing take \(\left [ 1\right ] \)

Hence the \(P\) matrix is \(\left [ 1\right ] \)

The \(D\) matrix is \(m\times n\), hence \(1\times 2\), then \(D=\begin {bmatrix} \sigma _{1} & 0 \end {bmatrix} \)

Hence this completes the SVD. We have that \begin {align*} \begin {bmatrix} 1 & 1 \end {bmatrix} & =\left [ 1\right ] \begin {bmatrix} \sigma _{1} & 0 \end {bmatrix}\begin {bmatrix} 1 & 1\\ -1 & 1 \end {bmatrix} \frac {1}{\sqrt {2}}\\ & =\left [ 1\right ] \begin {bmatrix} \sqrt {2} & 0 \end {bmatrix}\begin {bmatrix} 1 & 1\\ -1 & 1 \end {bmatrix} \frac {1}{\sqrt {2}}\\ & =\begin {bmatrix} \sqrt {2} & 0 \end {bmatrix}\begin {bmatrix} 1 & 1\\ -1 & 1 \end {bmatrix} \frac {1}{\sqrt {2}}\\ & =\begin {bmatrix} \sqrt {2} & \sqrt {2}\end {bmatrix} \frac {1}{\sqrt {2}}\\ & =\begin {bmatrix} 1 & 1 \end {bmatrix} \end {align*}

So the SVD is verified. Now find \begin {align*} A^{+} & =Q^{H}D^{+}P^{H}\\ & =\begin {bmatrix} 1 & 1\\ -1 & 1 \end {bmatrix} ^{H}\frac {1}{\sqrt {2}}\begin {bmatrix} \sqrt {2} & 0 \end {bmatrix} ^{+}\left [ 1\right ] ^{H}\\ & =\frac {1}{\sqrt {2}}\begin {bmatrix} 1 & -1\\ 1 & 1 \end {bmatrix}\begin {bmatrix} \frac {1}{\sqrt {2}}\\ 0 \end {bmatrix} \left [ 1\right ] \end {align*}

Notice that \(D^{+}\) means we take the transpose of \(D\) and then take the reciprocal of each non-zero entry. Hence if \(D\) is \(m\times n\) then \(D^{+}\) is \(n\times m\)

\begin {align*} A^{+} & =\frac {1}{\sqrt {2}}\begin {bmatrix} \frac {1}{\sqrt {2}}\\ \frac {1}{\sqrt {2}}\end {bmatrix} \\ & =\begin {bmatrix} \frac {1}{2}\\ \frac {1}{2}\end {bmatrix} \end {align*}

Hence \begin {align*} \hat {x} & =A^{+}b\\ & =\begin {bmatrix} \frac {1}{2}\\ \frac {1}{2}\end {bmatrix} b \end {align*}

So the minimal solution is \(x_{1}=\frac {b}{2},x_{2}=\frac {b}{2}\)
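
This agrees with a direct Matlab check (a sketch; the value of \(b\) is arbitrary):

  A = [1 1];
  b = 3;              % any right hand side

  x = pinv(A)*b       % minimal solution: [1.5; 1.5], i.e. [b/2; b/2]
  A*x                 % reproduces b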

2.2.2.2 Problem 5.4, 10

Problem: Prove the following properties of \(A^{+}\)

a)\(A^{++}=A\)

b)\(A^{+H}=A^{H+}\)

answer:

a)\(\left (A^{+}\right ) ^{+}=\left (\left (PDQ\right ) ^{+}\right ) ^{+}=\left (Q^{H}D^{+}P^{H}\right ) ^{+}=\left (P^{H}\right ) ^{H}\left ( D^{+}\right ) ^{+}\left (Q^{H}\right ) ^{H}\)

But \(\left (P^{H}\right ) ^{H}=P\) and \(\left (Q^{H}\right ) ^{H}=Q\), and the reciprocal of a reciprocal gives us back the original value, hence \(\left ( D^{+}\right ) ^{+}=D\)

Hence we have \(\left (P^{H}\right ) ^{H}\left (D^{+}\right ) ^{+}\left ( Q^{H}\right ) ^{H}=PDQ=A\)

b)\begin {align} \left (A^{+}\right ) ^{H} & =\left (Q^{H}D^{+}P^{H}\right ) ^{H}\nonumber \\ & =\left (P^{H}\right ) ^{H}\left (D^{+}\right ) ^{H}\left (Q^{H}\right ) ^{H}\nonumber \\ & =P\left (D^{+}\right ) ^{H}Q \tag {1} \end {align}

and \begin {align} A^{H+} & =\left (\left (PDQ\right ) ^{H}\right ) ^{+}\nonumber \\ & =\left (Q^{H}D^{H}P^{H}\right ) ^{+}\nonumber \\ & =\left (P^{H}\right ) ^{H}\left (D^{H}\right ) ^{+}\left (Q^{H}\right ) ^{H}\nonumber \\ & =P\left (D^{H}\right ) ^{+}Q \tag {2} \end {align}

Hence (1)=(2) if \(D^{H+}=D^{+H}\). But this is the case: both \(D^{H+}\) and \(D^{+H}\) equal the \(m\times n\) diagonal matrix whose first \(r\) diagonal entries are \(\frac {1}{\sigma _{i}}\) and whose remaining entries are zero.
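
Both identities are easy to verify numerically (a sketch with a made-up rank-2 test matrix):

  A = randn(4,2)*randn(2,5);       % a 4x5 matrix of rank 2

  norm(pinv(pinv(A)) - A)          % part (a): (A^+)^+ = A        (~ 0)
  norm(pinv(A)' - pinv(A'))        % part (b): (A^+)^H = (A^H)^+  (~ 0)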

2.2.2.3 Problem 5.4, 23

Problem: Let \(A\) be a square matrix with SVD \(PDQ\). Prove that the characteristic equation of \(A\) can be written as \(\pm \det \left (D-\lambda P^{H}Q^{H}\right ) =0\)

answer:

Let the characteristic polynomial of \(A\) be called \(\Delta \), hence we write\begin {align} \Delta & =\det \left (A-\lambda I\right ) \nonumber \\ & =\det \left (PDQ-\lambda I\right ) \nonumber \\ & =\det \left (P\left (DQ-\lambda P^{-1}\right ) \right ) \nonumber \\ & =\det \left (P\left (D-\lambda P^{-1}Q^{-1}\right ) Q\right ) \nonumber \\ & =\det \relax (P) \det \left (D-\lambda P^{-1}Q^{-1}\right ) \det \relax (Q) \nonumber \end {align}

But \(Q^{-1}=Q^{H}\) and \(P^{-1}=P^{H}\) hence the above becomes

\[ \Delta =\det \relax (P) \det \left (D-\lambda P^{H}Q^{H}\right ) \det \relax (Q) \]

Now \(\Delta =0\), and we also know that \(\det \left (P\right ) \neq 0\) and \(\det \left (Q\right ) \neq 0\), because \(P\) and \(Q\) are unitary matrices. Hence this means that \[ \det \left (D-\lambda P^{H}Q^{H}\right ) =0 \]
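
Numerically, \(\det \left (D-\lambda P^{H}Q^{H}\right ) =0\) is a generalized eigenvalue problem, which Matlab's two-argument eig solves, so the identity can be checked like this (a sketch; the triangular test matrix is chosen so that its eigenvalues 1, 5, 9 are obvious):

  A = [1 2 3;
       0 5 6;
       0 0 9];                        % eigenvalues are 1, 5, 9

  [U,S,V] = svd(A);                   % A = U*S*V', so P = U and Q = V'

  lam1 = sort(eig(A));                % roots of det(A - lambda*I)
  lam2 = sort(real(eig(S, U'*V)));    % roots of det(D - lambda*P^H*Q^H)
  norm(lam1 - lam2)                   % ~ 0: same characteristic roots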

2.2.2.4 Problem 5.4, 34

problem: prove that if \(A\) is symmetric then so is \(A^{+}\)

answer:

Assuming a complex matrix, symmetric here means \(A=A^{H}\) (Hermitian). Hence \begin {align} PDQ & =\left (PDQ\right ) ^{H}\nonumber \\ & =Q^{H}D^{H}P^{H}\nonumber \\ & =Q^{H}DP^{H} \tag {1} \end {align}

Hence \(PDQ=Q^{H}DP^{H}\rightarrow DQ=P^{-1}Q^{H}DP^{H}\rightarrow D=P^{-1}Q^{H}DP^{H}Q^{-1}\)

But since \(P^{H}=P^{-1}\) and \(Q^{H}=Q^{-1}\) then the above becomes

\begin {equation} D=P^{H}Q^{H}DP^{H}Q^{H} \tag {2} \end {equation}

now \[ A^{+}=Q^{H}D^{+}P^{H}\]

Sub (2) into the above equation we obtain

\begin {align*} A^{+} & =Q^{H}\left (P^{H}Q^{H}DP^{H}Q^{H}\right ) ^{+}P^{H}\\ & =Q^{H}\left (\left (P^{H}Q^{H}\right ) ^{H}D^{+}\left (P^{H}Q^{H}\right ) ^{H}\right ) P^{H}\\ & =Q^{H}\left (\left (QP\right ) D^{+}\left (QP\right ) \right ) P^{H}\\ & =Q^{H}QPD^{+}QPP^{H} \end {align*}

But \(Q^{H}Q=I\) and \(PP^{H}=I\)

Hence the above becomes

\begin {equation} A^{+}=PD^{+}Q \tag {3} \end {equation}

But \begin {align} \left (A^{+}\right ) ^{H} & =\left (Q^{H}D^{+}P^{H}\right ) ^{H}\nonumber \\ & =PD^{+}Q \tag {4} \end {align}

Compare (3) and (4), they are the same.

Hence \(A^{+}\) is symmetric.
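
A quick numerical check (a sketch; the test matrix is a made-up singular symmetric matrix):

  A = randn(4,2);  A = A*A';          % symmetric (and rank deficient) 4x4 matrix

  norm(pinv(A) - pinv(A)')            % ~ 0: A^+ is symmetric as well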

2.2.2.5 Problem 5.4, 39

problem: prove that the eigenvalues of a positive semi-definite matrix are nonnegative

answer:

Positive semi-definite means \(\vec {x}^{T}A\vec {x}\geq 0\) for all \(\vec {x}\neq \vec {0}\)

Let \(\lambda \) be an eigenvalue of \(A\) with eigenvector \(\vec {x}\neq \vec {0}\), so that \(A\vec {x}=\lambda \vec {x}\). Hence \(\vec {x}^{T}A\vec {x}=\vec {x}^{T}\lambda \vec {x}=\lambda \vec {x}^{T}\vec {x}\)

But \(\vec {x}^{T}\vec {x}=\left \Vert \vec {x}\right \Vert ^{2}\)

Hence \(\vec {x}^{T}A\vec {x}=\lambda \left \Vert \vec {x}\right \Vert ^{2}\)

Since \(A\) is positive semi-definite, the above is \(\geq 0\). An eigenvector satisfies \(\vec {x}\neq \vec {0}\), so \(\left \Vert \vec {x}\right \Vert ^{2}>0\), and \(\lambda \left \Vert \vec {x}\right \Vert ^{2}\geq 0\) then forces \(\lambda \geq 0\): a negative \(\lambda \) multiplied by a positive quantity cannot give a nonnegative quantity. Hence every eigenvalue is either positive or zero, i.e. nonnegative.
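
A quick numerical illustration (a sketch; \(B^{H}B\) is used only as a convenient way to manufacture a positive semi-definite test matrix):

  B = randn(3,5);
  A = B'*B;            % 5x5 positive semi-definite matrix, rank <= 3

  min(eig(A))          % >= 0 up to round-off (may print a tiny negative like -1e-16)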

2.2.3 code

This is some code in Matlab
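
The following is a minimal sketch of the kind of Matlab computation this lecture describes (not the original listing): building the SVD of a small matrix "by hand" from the eigenvectors of \(A^{H}A\), using the textbook method of step 6.1 above for \(P\), and comparing the result with Matlab's built-in svd. The example matrix is made up.

  A = [1 1;
       0 1;
       1 0];                           % 3x2 example, rank 2

  [Vq, Lam] = eig(A'*A);               % eigenvectors/eigenvalues of A^H A  (the u_i)
  [lam, idx] = sort(diag(Lam), 'descend');
  Vq  = Vq(:, idx);                    % reorder so non-zero eigenvalues come first
  sig = sqrt(max(lam, 0));             % singular values

  r = rank(A);
  D = zeros(size(A));  D(1:r,1:r) = diag(sig(1:r));
  Q = Vq';                             % rows of Q are the u_i^H

  P = zeros(size(A,1));
  for i = 1:r
      P(:,i) = A*Vq(:,i)/sig(i);       % v_i = A*u_i / sigma_i
  end
  P(:,r+1:end) = null(P(:,1:r)');      % fill in the remaining orthonormal columns

  norm(A - P*D*Q)                      % ~ 0: the factorization holds
  norm(svd(A) - sig)                   % ~ 0: same singular values as Matlab's svd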