5.3 Example using conjugate directions

  5.3.1 First method, Direct calculus
  5.3.2 Second method, basic Conjugate direction
  5.3.3 Third method, Conjugate gradient
  5.3.4 Fourth method, Conjugate gradient using Fletcher-Reeves
  5.3.5 Fifth method, Conjugate gradient using Polak-Ribière

This example is solved in a number of ways. Consider the quadratic function \(J\left ( x_{1},x_{2}\right ) =\frac{1}{2}x^{T}Ax+b^{T}x\) where \(x=\begin{bmatrix} x_{1}\\ x_{2}\end{bmatrix} \), with \(A=\begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix} \) and \(b=\begin{bmatrix} -1\\ 1 \end{bmatrix} \). The goal is to find \(x^{\ast }\), the point which minimizes \(J\left ( x\right ) \).

5.3.1 First method, Direct calculus

\begin{align*} \nabla J\left ( x\right ) & =0\\ Ax+b & =\begin{bmatrix} 0\\ 0 \end{bmatrix} \\\begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} x_{1}\\ x_{2}\end{bmatrix} +\begin{bmatrix} -1\\ 1 \end{bmatrix} & =\begin{bmatrix} 0\\ 0 \end{bmatrix} \\\begin{bmatrix} 4x_{1}+2x_{2}-1\\ 2x_{1}+2x_{2}+1 \end{bmatrix} & =\begin{bmatrix} 0\\ 0 \end{bmatrix} \end{align*}

Solving gives \[ x^{\ast }=\begin{bmatrix} x_{1}\\ x_{2}\end{bmatrix} =\begin{bmatrix} 1\\ -\frac{3}{2}\end{bmatrix} \]
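As a quick numerical check, the stationarity condition \(Ax+b=0\) can be solved directly as a linear system (a short NumPy sketch; the variable names are mine):

```python
import numpy as np

# The quadratic J(x) = 1/2 x^T A x + b^T x from the example
A = np.array([[4.0, 2.0],
              [2.0, 2.0]])
b = np.array([-1.0, 1.0])

# grad J = A x + b = 0  =>  solve the linear system A x = -b
x_star = np.linalg.solve(A, -b)   # x* = [1, -3/2]
```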

5.3.2 Second method, basic Conjugate direction

Since \(A\) is of size \(n=2\), this method will converge in \(2\) steps using conjugate directions. Let \(x^{0}=\begin{bmatrix} 0\\ 0 \end{bmatrix} \) and let the first direction be \[ v^{0}=\begin{bmatrix} 1\\ 0 \end{bmatrix} \] Then\[ h_{0}=\frac{-\left ( v^{0}\right ) ^{T}\nabla J\left ( x^{0}\right ) }{\left ( v^{0}\right ) ^{T}Av^{0}}=\frac{-\left ( v^{0}\right ) ^{T}\left ( Ax^{0}+b\right ) }{\left ( v^{0}\right ) ^{T}Av^{0}}=\frac{-\begin{bmatrix} 1 & 0 \end{bmatrix}\begin{bmatrix} -1\\ 1 \end{bmatrix} }{\begin{bmatrix} 1 & 0 \end{bmatrix}\begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} 1\\ 0 \end{bmatrix} }=\frac{1}{4}\] Hence\begin{align*} x^{1} & =x^{0}+h_{0}v^{0}\\ & =\begin{bmatrix} 0\\ 0 \end{bmatrix} +\frac{1}{4}\begin{bmatrix} 1\\ 0 \end{bmatrix} =\begin{bmatrix} \frac{1}{4}\\ 0 \end{bmatrix} \end{align*}

Second step. We need to find \(v^{1}\). Using the mutual conjugacy property with respect to \(A\), we solve for \(v^{1}\) using\begin{align*} \left ( v^{0}\right ) ^{T}Av^{1} & =0\\\begin{bmatrix} 1 & 0 \end{bmatrix}\begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} v_{1}\\ v_{2}\end{bmatrix} & =0\\ 4v_{1}+2v_{2} & =0 \end{align*}

Let \(v_{1}=1\) then \(v_{2}=-2\) and hence \[ v^{1}=\begin{bmatrix} 1\\ -2 \end{bmatrix} \] Now we find the next optimal step\[ h_{1}=\frac{-\left ( v^{1}\right ) ^{T}\nabla J\left ( x^{1}\right ) }{\left ( v^{1}\right ) ^{T}Av^{1}}=\frac{-\left ( v^{1}\right ) ^{T}\left ( Ax^{1}+b\right ) }{\left ( v^{1}\right ) ^{T}Av^{1}}=\frac{-\begin{bmatrix} 1 & -2 \end{bmatrix} \left ( \begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} \frac{1}{4}\\ 0 \end{bmatrix} +\begin{bmatrix} -1\\ 1 \end{bmatrix} \right ) }{\begin{bmatrix} 1 & -2 \end{bmatrix}\begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} 1\\ -2 \end{bmatrix} }=\frac{3}{4}\] Hence\begin{align*} x^{2} & =x^{1}+h_{1}v^{1}\\ & =\begin{bmatrix} \frac{1}{4}\\ 0 \end{bmatrix} +\frac{3}{4}\begin{bmatrix} 1\\ -2 \end{bmatrix} \\ & =\begin{bmatrix} 1\\ -\frac{3}{2}\end{bmatrix} \end{align*}

This is the \(x^{\ast }\) we found in the first method, reached in \(n=2\) steps as expected. In an implementation, we would check for convergence by evaluating \(\nabla J\left ( x^{2}\right ) \), which is\begin{align*} \nabla J\left ( x^{2}\right ) & =Ax^{2}+b\\ & =\begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} 1\\ -\frac{3}{2}\end{bmatrix} +\begin{bmatrix} -1\\ 1 \end{bmatrix} \\ & =\begin{bmatrix} 0\\ 0 \end{bmatrix} \end{align*}

As expected.
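The two steps above can be collected into a small routine that minimizes \(J\) along any list of \(A\)-conjugate directions (a sketch, assuming NumPy; the function name is mine):

```python
import numpy as np

A = np.array([[4.0, 2.0],
              [2.0, 2.0]])
b = np.array([-1.0, 1.0])

def minimize_along_directions(A, b, x0, directions):
    """Minimize J(x) = 1/2 x^T A x + b^T x with one exact
    line search along each of the given directions."""
    x = np.asarray(x0, dtype=float)
    for v in directions:
        g = A @ x + b                  # gradient at the current point
        h = -(v @ g) / (v @ A @ v)     # optimal step length along v
        x = x + h * v
    return x

# The pre-computed A-conjugate directions v0, v1 worked out above
v0 = np.array([1.0, 0.0])
v1 = np.array([1.0, -2.0])
x2 = minimize_along_directions(A, b, [0.0, 0.0], [v0, v1])   # [1, -3/2]
```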

5.3.3 Third method, Conjugate gradient

The difference here is that we find each \(v^{i}\) on the fly after each step, unlike the conjugate direction method, where the \(v^{i}\) are all pre-computed. Here we pick \(v^{0}=\nabla J\left ( x^{0}\right ) =\begin{bmatrix} -1\\ 1 \end{bmatrix} \), where \(x^{0}\) is the starting guess vector. (The conventional choice is the steepest-descent direction \(v^{0}=-\nabla J\left ( x^{0}\right ) \); with the \(\beta \) formula used below either sign yields conjugate directions, since \(\beta _{0}\) absorbs the sign.) First step

\[ h_{0}=\frac{-\left ( v^{0}\right ) ^{T}\nabla J\left ( x^{0}\right ) }{\left ( v^{0}\right ) ^{T}Av^{0}}=\frac{-\left ( v^{0}\right ) ^{T}\left ( Ax^{0}+b\right ) }{\left ( v^{0}\right ) ^{T}Av^{0}}=\frac{-\begin{bmatrix} -1 & 1 \end{bmatrix}\begin{bmatrix} -1\\ 1 \end{bmatrix} }{\begin{bmatrix} -1 & 1 \end{bmatrix}\begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} -1\\ 1 \end{bmatrix} }=\frac{-2}{2}=-1 \] Hence\begin{align*} x^{1} & =x^{0}+h_{0}v^{0}\\ & =\begin{bmatrix} 0\\ 0 \end{bmatrix} -1\begin{bmatrix} -1\\ 1 \end{bmatrix} =\begin{bmatrix} 1\\ -1 \end{bmatrix} \end{align*}

Now we find the mutually conjugate direction \(v^{1}\) as follows\begin{align*} \beta _{0} & =\frac{\left ( \nabla J\left ( x^{1}\right ) \right ) ^{T}Av^{0}}{\left ( v^{0}\right ) ^{T}Av^{0}}=\frac{\left ( Ax^{1}+b\right ) ^{T}\left ( Av^{0}\right ) }{\left ( v^{0}\right ) ^{T}Av^{0}}=\frac{\left ( \begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} 1\\ -1 \end{bmatrix} +\begin{bmatrix} -1\\ 1 \end{bmatrix} \right ) ^{T}\left ( \begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} -1\\ 1 \end{bmatrix} \right ) }{\begin{bmatrix} -1 & 1 \end{bmatrix}\begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} -1\\ 1 \end{bmatrix} }\\ & =\frac{\begin{bmatrix} 1\\ 1 \end{bmatrix} ^{T}\begin{bmatrix} -2\\ 0 \end{bmatrix} }{2}=\frac{-2}{2}=-1 \end{align*}

Hence\begin{align*} v^{1} & =-\nabla J\left ( x^{1}\right ) +\beta _{0}v^{0}\\ & =-\left ( \begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} 1\\ -1 \end{bmatrix} +\begin{bmatrix} -1\\ 1 \end{bmatrix} \right ) -\left ( 1\right ) \begin{bmatrix} -1\\ 1 \end{bmatrix} \\ & =\begin{bmatrix} 0\\ -2 \end{bmatrix} \end{align*}

Now that we found \(v^{1}\), we repeat the process. \[ h_{1}=\frac{-\left ( v^{1}\right ) ^{T}\nabla J\left ( x^{1}\right ) }{\left ( v^{1}\right ) ^{T}Av^{1}}=\frac{-\left ( v^{1}\right ) ^{T}\left ( Ax^{1}+b\right ) }{\left ( v^{1}\right ) ^{T}Av^{1}}=\frac{-\begin{bmatrix} 0 & -2 \end{bmatrix} \left ( \begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} 1\\ -1 \end{bmatrix} +\begin{bmatrix} -1\\ 1 \end{bmatrix} \right ) }{\begin{bmatrix} 0 & -2 \end{bmatrix}\begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} 0\\ -2 \end{bmatrix} }=\frac{2}{8}=\frac{1}{4}\] Hence\begin{align*} x^{2} & =x^{1}+h_{1}v^{1}\\ & =\begin{bmatrix} 1\\ -1 \end{bmatrix} +\left ( \frac{1}{4}\right ) \begin{bmatrix} 0\\ -2 \end{bmatrix} \\ & =\begin{bmatrix} 1\\ -\frac{3}{2}\end{bmatrix} \end{align*}

This is the same result as the conjugate direction method, converging in \(2\) steps as well.
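The iteration above can be written as a short routine, computing each new direction on the fly with \(\beta _{k}=\nabla J\left ( x^{k+1}\right ) ^{T}Av^{k}/\left ( v^{k}\right ) ^{T}Av^{k}\) (a sketch, assuming NumPy; names are mine):

```python
import numpy as np

A = np.array([[4.0, 2.0],
              [2.0, 2.0]])
b = np.array([-1.0, 1.0])

def conjugate_gradient(A, b, x0, tol=1e-12):
    """Conjugate gradient for J(x) = 1/2 x^T A x + b^T x, using
    beta_k = g_{k+1}^T A v_k / (v_k^T A v_k) as in the text."""
    x = np.asarray(x0, dtype=float)
    g = A @ x + b
    v = g.copy()                       # the text's choice v0 = grad J(x0)
    for _ in range(len(b)):            # at most n steps for an n x n SPD system
        h = -(v @ g) / (v @ A @ v)     # exact line search along v
        x = x + h * v
        g = A @ x + b
        if np.linalg.norm(g) < tol:    # converged: gradient is zero
            break
        beta = (g @ A @ v) / (v @ A @ v)
        v = -g + beta * v              # next A-conjugate direction
    return x

x2 = conjugate_gradient(A, b, [0.0, 0.0])   # [1, -3/2] after 2 steps
```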

5.3.4 Fourth method, Conjugate gradient using Fletcher-Reeves

In this method \[ \beta _{k}=\frac{\nabla J\left ( x^{k+1}\right ) ^{T}\nabla J\left ( x^{k+1}\right ) }{\nabla J\left ( x^{k}\right ) ^{T}\nabla J\left ( x^{k}\right ) }=\frac{\left \Vert \nabla J\left ( x^{k+1}\right ) \right \Vert ^{2}}{\left \Vert \nabla J\left ( x^{k}\right ) \right \Vert ^{2}}\] The Fletcher-Reeves formula assumes the standard starting direction, the steepest-descent direction \(v^{0}=-\nabla J\left ( x^{0}\right ) =\begin{bmatrix} 1\\ -1 \end{bmatrix} \). \[ h_{0}=\frac{-\left ( v^{0}\right ) ^{T}\nabla J\left ( x^{0}\right ) }{\left ( v^{0}\right ) ^{T}Av^{0}}=\frac{-\left ( v^{0}\right ) ^{T}\left ( Ax^{0}+b\right ) }{\left ( v^{0}\right ) ^{T}Av^{0}}=\frac{-\begin{bmatrix} 1 & -1 \end{bmatrix}\begin{bmatrix} -1\\ 1 \end{bmatrix} }{\begin{bmatrix} 1 & -1 \end{bmatrix}\begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} 1\\ -1 \end{bmatrix} }=\frac{2}{2}=1 \] Hence\begin{align*} x^{1} & =x^{0}+h_{0}v^{0}\\ & =\begin{bmatrix} 0\\ 0 \end{bmatrix} +1\begin{bmatrix} 1\\ -1 \end{bmatrix} =\begin{bmatrix} 1\\ -1 \end{bmatrix} \end{align*}

Now we find the mutually conjugate \(v^{1}\) using the Fletcher-Reeves formula \[ \beta _{0}=\frac{\left \Vert \nabla J\left ( x^{1}\right ) \right \Vert ^{2}}{\left \Vert \nabla J\left ( x^{0}\right ) \right \Vert ^{2}}=\frac{\left \Vert Ax^{1}+b\right \Vert ^{2}}{\left \Vert Ax^{0}+b\right \Vert ^{2}}=\frac{\left \Vert \begin{bmatrix} 1\\ 1 \end{bmatrix} \right \Vert ^{2}}{\left \Vert \begin{bmatrix} -1\\ 1 \end{bmatrix} \right \Vert ^{2}}=\frac{\left ( \sqrt{2}\right ) ^{2}}{\left ( \sqrt{2}\right ) ^{2}}=1 \] \begin{align*} v^{1} & =-\nabla J\left ( x^{1}\right ) +\beta _{0}v^{0}\\ & =-\left ( \begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} 1\\ -1 \end{bmatrix} +\begin{bmatrix} -1\\ 1 \end{bmatrix} \right ) +\left ( 1\right ) \begin{bmatrix} 1\\ -1 \end{bmatrix} \\ & =\begin{bmatrix} 0\\ -2 \end{bmatrix} \end{align*}

Now that we found \(v^{1}\), we repeat the process. \[ h_{1}=\frac{-\left ( v^{1}\right ) ^{T}\nabla J\left ( x^{1}\right ) }{\left ( v^{1}\right ) ^{T}Av^{1}}=\frac{-\left ( v^{1}\right ) ^{T}\left ( Ax^{1}+b\right ) }{\left ( v^{1}\right ) ^{T}Av^{1}}=\frac{-\begin{bmatrix} 0 & -2 \end{bmatrix} \left ( \begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} 1\\ -1 \end{bmatrix} +\begin{bmatrix} -1\\ 1 \end{bmatrix} \right ) }{\begin{bmatrix} 0 & -2 \end{bmatrix}\begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} 0\\ -2 \end{bmatrix} }=\frac{2}{8}=\frac{1}{4}\] Hence\begin{align*} x^{2} & =x^{1}+h_{1}v^{1}\\ & =\begin{bmatrix} 1\\ -1 \end{bmatrix} +\left ( \frac{1}{4}\right ) \begin{bmatrix} 0\\ -2 \end{bmatrix} \\ & =\begin{bmatrix} 1\\ -\frac{3}{2}\end{bmatrix} \end{align*}

This is the same result as the conjugate direction method, again converging in \(2\) steps.
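Only the \(\beta \) update changes relative to the previous routine. A Fletcher-Reeves sketch (assuming NumPy, starting from the steepest-descent direction \(v^{0}=-\nabla J\left ( x^{0}\right ) \); names are mine):

```python
import numpy as np

A = np.array([[4.0, 2.0],
              [2.0, 2.0]])
b = np.array([-1.0, 1.0])

def conjugate_gradient_fr(A, b, x0, tol=1e-12):
    """CG with Fletcher-Reeves: beta_k = ||g_{k+1}||^2 / ||g_k||^2."""
    x = np.asarray(x0, dtype=float)
    g = A @ x + b
    v = -g                             # steepest-descent start
    for _ in range(len(b)):
        h = -(v @ g) / (v @ A @ v)     # exact line search along v
        x = x + h * v
        g_new = A @ x + b
        if np.linalg.norm(g_new) < tol:
            break
        beta = (g_new @ g_new) / (g @ g)   # Fletcher-Reeves formula
        v = -g_new + beta * v
        g = g_new
    return x

x2 = conjugate_gradient_fr(A, b, [0.0, 0.0])   # [1, -3/2]
```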

5.3.5 Fifth method, Conjugate gradient using Polak-Ribière

In this method \[ \beta _{k}=\frac{\nabla J\left ( x^{k+1}\right ) ^{T}\left ( \nabla J\left ( x^{k+1}\right ) -\nabla J\left ( x^{k}\right ) \right ) }{\nabla J\left ( x^{k}\right ) ^{T}\nabla J\left ( x^{k}\right ) }\] As with Fletcher-Reeves, we start with the steepest-descent direction \(v^{0}=-\nabla J\left ( x^{0}\right ) =\begin{bmatrix} 1\\ -1 \end{bmatrix} \). \[ h_{0}=\frac{-\left ( v^{0}\right ) ^{T}\nabla J\left ( x^{0}\right ) }{\left ( v^{0}\right ) ^{T}Av^{0}}=\frac{-\left ( v^{0}\right ) ^{T}\left ( Ax^{0}+b\right ) }{\left ( v^{0}\right ) ^{T}Av^{0}}=\frac{-\begin{bmatrix} 1 & -1 \end{bmatrix}\begin{bmatrix} -1\\ 1 \end{bmatrix} }{\begin{bmatrix} 1 & -1 \end{bmatrix}\begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} 1\\ -1 \end{bmatrix} }=\frac{2}{2}=1 \] Hence\begin{align*} x^{1} & =x^{0}+h_{0}v^{0}\\ & =\begin{bmatrix} 0\\ 0 \end{bmatrix} +1\begin{bmatrix} 1\\ -1 \end{bmatrix} =\begin{bmatrix} 1\\ -1 \end{bmatrix} \end{align*}

Now we find the mutually conjugate direction \(v^{1}\) using the Polak-Ribière formula \[ \beta _{0}=\frac{\nabla J\left ( x^{1}\right ) ^{T}\left ( \nabla J\left ( x^{1}\right ) -\nabla J\left ( x^{0}\right ) \right ) }{\nabla J\left ( x^{0}\right ) ^{T}\nabla J\left ( x^{0}\right ) }\]

But

\begin{align*} \nabla J\left ( x^{1}\right ) & =\begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} 1\\ -1 \end{bmatrix} +\begin{bmatrix} -1\\ 1 \end{bmatrix} =\begin{bmatrix} 1\\ 1 \end{bmatrix} \\ \nabla J\left ( x^{0}\right ) & =\begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} 0\\ 0 \end{bmatrix} +\begin{bmatrix} -1\\ 1 \end{bmatrix} =\begin{bmatrix} -1\\ 1 \end{bmatrix} \end{align*}

Hence

\[ \beta _{0}=\frac{\begin{bmatrix} 1 & 1 \end{bmatrix} \left ( \begin{bmatrix} 1\\ 1 \end{bmatrix} -\begin{bmatrix} -1\\ 1 \end{bmatrix} \right ) }{\begin{bmatrix} -1 & 1 \end{bmatrix}\begin{bmatrix} -1\\ 1 \end{bmatrix} }=\frac{2}{2}=1 \] Hence

\begin{align*} v^{1} & =-\nabla J\left ( x^{1}\right ) +\beta _{0}v^{0}\\ & =-\left ( \begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} 1\\ -1 \end{bmatrix} +\begin{bmatrix} -1\\ 1 \end{bmatrix} \right ) +\left ( 1\right ) \begin{bmatrix} 1\\ -1 \end{bmatrix} \\ & =\begin{bmatrix} 0\\ -2 \end{bmatrix} \end{align*}

Now that we found \(v^{1}\), we repeat the process. \[ h_{1}=\frac{-\left ( v^{1}\right ) ^{T}\nabla J\left ( x^{1}\right ) }{\left ( v^{1}\right ) ^{T}Av^{1}}=\frac{-\left ( v^{1}\right ) ^{T}\left ( Ax^{1}+b\right ) }{\left ( v^{1}\right ) ^{T}Av^{1}}=\frac{-\begin{bmatrix} 0 & -2 \end{bmatrix} \left ( \begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} 1\\ -1 \end{bmatrix} +\begin{bmatrix} -1\\ 1 \end{bmatrix} \right ) }{\begin{bmatrix} 0 & -2 \end{bmatrix}\begin{bmatrix} 4 & 2\\ 2 & 2 \end{bmatrix}\begin{bmatrix} 0\\ -2 \end{bmatrix} }=\frac{2}{8}=\frac{1}{4}\] Hence\begin{align*} x^{2} & =x^{1}+h_{1}v^{1}\\ & =\begin{bmatrix} 1\\ -1 \end{bmatrix} +\left ( \frac{1}{4}\right ) \begin{bmatrix} 0\\ -2 \end{bmatrix} \\ & =\begin{bmatrix} 1\\ -\frac{3}{2}\end{bmatrix} \end{align*}

This is the same result as the conjugate direction method, again converging in \(2\) steps as expected.
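The Polak-Ribière variant differs from Fletcher-Reeves only in the numerator of \(\beta \). A sketch (assuming NumPy, with the same steepest-descent start; names are mine):

```python
import numpy as np

A = np.array([[4.0, 2.0],
              [2.0, 2.0]])
b = np.array([-1.0, 1.0])

def conjugate_gradient_pr(A, b, x0, tol=1e-12):
    """CG with Polak-Ribiere: beta_k = g_{k+1}^T (g_{k+1} - g_k) / ||g_k||^2."""
    x = np.asarray(x0, dtype=float)
    g = A @ x + b
    v = -g                             # steepest-descent start
    for _ in range(len(b)):
        h = -(v @ g) / (v @ A @ v)     # exact line search along v
        x = x + h * v
        g_new = A @ x + b
        if np.linalg.norm(g_new) < tol:
            break
        beta = (g_new @ (g_new - g)) / (g @ g)   # Polak-Ribiere formula
        v = -g_new + beta * v
        g = g_new
    return x

x2 = conjugate_gradient_pr(A, b, [0.0, 0.0])   # [1, -3/2]
```

For a quadratic \(J\) with exact line searches, Polak-Ribière and Fletcher-Reeves produce the same directions, as the identical hand computations above illustrate; they differ only on non-quadratic objectives.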