2.4 HW 3

  2.4.1 Residual error animation
  2.4.2 Density plot animations of the solvers for problem 1
  2.4.3 Animations of the iterative solution
  2.4.4 Problem 1
  2.4.5 Problem 2
  2.4.6 Problem 3
  2.4.7 Problem 4
  2.4.8 References
  2.4.9 Source code

2.4.1 Residual error animation

   2.4.1.1 SOR with \(h=2^{-7}\)
   2.4.1.2 SOR with \(h=2^{-5}\)

Animations of the residual error \(R\) as it changes during the iterative solution. The tolerance used for these is \(h^2\); the stopping criterion is relative residual \(<\) tolerance. Only SOR was animated.

2.4.1.1 SOR with \(h=2^{-7}\)

These animations are large in size. (Click on any image to see it at actual size; it will open in a new window.)





pict

pict

pict

pict

The first animation above shows how the residual changes when using SOR with \(h=2^{-7}\); its z-axis range is fixed between \(-6h\) and \(6h\) (\(h\) = grid spacing). The second is the same, but its z-axis is Automatic, so it follows the full range of the residual error as it changes. The third and fourth animations show the same pair of views.





2.4.1.2 SOR with \(h=2^{-5}\)

These animations are smaller in size. (Click on any image to see it at actual size; it will open in a new window.)





pict

pict

pict

pict

The first animation above shows how the residual changes when using SOR with \(h=2^{-5}\); its z-axis range is fixed between \(-6h\) and \(6h\) (\(h\) = grid spacing). The second is the same, but its z-axis is Automatic, so it follows the full range of the residual error as it changes. The third and fourth animations show the same pair of views.





2.4.2 Density plot animations of the solvers for problem 1

The animation plots below show each solver as it updates the grid point by point in its main loop.

pict

The above shows each of the solvers (Jacobi, Gauss-Seidel, SOR) in the process of updating the grid during one iteration of the main loop.

Notice how the GS and SOR solvers update the solution immediately (left-to-right, bottom-to-top numbering is used), while the Jacobi solver updates the solution only at the end of each iterative step.

The one below was done for \(h=2^{-3}\). Clicking on it shows the animation.

pict

2.4.3 Animations of the iterative solution

Animations of the iterative solution (click on an image to see the animation). The stopping criterion used is the relative residual, with tolerance \(h^2\).




Jacobi: pict

Gauss-Seidel: pict

SOR: pict


2.4.4 Problem 1

   2.4.4.1 Finding the iteration matrix for the different solvers
   2.4.4.2 Tolerance used
   2.4.4.3 Stopping criteria
   2.4.4.4 \(\omega \) used for SOR
   2.4.4.5 Algorithm details
   2.4.4.6 Structure of \(Au=f\)
   2.4.4.7 Result of computation
   2.4.4.8 Conclusions and summary

Problem statement

pict
Figure 2.16: Problem 1

Answer

Using the method of splitting, the iteration matrix \(T\) is found, for each method, for solving \(Au=f\).

Let \(A=M-N\), then \(Au=f\) becomes\begin{align} (M-N)u & = f\nonumber \\ Mu & = N u + f\nonumber \\ u^{[k+1]} & = M^{-1}N u^{[k]} + M^{-1} f \tag{1} \end{align}

2.4.4.1 Finding the iteration matrix for the different solvers

The Jacobi method

For the Jacobi method \(M=D\) and \(N=L+U\), where \(D\) is the diagonal of \(A\), \(L\) is the negative of the strictly lower triangular part of \(A\), and \(U\) is the negative of the strictly upper triangular part of \(A\). Hence (1) becomes\begin{align*} u^{[k+1]} & = D^{-1} (L+U) u^{[k]} + D^{-1} f\\ u^{[k+1]} & = T u^{[k]} + C \end{align*}

where \(T\) is called the Jacobi iteration matrix. Since \(A = D-L-U\), it follows that \(L+U=D-A\), and therefore the iteration matrix \(T\) can be written as\begin{align*} T & = D^{-1} (D-A)\\ & = I - D^{-1} A \end{align*}

or\begin{align*} T & = (I - D^{-1}A )\\ C & = D^{-1} f \end{align*}

The Gauss-Seidel method

For Gauss-Seidel, \(M=D-L\) and \(N=U\), hence (1) becomes\[ u^{[k+1]}=\left ( D-L\right ) ^{-1}U\ u^{[k]}+\left ( D-L\right ) ^{-1}f \] or\[ u^{[k+1]}=Tu^{[k]}+C \] where \begin{align*} T & =\left ( D-L\right ) ^{-1}U\\ C & =\left ( D-L\right ) ^{-1}f \end{align*}

The SOR method

For SOR, \(M=\frac{1}{\omega }\left ( D-\omega L\right ) \) and \(N=\frac{1}{\omega }\left ( \left ( 1-\omega \right ) D+\omega U\right ) \). Hence (1) becomes\begin{align*} u^{[k+1]} & =\left [ \frac{1}{\omega }\left ( D-\omega L\right ) \right ] ^{-1}\left [ \frac{1}{\omega }\left ( \left ( 1-\omega \right ) D+\omega U\right ) \right ] \ u^{[k]}+\left [ \frac{1}{\omega }\left ( D-\omega L\right ) \right ] ^{-1}f\\ & =\left ( D-\omega L\right ) ^{-1}\left ( \left ( 1-\omega \right ) D+\omega U\right ) u^{[k]}+\omega \left ( D-\omega L\right ) ^{-1}f\\ & =Tu^{\left [ k\right ] }+C \end{align*}

where \begin{align*} T & =\left ( D-\omega L\right ) ^{-1}\left ( \left ( 1-\omega \right ) D+\omega U\right ) \\ C & =\omega \left ( D-\omega L\right ) ^{-1}f \end{align*}

Summary of iteration matrices used

This table summarizes the expressions for the iteration matrix \(T\) and the matrix \(C\) in the equation \(u^{\left [ k+1\right ] }=Tu^{\left [ k\right ] }+C\) for the different methods used.

method | \(T\) | \(C\)
Jacobi | \(I-D^{-1}A\) | \(D^{-1}f\)
GS | \(\left ( D-L\right ) ^{-1}U\) | \(\left ( D-L\right ) ^{-1}f\)
SOR | \(\left ( D-\omega L\right ) ^{-1}\left ( \left ( 1-\omega \right ) D+\omega U\right ) \) | \(\omega \left ( D-\omega L\right ) ^{-1}f\)
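As a quick numerical sanity check of these expressions (a minimal Matlab sketch, not part of the HW code; the grid size n and the \(\omega \) formula are the ones used later in this problem), the three iteration matrices can be formed explicitly for a small \(A\) and their spectral radii verified to be below one:

    % Build the 5-point Laplacian for n unknowns per dimension (the 1/h^2
    % factor is omitted, since it cancels inside every iteration matrix)
    n  = 7;
    e  = ones(n,1);
    L1 = spdiags([e -2*e e], -1:1, n, n);
    A  = kron(speye(n), L1) + kron(L1, speye(n));

    D  = spdiags(diag(A), 0, n^2, n^2);          % diagonal part of A
    L  = -tril(A,-1);                            % negative strictly lower part
    U  = -triu(A, 1);                            % negative strictly upper part
    w  = 2/(1 + sqrt(1 - cos(pi/(n+1))^2));      % omega_opt, as derived below

    rho = @(T) max(abs(eig(full(T))));           % spectral radius
    Tj  = speye(n^2) - D\A;                      % Jacobi:       I - D^-1 A
    Tgs = (D - L)\U;                             % Gauss-Seidel: (D-L)^-1 U
    Ts  = (D - w*L)\((1-w)*D + w*U);             % SOR
    fprintf('rho: Jacobi %.4f, GS %.4f, SOR %.4f\n', rho(Tj), rho(Tgs), rho(Ts));

For \(n=7\) (i.e. \(h=2^{-3}\)) this prints \(\cos \pi h\approx 0.924\) for Jacobi, its square for Gauss-Seidel, and a noticeably smaller value for SOR, consistent with the iteration counts reported below.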
The discretized algebraic equation resulting from approximating \(\frac{\partial ^{2}u}{\partial x^{2}}+\frac{\partial ^{2}u}{\partial y^{2}}=f\left ( x,y\right ) \) with Dirichlet boundary conditions is based on the standard 5-point Laplacian \[ \frac{\partial ^{2}u}{\partial x^{2}}+\frac{\partial ^{2}u}{\partial y^{2}}\approx \frac{1}{h^{2}}\left ( U_{i-1,j}+U_{i+1,j}+U_{i,j-1}+U_{i,j+1}-4U_{i,j}\right ) \] which has local truncation error \(O\left ( h^{2}\right ) \).

The notation used above is based on the following grid layout.

pict
Figure 2.17: Grid notation

The derivation of the above formula is as follows: Consider a grid where the mesh spacing in the x-direction is \(h_{x}\) and in the y-direction is \(h_{y}\). Then \(\frac{\partial ^{2}u}{\partial x^{2}}\) is approximated, at position \(\left ( i,j\right ) \) by \(\frac{U_{i-1,j}-2U_{i,j}+U_{i+1,j}}{h_{x}^{2}}\) and \(\frac{\partial ^{2}u}{\partial y^{2}}\) is approximated, at position \(\left ( i,j\right ) \), by \(\frac{U_{i,j-1}-2U_{i,j}+U_{i,j+1}}{h_{y}^{2}}\) therefore \(\frac{\partial ^{2}u}{\partial x^{2}}+\frac{\partial ^{2}u}{\partial y^{2}}\approx \frac{U_{i-1,j}-2U_{i,j}+U_{i+1,j}}{h_{x}^{2}}+\frac{U_{i,j-1}-2U_{i,j}+U_{i,j+1}}{h_{y}^{2}}\), and since \(\frac{\partial ^{2}u}{\partial x^{2}}+\frac{\partial ^{2}u}{\partial y^{2}}=f_{i,j}\) at that position, this results in

\[ \frac{U_{i-1,j}-2U_{i,j}+U_{i+1,j}}{h_{x}^{2}}+\frac{U_{i,j-1}-2U_{i,j}+U_{i,j+1}}{h_{y}^{2}}=f_{i,j}\] Solving for \(U_{i,j}\) from the above gives\begin{align*} U_{i,j} & =\frac{1}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( U_{i-1,j}h_{y}^{2}+U_{i+1,j}h_{y}^{2}+U_{i,j-1}h_{x}^{2}+U_{i,j+1}h_{x}^{2}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{i,j}\\ & =\frac{1}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( \left ( U_{i-1,j}+U_{i+1,j}\right ) h_{y}^{2}+\left ( U_{i,j-1}+U_{i,j+1}\right ) h_{x}^{2}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{i,j} \end{align*}

The following table shows the formula for updating \(U_{i,j}\) using each of the 3 methods for the 2D case, under the assumption of uniform mesh spacing, i.e. when \(h_{x}=h_{y}=h\), which simplifies the above formula as shown below.



method | Formula for updating \(U_{i,j}\), uniform mesh
Jacobi | \(U_{i,j}^{\left [ k+1\right ] }=\frac{1}{4}\left ( U_{i-1,j}^{\left [ k\right ] }+U_{i+1,j}^{\left [ k\right ] }+U_{i,j-1}^{\left [ k\right ] }+U_{i,j+1}^{\left [ k\right ] }-h^{2}f_{i,j}\right ) \)
GS | \(U_{i,j}^{\left [ k+1\right ] }=\frac{1}{4}\left ( U_{i-1,j}^{\left [ k+1\right ] }+U_{i+1,j}^{\left [ k\right ] }+U_{i,j-1}^{\left [ k+1\right ] }+U_{i,j+1}^{\left [ k\right ] }-h^{2}f_{i,j}\right ) \)
SOR | \(U_{i,j}^{\left [ k+1\right ] }=\frac{\omega }{4}\left ( U_{i-1,j}^{\left [ k+1\right ] }+U_{i+1,j}^{\left [ k\right ] }+U_{i,j-1}^{\left [ k+1\right ] }+U_{i,j+1}^{\left [ k\right ] }-h^{2}f_{i,j}\right ) +\left ( 1-\omega \right ) U_{i,j}^{\left [ k\right ] }\)


Note that for SOR, the update formula can be derived from the Gauss-Seidel update as follows.

\begin{align*} U_{(gs)i,j}^{\left [ k+1\right ] } & =\frac{1}{4}\left ( U_{i-1,j}^{\left [ k+1\right ] }+U_{i+1,j}^{\left [ k\right ] }+U_{i,j-1}^{\left [ k+1\right ] }+U_{i,j+1}^{\left [ k\right ] }-h^{2}f_{i,j}\right ) \\ U_{(sor)i,j}^{\left [ k+1\right ] } & =U_{i,j}^{\left [ k\right ] }+\omega \left ( U_{(gs)i,j}^{\left [ k+1\right ] }-U_{i,j}^{\left [ k\right ] }\right ) \end{align*}

The second formula above can be simplified by substituting the first equation into it, to yield

\begin{align*} U_{(sor)i,j}^{\left [ k+1\right ] } & =U_{i,j}^{\left [ k\right ] }+\omega \left ( \frac{1}{4}\left ( U_{i-1,j}^{\left [ k+1\right ] }+U_{i+1,j}^{\left [ k\right ] }+U_{i,j-1}^{\left [ k+1\right ] }+U_{i,j+1}^{\left [ k\right ] }-h^{2}f_{i,j}\right ) -U_{i,j}^{\left [ k\right ] }\right ) \\ & =\frac{\omega }{4}\left ( U_{i-1,j}^{\left [ k+1\right ] }+U_{i+1,j}^{\left [ k\right ] }+U_{i,j-1}^{\left [ k+1\right ] }+U_{i,j+1}^{\left [ k\right ] }-h^{2}f_{i,j}\right ) +\left ( 1-\omega \right ) U_{i,j}^{\left [ k\right ] } \end{align*}

which is what is shown in the table above. Using the same procedure, but for the general (non-uniform mesh) case, results in\begin{align*} U_{(gs)i,j}^{\left [ k+1\right ] } & =\frac{1}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( U_{i-1,j}^{\left [ k+1\right ] }h_{y}^{2}+U_{i+1,j}^{\left [ k\right ] }h_{y}^{2}+U_{i,j-1}^{\left [ k+1\right ] }h_{x}^{2}+U_{i,j+1}^{\left [ k\right ] }h_{x}^{2}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{i,j}\\ U_{(sor)i,j}^{\left [ k+1\right ] } & =U_{i,j}^{\left [ k\right ] }+\omega \left ( U_{(gs)i,j}^{\left [ k+1\right ] }-U_{i,j}^{\left [ k\right ] }\right ) \end{align*}

Hence, again, the second equation above becomes\begin{align*} U_{(sor)i,j}^{\left [ k+1\right ] } & =U_{i,j}^{\left [ k\right ] }+\omega \left ( \frac{1}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( U_{i-1,j}^{\left [ k+1\right ] }h_{y}^{2}+U_{i+1,j}^{\left [ k\right ] }h_{y}^{2}+U_{i,j-1}^{\left [ k+1\right ] }h_{x}^{2}+U_{i,j+1}^{\left [ k\right ] }h_{x}^{2}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{i,j}-U_{i,j}^{\left [ k\right ] }\right ) \\ & =\omega \left [ \frac{1}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( U_{i-1,j}^{\left [ k+1\right ] }h_{y}^{2}+U_{i+1,j}^{\left [ k\right ] }h_{y}^{2}+U_{i,j-1}^{\left [ k+1\right ] }h_{x}^{2}+U_{i,j+1}^{\left [ k\right ] }h_{x}^{2}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{i,j}\right ] +\left ( 1-\omega \right ) U_{i,j}^{\left [ k\right ] } \end{align*}

Hence, for a non-uniform mesh, the update table becomes



method | Formula for updating \(U_{i,j}\), non-uniform mesh
Jacobi | \(U_{i,j}^{\left [ k+1\right ] }=\frac{1}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( \left ( U_{i-1,j}+U_{i+1,j}\right ) h_{y}^{2}+\left ( U_{i,j-1}+U_{i,j+1}\right ) h_{x}^{2}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{i,j}\)
GS | \(U_{i,j}^{\left [ k+1\right ] }=\frac{1}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( U_{i-1,j}^{\left [ k+1\right ] }h_{y}^{2}+U_{i+1,j}^{\left [ k\right ] }h_{y}^{2}+U_{i,j-1}^{\left [ k+1\right ] }h_{x}^{2}+U_{i,j+1}^{\left [ k\right ] }h_{x}^{2}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{i,j}\)
SOR | \(U_{i,j}^{\left [ k+1\right ] }=\omega \left [ \frac{1}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( U_{i-1,j}^{\left [ k+1\right ] }h_{y}^{2}+U_{i+1,j}^{\left [ k\right ] }h_{y}^{2}+U_{i,j-1}^{\left [ k+1\right ] }h_{x}^{2}+U_{i,j+1}^{\left [ k\right ] }h_{x}^{2}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{i,j}\right ] +\left ( 1-\omega \right ) U_{i,j}^{\left [ k\right ] }\)


The residual formula is \(R^{\left [ k\right ] }=f-Au^{[k]}\), which can be written using the above notations as\[ R_{i,j}=f_{i,j}-\left ( \frac{U_{i-1,j}-2U_{i,j}+U_{i+1,j}}{h_{x}^{2}}+\frac{U_{i,j-1}-2U_{i,j}+U_{i,j+1}}{h_{y}^{2}}\right ) \] and for a uniform mesh the above becomes\[ R_{i,j}=f_{i,j}-\frac{1}{h^{2}}\left ( U_{i-1,j}+U_{i+1,j}+U_{i,j-1}+U_{i,j+1}-4U_{i,j}\right ) \]
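In the code, the residual and the relative-residual test below are evaluated directly on the grid, without forming \(A\). A minimal Matlab sketch for the uniform-mesh case (assuming grids u and f that include the boundary rows and columns, spacing h, and tolerance tol):

    % residual R = f - A*u, computed grid-point by grid-point
    R = zeros(size(u));
    for i = 2:size(u,1)-1
        for j = 2:size(u,2)-1
            R(i,j) = f(i,j) - ( u(i-1,j) + u(i+1,j) ...
                              + u(i,j-1) + u(i,j+1) - 4*u(i,j) )/h^2;
        end
    end
    fi = f(2:end-1, 2:end-1);                    % interior forcing values
    converged = norm(R(:))/norm(fi(:)) < tol;    % grid-norm h factors cancel in the ratio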

2.4.4.2 Tolerance used

The tolerance \(\epsilon \) used is based on the global error in the numerical solution. For this problem \(\left \Vert e\right \Vert =Ch^{2}\) where \(C\) is a constant. Two different values for the constant were tried in the implementation: \(0.1\) and \(1\).

2.4.4.3 Stopping criteria

The stopping criterion used was based on the relative residual. Given\[ R^{\left [ k\right ] }=f-Au^{[k]}\] the iterative process was stopped when the following condition was satisfied \[ \frac{\left \Vert R^{\left [ k\right ] }\right \Vert }{\left \Vert f\right \Vert }<\epsilon \] Other possible stopping criteria (not used here) are

1. Absolute error: convergence achieved when \(\left \Vert u^{\left [ k+1\right ] }-u^{\left [ k\right ] }\right \Vert <\epsilon \)
2. Relative error: convergence achieved when \(\frac{\left \Vert u^{\left [ k+1\right ] }-u^{\left [ k\right ] }\right \Vert }{\left \Vert u^{\left [ k\right ] }\right \Vert }<\epsilon \)

The above two criteria were not used, as they do not perform as well for Jacobi and Gauss-Seidel.

2.4.4.4 \(\omega \) used for SOR

The \(\omega _{opt}\) used is based on its relation to the spectral radius of the Jacobi iteration matrix. This relation was derived in class:\[ \omega _{opt}=\frac{2}{1+\sqrt{1-\rho _{Jacobi}^{2}}}\] where for the \(2D\) Poisson problem \(\rho _{Jacobi}=\cos \left ( \pi h\right ) \), with \(h\) the grid spacing. Using this in the above and approximating for small \(h\) results in\[ \omega _{opt}\approx 2\left ( 1-\pi h\right ) \] For the \(h\) values given in the problem, the following values were used for \(\omega _{opt}\)



\(h_{x}=h_{y}=h\) | \(\omega _{opt}\)
\(2^{-3}\) | \(1.2146\)
\(2^{-4}\) | \(1.6073\)
\(2^{-5}\) | \(1.8037\)
\(2^{-6}\) | \(1.9018\)
\(2^{-7}\) | \(1.9509\)

Notice that the above approximation formula is valid for small \(h\) and will not result in good convergence for SOR if used for large \(h\).

Update 12/29/2010: after more analysis, the following formula was found to produce better results.

\begin{align*} t & =cos(\pi \ h_{x})+cos(\pi h_{y})\\ poly(x) & =x^{2}t^{2}-16x+16 \end{align*}

Solve the above polynomial for \(x\) and take the smaller root; this will be \(\omega _{opt}\). Using this results in



\(h_{x}=h_{y}=h\) | \(\omega _{opt}\)
\(2^{-3}\) | \(1.4464\)
\(2^{-4}\) | \(1.6763\)
\(2^{-5}\) | \(1.821\)
\(2^{-6}\) | \(1.9064\)
\(2^{-7}\) | \(1.9509\)


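A small Matlab sketch (for the uniform-mesh case \(h_{x}=h_{y}=h\)) that reproduces both tables:

    for p = 3:7
        h    = 2^(-p);
        rhoJ = cos(pi*h);                     % spectral radius of the Jacobi iteration matrix
        w1   = 2*(1 - pi*h);                  % small-h approximation
        t    = cos(pi*h) + cos(pi*h);         % hx = hy = h
        w2   = min(roots([t^2, -16, 16]));    % smaller root of x^2 t^2 - 16 x + 16 = 0
        w3   = 2/(1 + sqrt(1 - rhoJ^2));      % formula derived in class
        fprintf('h=2^-%d:  approx %.4f   poly %.4f   exact %.4f\n', p, w1, w2, w3);
    end

For the uniform mesh the last two columns coincide: substituting \(t=2\cos \pi h\) into the polynomial and taking the smaller root simplifies to \(2/(1+\sin \pi h)\), which is exactly \(2/(1+\sqrt{1-\rho _{Jacobi}^{2}})\). The gain over the first table is therefore only in dropping the small-\(h\) approximation.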
2.4.4.5 Algorithm details

In this section, some details of the implementation are described for reference. The matrix \(A\) is not used directly in the iterative methods, but its structure is briefly described.

In the discussion below, updating the grid is described for the Jacobi method, showing how the new values of the dependent variable \(u\) are calculated.

Numbering system and grid updating

The numbering system used for the grid is the one described in class. The indices for the unknown \(u_{i,j}\) are numbered row-wise: left to right, bottom to top. This follows the standard Cartesian coordinate system. The reason for this is to allow the use of the standard formula for the 5-point Laplacian. The following diagram, borrowed from the textbook (figure 5, page 61, "Finite Difference Methods for Ordinary and Partial Differential Equations" by Randall J. LeVeque), illustrates the 5-point Laplacian.

pict
Figure 2.18: 5-point stencil for the Laplacian at \(x_{i,j}\)

Lower case \(n\) is used to indicate the number of unknowns along one dimension, and upper case \(N\) is used to indicate the total number of unknowns.

pict
Figure 2.19: Stencil move

In the diagram above, \(n=3\) is the number of unknowns in each row or column, and since there are \(3\) internal rows, there are \(9\) unknowns, all located on internal grid points. There are a total of \(25\) grid points, \(16\) of which are on the boundaries and \(9\) internal.

To update the grid, the 5 point Laplacian is centered at each internal grid point and then moved left to right, bottom to top, each time an updated value of the unknown is generated and stored in an auxiliary grid.

In the Jacobi method, the new value at location \(x_{i,j}\) is not used in the calculation to update its neighbors as the stencil is moved. Only when the whole grid has been swept by the Laplacian is the grid updated, by copying the contents of the auxiliary grid into it.

In Gauss-Seidel and SOR, this is not the case. Once a grid point has been updated, its new value is used immediately in the update of its neighbors as the Laplacian sweeps the grid. No auxiliary grid is required; in other words, the updates happen 'in place'.

Continuing this example, the following diagram shows how each grid point is updated. (In a parallel computation, these operations can all be done at once for the Jacobi solver, but not for the Gauss-Seidel or SOR solvers, unless a different numbering scheme is used, such as red-black numbering.)

pict
Figure 2.20: Stencil move 2

Now that the grid has been updated once, the process is repeated again. This process is continued until convergence is achieved.

2.4.4.6 Structure of \(Au=f\)

The structure of the \(Au=f\) system matrix is now described. As an example, for \(9\) unknowns, the following characteristics of the matrix \(A\) can be seen.

pict
Figure 2.21: Structure of \(A\)

Structure of the \(A\) matrix for the elliptic 2D PDE with Dirichlet boundary conditions on a non-uniform mesh

This section was added at a later time for completeness and is not required for the HW. Below, the form of the \(A\) matrix is derived for a non-uniform mesh (meaning \(h_{x}\) is not necessarily the same as \(h_{y}\)), and also for the case where the number of grid points in the x-direction differs from the number in the y-direction.

To ease the derivation, we will use a \(5\times 5\) grid as an example; this generates \(9\) equations, as there are \(9\) unknowns. From this derivation we will be able to see the form of the \(A\) matrix.

In addition, in this derivation, we will use \(i\) to represent row number, and \(j\) to represent column number, as this more closely matches the convention.

pict
Figure 2.22: New grid

The unknowns are shown above in the circles. From earlier, we found that the discretization for the elliptic PDE at a node is\[ U_{i,j}=\frac{1}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( U_{i,j-1}h_{y}^{2}+U_{i,j+1}h_{y}^{2}+U_{i-1,j}h_{x}^{2}+U_{i+1,j}h_{x}^{2}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{i,j}\] We will now write the 9 equations for each node, starting with \(\left ( 2,2\right ) \) to node \(\left ( 4,4\right ) \)\begin{align*} U_{2,2} & =\frac{1}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( U_{2,1}h_{y}^{2}+U_{2,3}h_{y}^{2}+U_{1,2}h_{x}^{2}+U_{3,2}h_{x}^{2}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{2,2}\\ U_{2,3} & =\frac{1}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( U_{2,2}h_{y}^{2}+U_{2,4}h_{y}^{2}+U_{1,3}h_{x}^{2}+U_{3,3}h_{x}^{2}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{2,3}\\ U_{2,4} & =\frac{1}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( U_{2,3}h_{y}^{2}+U_{2,5}h_{y}^{2}+U_{1,4}h_{x}^{2}+U_{3,4}h_{x}^{2}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{2,4}\\ U_{3,2} & =\frac{1}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( U_{3,1}h_{y}^{2}+U_{3,3}h_{y}^{2}+U_{2,2}h_{x}^{2}+U_{4,2}h_{x}^{2}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{3,2}\\ U_{3,3} & =\frac{1}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( U_{3,2}h_{y}^{2}+U_{3,4}h_{y}^{2}+U_{2,3}h_{x}^{2}+U_{4,3}h_{x}^{2}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{3,3}\\ U_{3,4} & =\frac{1}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( U_{3,3}h_{y}^{2}+U_{3,5}h_{y}^{2}+U_{2,4}h_{x}^{2}+U_{4,4}h_{x}^{2}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{3,4}\\ U_{4,2} & =\frac{1}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( U_{4,1}h_{y}^{2}+U_{4,3}h_{y}^{2}+U_{3,2}h_{x}^{2}+U_{5,2}h_{x}^{2}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{4,2}\\ U_{4,3} & =\frac{1}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( U_{4,2}h_{y}^{2}+U_{4,4}h_{y}^{2}+U_{3,3}h_{x}^{2}+U_{5,3}h_{x}^{2}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{4,3}\\ U_{4,4} & =\frac{1}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( U_{4,3}h_{y}^{2}+U_{4,5}h_{y}^{2}+U_{3,4}h_{x}^{2}+U_{5,4}h_{x}^{2}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{4,4} \end{align*}

Now, moving the knowns to the right, results in\begin{align*} 2\left ( h_{x}^{2}+h_{y}^{2}\right ) U_{2,2}-U_{2,3}h_{y}^{2}-U_{3,2}h_{x}^{2} & =-h_{x}^{2}h_{y}^{2}f_{2,2}+U_{2,1}h_{y}^{2}+U_{1,2}h_{x}^{2}\\ 2\left ( h_{x}^{2}+h_{y}^{2}\right ) U_{2,3}-U_{2,2}h_{y}^{2}-U_{2,4}h_{y}^{2}-U_{3,3}h_{x}^{2} & =-h_{x}^{2}h_{y}^{2}f_{2,3}+U_{1,3}h_{x}^{2}\\ 2\left ( h_{x}^{2}+h_{y}^{2}\right ) U_{2,4}-U_{2,3}h_{y}^{2}-U_{3,4}h_{x}^{2} & =-h_{x}^{2}h_{y}^{2}f_{2,4}+U_{2,5}h_{y}^{2}+U_{1,4}h_{x}^{2}\\ 2\left ( h_{x}^{2}+h_{y}^{2}\right ) U_{3,2}-U_{3,3}h_{y}^{2}-U_{2,2}h_{x}^{2}-U_{4,2}h_{x}^{2} & =-h_{x}^{2}h_{y}^{2}f_{3,2}+U_{3,1}h_{y}^{2}\\ 2\left ( h_{x}^{2}+h_{y}^{2}\right ) U_{3,3}-U_{3,2}h_{y}^{2}-U_{3,4}h_{y}^{2}-U_{2,3}h_{x}^{2}-U_{4,3}h_{x}^{2} & =-h_{x}^{2}h_{y}^{2}f_{3,3}\\ 2\left ( h_{x}^{2}+h_{y}^{2}\right ) U_{3,4}-U_{3,3}h_{y}^{2}-U_{2,4}h_{x}^{2}-U_{4,4}h_{x}^{2} & =-h_{x}^{2}h_{y}^{2}f_{3,4}+U_{3,5}h_{y}^{2}\\ 2\left ( h_{x}^{2}+h_{y}^{2}\right ) U_{4,2}-U_{4,3}h_{y}^{2}-U_{3,2}h_{x}^{2} & =-h_{x}^{2}h_{y}^{2}f_{4,2}+U_{4,1}h_{y}^{2}+U_{5,2}h_{x}^{2}\\ 2\left ( h_{x}^{2}+h_{y}^{2}\right ) U_{4,3}-U_{4,2}h_{y}^{2}-U_{4,4}h_{y}^{2}-U_{3,3}h_{x}^{2} & =-h_{x}^{2}h_{y}^{2}f_{4,3}+U_{5,3}h_{x}^{2}\\ 2\left ( h_{x}^{2}+h_{y}^{2}\right ) U_{4,4}-U_{4,3}h_{y}^{2}-U_{3,4}h_{x}^{2} & =-h_{x}^{2}h_{y}^{2}f_{4,4}+U_{4,5}h_{y}^{2}+U_{5,4}h_{x}^{2} \end{align*}

Now we can write \(Ax=b\), letting \(\beta =2\left ( h_{x}^{2}+h_{y}^{2}\right ) \), as\[\begin{pmatrix} \beta & -h_{y}^{2} & 0 & -h_{x}^{2} & 0 & 0 & 0 & 0 & 0\\ -h_{y}^{2} & \beta & -h_{y}^{2} & 0 & -h_{x}^{2} & 0 & 0 & 0 & 0\\ 0 & -h_{y}^{2} & \beta & 0 & 0 & -h_{x}^{2} & 0 & 0 & 0\\ -h_{x}^{2} & 0 & 0 & \beta & -h_{y}^{2} & 0 & -h_{x}^{2} & 0 & 0\\ 0 & -h_{x}^{2} & 0 & -h_{y}^{2} & \beta & -h_{y}^{2} & 0 & -h_{x}^{2} & 0\\ 0 & 0 & -h_{x}^{2} & 0 & -h_{y}^{2} & \beta & 0 & 0 & -h_{x}^{2}\\ 0 & 0 & 0 & -h_{x}^{2} & 0 & 0 & \beta & -h_{y}^{2} & 0\\ 0 & 0 & 0 & 0 & -h_{x}^{2} & 0 & -h_{y}^{2} & \beta & -h_{y}^{2}\\ 0 & 0 & 0 & 0 & 0 & -h_{x}^{2} & 0 & -h_{y}^{2} & \beta \end{pmatrix}\begin{pmatrix} U_{22}\\ U_{23}\\ U_{24}\\ U_{32}\\ U_{33}\\ U_{34}\\ U_{42}\\ U_{43}\\ U_{44}\end{pmatrix} =\begin{pmatrix} -h_{x}^{2}h_{y}^{2}f_{2,2}+U_{2,1}h_{y}^{2}+U_{1,2}h_{x}^{2}\\ -h_{x}^{2}h_{y}^{2}f_{2,3}+U_{1,3}h_{x}^{2}\\ -h_{x}^{2}h_{y}^{2}f_{2,4}+U_{2,5}h_{y}^{2}+U_{1,4}h_{x}^{2}\\ -h_{x}^{2}h_{y}^{2}f_{3,2}+U_{3,1}h_{y}^{2}\\ -h_{x}^{2}h_{y}^{2}f_{3,3}\\ -h_{x}^{2}h_{y}^{2}f_{3,4}+U_{3,5}h_{y}^{2}\\ -h_{x}^{2}h_{y}^{2}f_{4,2}+U_{4,1}h_{y}^{2}+U_{5,2}h_{x}^{2}\\ -h_{x}^{2}h_{y}^{2}f_{4,3}+U_{5,3}h_{x}^{2}\\ -h_{x}^{2}h_{y}^{2}f_{4,4}+U_{4,5}h_{y}^{2}+U_{5,4}h_{x}^{2}\end{pmatrix} \] Now we will do another case, \(n_{x}\neq n_{y}\).

pict
Figure 2.23: New grid 2

The unknowns are shown above in the circles. From earlier, we found that the discretization for the elliptic PDE at a node is\[ U_{i,j}=\frac{1}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( U_{i,j-1}h_{y}^{2}+U_{i,j+1}h_{y}^{2}+U_{i-1,j}h_{x}^{2}+U_{i+1,j}h_{x}^{2}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{i,j}\] We will now write the 12 equations, one for each node, from node \(\left ( 2,2\right ) \) to node \(\left ( 4,5\right ) \)\begin{align*} U_{2,2} & =\frac{1}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( U_{2,1}h_{y}^{2}+U_{2,3}h_{y}^{2}+U_{1,2}h_{x}^{2}+U_{3,2}h_{x}^{2}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{2,2}\\ U_{2,3} & =\frac{1}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( U_{2,2}h_{y}^{2}+U_{2,4}h_{y}^{2}+U_{1,3}h_{x}^{2}+U_{3,3}h_{x}^{2}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{2,3}\\ U_{2,4} & =\frac{1}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( U_{2,3}h_{y}^{2}+U_{2,5}h_{y}^{2}+U_{1,4}h_{x}^{2}+U_{3,4}h_{x}^{2}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{2,4}\\ U_{2,5} & =\frac{1}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( U_{2,4}h_{y}^{2}+U_{2,6}h_{y}^{2}+U_{1,5}h_{x}^{2}+U_{3,5}h_{x}^{2}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{2,5}\\ U_{3,2} & =\frac{1}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( U_{3,1}h_{y}^{2}+U_{3,3}h_{y}^{2}+U_{2,2}h_{x}^{2}+U_{4,2}h_{x}^{2}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{3,2}\\ U_{3,3} & =\frac{1}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( U_{3,2}h_{y}^{2}+U_{3,4}h_{y}^{2}+U_{2,3}h_{x}^{2}+U_{4,3}h_{x}^{2}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{3,3}\\ U_{3,4} & =\frac{1}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( U_{3,3}h_{y}^{2}+U_{3,5}h_{y}^{2}+U_{2,4}h_{x}^{2}+U_{4,4}h_{x}^{2}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{3,4}\\ U_{3,5} & =\frac{1}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( U_{3,4}h_{y}^{2}+U_{3,6}h_{y}^{2}+U_{2,5}h_{x}^{2}+U_{4,5}h_{x}^{2}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{3,5}\\ U_{4,2} & =\frac{1}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( U_{4,1}h_{y}^{2}+U_{4,3}h_{y}^{2}+U_{3,2}h_{x}^{2}+U_{5,2}h_{x}^{2}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{4,2}\\ U_{4,3} & =\frac{1}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( U_{4,2}h_{y}^{2}+U_{4,4}h_{y}^{2}+U_{3,3}h_{x}^{2}+U_{5,3}h_{x}^{2}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{4,3}\\ U_{4,4} & =\frac{1}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( U_{4,3}h_{y}^{2}+U_{4,5}h_{y}^{2}+U_{3,4}h_{x}^{2}+U_{5,4}h_{x}^{2}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{4,4}\\ U_{4,5} & =\frac{1}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( U_{4,4}h_{y}^{2}+U_{4,6}h_{y}^{2}+U_{3,5}h_{x}^{2}+U_{5,5}h_{x}^{2}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{4,5} \end{align*}

Now, moving the knowns to the right, results in\begin{align*} 2\left ( h_{x}^{2}+h_{y}^{2}\right ) U_{2,2}-U_{2,3}h_{y}^{2}-U_{3,2}h_{x}^{2} & =-h_{x}^{2}h_{y}^{2}f_{2,2}+U_{2,1}h_{y}^{2}+U_{1,2}h_{x}^{2}\\ 2\left ( h_{x}^{2}+h_{y}^{2}\right ) U_{2,3}-U_{2,2}h_{y}^{2}-U_{2,4}h_{y}^{2}-U_{3,3}h_{x}^{2} & =-h_{x}^{2}h_{y}^{2}f_{2,3}+U_{1,3}h_{x}^{2}\\ 2\left ( h_{x}^{2}+h_{y}^{2}\right ) U_{2,4}-U_{2,3}h_{y}^{2}-U_{2,5}h_{y}^{2}-U_{3,4}h_{x}^{2} & =-h_{x}^{2}h_{y}^{2}f_{2,4}+U_{1,4}h_{x}^{2}\\ 2\left ( h_{x}^{2}+h_{y}^{2}\right ) U_{2,5}-U_{2,4}h_{y}^{2}-U_{3,5}h_{x}^{2} & =-h_{x}^{2}h_{y}^{2}f_{2,5}+U_{2,6}h_{y}^{2}+U_{1,5}h_{x}^{2}\\ 2\left ( h_{x}^{2}+h_{y}^{2}\right ) U_{3,2}-U_{3,3}h_{y}^{2}-U_{2,2}h_{x}^{2}-U_{4,2}h_{x}^{2} & =-h_{x}^{2}h_{y}^{2}f_{3,2}+U_{3,1}h_{y}^{2}\\ 2\left ( h_{x}^{2}+h_{y}^{2}\right ) U_{3,3}-U_{3,2}h_{y}^{2}-U_{3,4}h_{y}^{2}-U_{2,3}h_{x}^{2}-U_{4,3}h_{x}^{2} & =-h_{x}^{2}h_{y}^{2}f_{3,3}\\ 2\left ( h_{x}^{2}+h_{y}^{2}\right ) U_{3,4}-U_{3,3}h_{y}^{2}-U_{3,5}h_{y}^{2}-U_{2,4}h_{x}^{2}-U_{4,4}h_{x}^{2} & =-h_{x}^{2}h_{y}^{2}f_{3,4}\\ 2\left ( h_{x}^{2}+h_{y}^{2}\right ) U_{3,5}-U_{3,4}h_{y}^{2}-U_{2,5}h_{x}^{2}-U_{4,5}h_{x}^{2} & =-h_{x}^{2}h_{y}^{2}f_{3,5}+U_{3,6}h_{y}^{2}\\ 2\left ( h_{x}^{2}+h_{y}^{2}\right ) U_{4,2}-U_{4,3}h_{y}^{2}-U_{3,2}h_{x}^{2} & =-h_{x}^{2}h_{y}^{2}f_{4,2}+U_{4,1}h_{y}^{2}+U_{5,2}h_{x}^{2}\\ 2\left ( h_{x}^{2}+h_{y}^{2}\right ) U_{4,3}-U_{4,2}h_{y}^{2}-U_{4,4}h_{y}^{2}-U_{3,3}h_{x}^{2} & =-h_{x}^{2}h_{y}^{2}f_{4,3}+U_{5,3}h_{x}^{2}\\ 2\left ( h_{x}^{2}+h_{y}^{2}\right ) U_{4,4}-U_{4,5}h_{y}^{2}-U_{4,3}h_{y}^{2}-U_{3,4}h_{x}^{2} & =-h_{x}^{2}h_{y}^{2}f_{4,4}+U_{5,4}h_{x}^{2}\\ 2\left ( h_{x}^{2}+h_{y}^{2}\right ) U_{4,5}-U_{4,4}h_{y}^{2}-U_{3,5}h_{x}^{2} & =-h_{x}^{2}h_{y}^{2}f_{4,5}+U_{4,6}h_{y}^{2}+U_{5,5}h_{x}^{2} \end{align*}

Now we can write \(Ax=b\), letting \(\beta =2\left ( h_{x}^{2}+h_{y}^{2}\right ) \), as\[\begin{pmatrix} \beta & -h_{y}^{2} & 0 & 0 & -h_{x}^{2} & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ -h_{y}^{2} & \beta & -h_{y}^{2} & 0 & 0 & -h_{x}^{2} & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & -h_{y}^{2} & \beta & -h_{y}^{2} & 0 & 0 & -h_{x}^{2} & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & -h_{y}^{2} & \beta & 0 & 0 & 0 & -h_{x}^{2} & 0 & 0 & 0 & 0\\ -h_{x}^{2} & 0 & 0 & 0 & \beta & -h_{y}^{2} & 0 & 0 & -h_{x}^{2} & 0 & 0 & 0\\ 0 & -h_{x}^{2} & 0 & 0 & -h_{y}^{2} & \beta & -h_{y}^{2} & 0 & 0 & -h_{x}^{2} & 0 & 0\\ 0 & 0 & -h_{x}^{2} & 0 & 0 & -h_{y}^{2} & \beta & -h_{y}^{2} & 0 & 0 & -h_{x}^{2} & 0\\ 0 & 0 & 0 & -h_{x}^{2} & 0 & 0 & -h_{y}^{2} & \beta & 0 & 0 & 0 & -h_{x}^{2}\\ 0 & 0 & 0 & 0 & -h_{x}^{2} & 0 & 0 & 0 & \beta & -h_{y}^{2} & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & -h_{x}^{2} & 0 & 0 & -h_{y}^{2} & \beta & -h_{y}^{2} & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & -h_{x}^{2} & 0 & 0 & -h_{y}^{2} & \beta & -h_{y}^{2}\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & -h_{x}^{2} & 0 & 0 & -h_{y}^{2} & \beta \end{pmatrix}\begin{pmatrix} U_{22}\\ U_{23}\\ U_{24}\\ U_{25}\\ U_{32}\\ U_{33}\\ U_{34}\\ U_{35}\\ U_{42}\\ U_{43}\\ U_{44}\\ U_{45}\end{pmatrix} =\begin{pmatrix} -h_{x}^{2}h_{y}^{2}f_{2,2}+U_{2,1}h_{y}^{2}+U_{1,2}h_{x}^{2}\\ -h_{x}^{2}h_{y}^{2}f_{2,3}+U_{1,3}h_{x}^{2}\\ -h_{x}^{2}h_{y}^{2}f_{2,4}+U_{1,4}h_{x}^{2}\\ -h_{x}^{2}h_{y}^{2}f_{2,5}+U_{2,6}h_{y}^{2}+U_{1,5}h_{x}^{2}\\ -h_{x}^{2}h_{y}^{2}f_{3,2}+U_{3,1}h_{y}^{2}\\ -h_{x}^{2}h_{y}^{2}f_{3,3}\\ -h_{x}^{2}h_{y}^{2}f_{3,4}\\ -h_{x}^{2}h_{y}^{2}f_{3,5}+U_{3,6}h_{y}^{2}\\ -h_{x}^{2}h_{y}^{2}f_{4,2}+U_{4,1}h_{y}^{2}+U_{5,2}h_{x}^{2}\\ -h_{x}^{2}h_{y}^{2}f_{4,3}+U_{5,3}h_{x}^{2}\\ -h_{x}^{2}h_{y}^{2}f_{4,4}+U_{5,4}h_{x}^{2}\\ -h_{x}^{2}h_{y}^{2}f_{4,5}+U_{4,6}h_{y}^{2}+U_{5,5}h_{x}^{2}\end{pmatrix} \]

To create the above matrix in sparse storage, Matlab code along the following lines can be used.
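The original listing does not appear here; the following is a minimal sketch (a reconstruction, with placeholder sizes nx and ny) using the row-wise numbering described above:

    % Assemble the sparse A matrix for the 2D elliptic problem with Dirichlet
    % boundary conditions on a (possibly) non-uniform mesh; the x index
    % varies fastest in the unknown ordering.
    nx = 4;  ny = 3;                 % unknowns in x and y (placeholders)
    hx = 1/(nx+1);  hy = 1/(ny+1);   % mesh spacings
    ex = ones(nx,1);  ey = ones(ny,1);
    Dx = spdiags([-ex 2*ex -ex], -1:1, nx, nx);   % 1D second difference in x
    Dy = spdiags([-ey 2*ey -ey], -1:1, ny, ny);   % 1D second difference in y
    A  = kron(speye(ny), hy^2*Dx) + kron(hx^2*Dy, speye(nx));
    % diagonal entries are beta = 2*(hx^2+hy^2); x-neighbors get -hy^2 and
    % y-neighbors get -hx^2, matching the matrix written out above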

The A matrix in sparse storage for the elliptic 2D PDE with non-homogeneous Neumann boundary conditions

This section was added at a later time and is not required for this HW.

To be able to formulate the A matrix as a sparse matrix, we need to find its form in the case when one or more of the boundaries has Neumann conditions on it. We will proceed as above, assuming Neumann conditions now exist on the left edge, and find the form that the A matrix takes.

pict
Figure 2.24: New grid 2, Neumann conditions on the left edge

The unknowns are shown above in the circles. From earlier, we found that the discretization for the elliptic PDE at an internal node is (note that in the above, \(i\) is the row number and \(j\) is the column number)\[ U_{i,j}=\frac{1}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( U_{i,j-1}h_{y}^{2}+U_{i,j+1}h_{y}^{2}+U_{i-1,j}h_{x}^{2}+U_{i+1,j}h_{x}^{2}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{i,j}\] For non-homogeneous Neumann boundary conditions, the left edge unknowns are given by\begin{equation} U_{i,1}=\frac{1}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( 2U_{i,2}\ h_{y}^{2}+\left ( U_{i-1,1}+U_{i+1,1}\right ) h_{x}^{2}-h_{y}^{2}h_{x}g_{i,1}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{i,1}\nonumber \end{equation} We will now write the 15 equations, one for each node, from node \(\left ( 2,1\right ) \) to node \(\left ( 4,5\right ) \)\begin{align*} U_{2,1} & =\frac{1}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( 2U_{2,2}\ h_{y}^{2}+\left ( U_{1,1}+U_{3,1}\right ) h_{x}^{2}-h_{y}^{2}h_{x}g_{2,1}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{2,1}\\ U_{2,2} & =\frac{1}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( U_{2,1}h_{y}^{2}+U_{2,3}h_{y}^{2}+U_{1,2}h_{x}^{2}+U_{3,2}h_{x}^{2}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{2,2}\\ U_{2,3} & =\frac{1}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( U_{2,2}h_{y}^{2}+U_{2,4}h_{y}^{2}+U_{1,3}h_{x}^{2}+U_{3,3}h_{x}^{2}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{2,3}\\ U_{2,4} & =\frac{1}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( U_{2,3}h_{y}^{2}+U_{2,5}h_{y}^{2}+U_{1,4}h_{x}^{2}+U_{3,4}h_{x}^{2}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{2,4}\\ U_{2,5} & =\frac{1}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( U_{2,4}h_{y}^{2}+U_{2,6}h_{y}^{2}+U_{1,5}h_{x}^{2}+U_{3,5}h_{x}^{2}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{2,5}\\ U_{3,1} & =\frac{1}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( 2U_{3,2}\ h_{y}^{2}+\left ( U_{2,1}+U_{4,1}\right ) h_{x}^{2}-h_{y}^{2}h_{x}g_{3,1}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{3,1}\\ U_{3,2} & =\frac{1}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( U_{3,1}h_{y}^{2}+U_{3,3}h_{y}^{2}+U_{2,2}h_{x}^{2}+U_{4,2}h_{x}^{2}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{3,2}\\ U_{3,3} & =\frac{1}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( U_{3,2}h_{y}^{2}+U_{3,4}h_{y}^{2}+U_{2,3}h_{x}^{2}+U_{4,3}h_{x}^{2}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{3,3}\\ U_{3,4} & =\frac{1}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( U_{3,3}h_{y}^{2}+U_{3,5}h_{y}^{2}+U_{2,4}h_{x}^{2}+U_{4,4}h_{x}^{2}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{3,4}\\ U_{3,5} & =\frac{1}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( U_{3,4}h_{y}^{2}+U_{3,6}h_{y}^{2}+U_{2,5}h_{x}^{2}+U_{4,5}h_{x}^{2}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{3,5}\\ U_{4,1} & =\frac{1}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( 2U_{4,2}\ h_{y}^{2}+\left ( U_{3,1}+U_{5,1}\right ) h_{x}^{2}-h_{y}^{2}h_{x}g_{4,1}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{4,1}\\ U_{4,2} & =\frac{1}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( U_{4,1}h_{y}^{2}+U_{4,3}h_{y}^{2}+U_{3,2}h_{x}^{2}+U_{5,2}h_{x}^{2}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{4,2}\\ U_{4,3} & =\frac{1}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( U_{4,2}h_{y}^{2}+U_{4,4}h_{y}^{2}+U_{3,3}h_{x}^{2}+U_{5,3}h_{x}^{2}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{4,3}\\ U_{4,4} & =\frac{1}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( U_{4,3}h_{y}^{2}+U_{4,5}h_{y}^{2}+U_{3,4}h_{x}^{2}+U_{5,4}h_{x}^{2}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{4,4}\\ U_{4,5} & =\frac{1}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( U_{4,4}h_{y}^{2}+U_{4,6}h_{y}^{2}+U_{3,5}h_{x}^{2}+U_{5,5}h_{x}^{2}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{4,5} \end{align*}

Now, moving the knowns to the right, results in\begin{align*} 2\left ( h_{x}^{2}+h_{y}^{2}\right ) U_{2,1}-2U_{2,2}\ h_{y}^{2}-U_{3,1}h_{x}^{2} & =U_{1,1}h_{x}^{2}-h_{y}^{2}h_{x}g_{2,1}-h_{x}^{2}h_{y}^{2}f_{2,1}\\ 2\left ( h_{x}^{2}+h_{y}^{2}\right ) U_{2,2}-U_{2,3}h_{y}^{2}-U_{3,2}h_{x}^{2} & =-h_{x}^{2}h_{y}^{2}f_{2,2}+U_{2,1}h_{y}^{2}+U_{1,2}h_{x}^{2}\\ 2\left ( h_{x}^{2}+h_{y}^{2}\right ) U_{2,3}-U_{2,2}h_{y}^{2}-U_{2,4}h_{y}^{2}-U_{3,3}h_{x}^{2} & =-h_{x}^{2}h_{y}^{2}f_{2,3}+U_{1,3}h_{x}^{2}\\ 2\left ( h_{x}^{2}+h_{y}^{2}\right ) U_{2,4}-U_{2,3}h_{y}^{2}-U_{2,5}h_{y}^{2}-U_{3,4}h_{x}^{2} & =-h_{x}^{2}h_{y}^{2}f_{2,4}+U_{1,4}h_{x}^{2}\\ 2\left ( h_{x}^{2}+h_{y}^{2}\right ) U_{2,5}-U_{2,4}h_{y}^{2}-U_{3,5}h_{x}^{2} & =-h_{x}^{2}h_{y}^{2}f_{2,5}+U_{2,6}h_{y}^{2}+U_{1,5}h_{x}^{2}\\ 2\left ( h_{x}^{2}+h_{y}^{2}\right ) U_{3,1}-2U_{3,2}\ h_{y}^{2}-U_{4,1}h_{x}^{2}-U_{2,1}h_{x}^{2} & =-h_{y}^{2}h_{x}g_{3,1}-h_{x}^{2}h_{y}^{2}f_{3,1}\\ 2\left ( h_{x}^{2}+h_{y}^{2}\right ) U_{3,2}-U_{3,3}h_{y}^{2}-U_{2,2}h_{x}^{2}-U_{4,2}h_{x}^{2} & =-h_{x}^{2}h_{y}^{2}f_{3,2}+U_{3,1}h_{y}^{2}\\ 2\left ( h_{x}^{2}+h_{y}^{2}\right ) U_{3,3}-U_{3,2}h_{y}^{2}-U_{3,4}h_{y}^{2}-U_{2,3}h_{x}^{2}-U_{4,3}h_{x}^{2} & =-h_{x}^{2}h_{y}^{2}f_{3,3}\\ 2\left ( h_{x}^{2}+h_{y}^{2}\right ) U_{3,4}-U_{3,3}h_{y}^{2}-U_{3,5}h_{y}^{2}-U_{2,4}h_{x}^{2}-U_{4,4}h_{x}^{2} & =-h_{x}^{2}h_{y}^{2}f_{3,4}\\ 2\left ( h_{x}^{2}+h_{y}^{2}\right ) U_{3,5}-U_{3,4}h_{y}^{2}-U_{2,5}h_{x}^{2}-U_{4,5}h_{x}^{2} & =-h_{x}^{2}h_{y}^{2}f_{3,5}+U_{3,6}h_{y}^{2}\\ 2\left ( h_{x}^{2}+h_{y}^{2}\right ) U_{4,1}-2U_{4,2}\ h_{y}^{2}-U_{3,1}h_{x}^{2} & =U_{5,1}h_{x}^{2}-h_{y}^{2}h_{x}g_{4,1}-h_{x}^{2}h_{y}^{2}f_{4,1}\\ 2\left ( h_{x}^{2}+h_{y}^{2}\right ) U_{4,2}-U_{4,3}h_{y}^{2}-U_{3,2}h_{x}^{2} & =-h_{x}^{2}h_{y}^{2}f_{4,2}+U_{4,1}h_{y}^{2}+U_{5,2}h_{x}^{2}\\ 2\left ( h_{x}^{2}+h_{y}^{2}\right ) U_{4,3}-U_{4,2}h_{y}^{2}-U_{4,4}h_{y}^{2}-U_{3,3}h_{x}^{2} & =-h_{x}^{2}h_{y}^{2}f_{4,3}+U_{5,3}h_{x}^{2}\\ 2\left ( h_{x}^{2}+h_{y}^{2}\right ) U_{4,4}-U_{4,5}h_{y}^{2}-U_{4,3}h_{y}^{2}-U_{3,4}h_{x}^{2} & =-h_{x}^{2}h_{y}^{2}f_{4,4}+U_{5,4}h_{x}^{2}\\ 2\left ( h_{x}^{2}+h_{y}^{2}\right ) U_{4,5}-U_{4,4}h_{y}^{2}-U_{3,5}h_{x}^{2} & =-h_{x}^{2}h_{y}^{2}f_{4,5}+U_{4,6}h_{y}^{2}+U_{5,5}h_{x}^{2} \end{align*}

Now we can write \(Ax=b\), letting \(\beta =2\left ( h_{x}^{2}+h_{y}^{2}\right ) \), as\[\begin{pmatrix} \beta & -2h_{y}^{2} & 0 & 0 & 0 & -h_{x}^{2} & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & \beta & -h_{y}^{2} & 0 & 0 & 0 & -h_{x}^{2} & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & -h_{y}^{2} & \beta & -h_{y}^{2} & 0 & 0 & 0 & -h_{x}^{2} & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & -h_{y}^{2} & \beta & -h_{y}^{2} & 0 & 0 & 0 & -h_{x}^{2} & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & -h_{y}^{2} & \beta & 0 & 0 & 0 & 0 & -h_{x}^{2} & 0 & 0 & 0 & 0 & 0\\ -h_{x}^{2} & 0 & 0 & 0 & 0 & \beta & -2h_{y}^{2} & 0 & 0 & 0 & -h_{x}^{2} & 0 & 0 & 0 & 0\\ 0 & -h_{x}^{2} & 0 & 0 & 0 & 0 & \beta & -h_{y}^{2} & 0 & 0 & 0 & -h_{x}^{2} & 0 & 0 & 0\\ 0 & 0 & -h_{x}^{2} & 0 & 0 & 0 & -h_{y}^{2} & \beta & -h_{y}^{2} & 0 & 0 & 0 & -h_{x}^{2} & 0 & 0\\ 0 & 0 & 0 & -h_{x}^{2} & 0 & 0 & 0 & -h_{y}^{2} & \beta & -h_{y}^{2} & 0 & 0 & 0 & -h_{x}^{2} & 0\\ 0 & 0 & 0 & 0 & -h_{x}^{2} & 0 & 0 & 0 & -h_{y}^{2} & \beta & 0 & 0 & 0 & 0 & -h_{x}^{2}\\ 0 & 0 & 0 & 0 & 0 & -h_{x}^{2} & 0 & 0 & 0 & 0 & \beta & -2h_{y}^{2} & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & -h_{x}^{2} & 0 & 0 & 0 & 0 & \beta & -h_{y}^{2} & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & -h_{x}^{2} & 0 & 0 & 0 & -h_{y}^{2} & \beta & -h_{y}^{2} & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & -h_{x}^{2} & 0 & 0 & 0 & -h_{y}^{2} & \beta & -h_{y}^{2}\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & -h_{x}^{2} & 0 & 0 & 0 & -h_{y}^{2} & \beta \end{pmatrix}\begin{pmatrix} U_{21}\\ U_{22}\\ U_{23}\\ U_{24}\\ U_{25}\\ U_{31}\\ U_{32}\\ U_{33}\\ U_{34}\\ U_{35}\\ U_{41}\\ U_{42}\\ U_{43}\\ U_{44}\\ U_{45}\end{pmatrix} =\begin{pmatrix} U_{1,1}h_{x}^{2}-h_{y}^{2}h_{x}g_{2,1}-h_{x}^{2}h_{y}^{2}f_{2,1}\\ -h_{x}^{2}h_{y}^{2}f_{2,2}+U_{2,1}h_{y}^{2}+U_{1,2}h_{x}^{2}\\ -h_{x}^{2}h_{y}^{2}f_{2,3}+U_{1,3}h_{x}^{2}\\ -h_{x}^{2}h_{y}^{2}f_{2,4}+U_{1,4}h_{x}^{2}\\ -h_{x}^{2}h_{y}^{2}f_{2,5}+U_{2,6}h_{y}^{2}+U_{1,5}h_{x}^{2}\\ -h_{y}^{2}h_{x}g_{3,1}-h_{x}^{2}h_{y}^{2}f_{3,1}\\ -h_{x}^{2}h_{y}^{2}f_{3,2}+U_{3,1}h_{y}^{2}\\ -h_{x}^{2}h_{y}^{2}f_{3,3}\\ -h_{x}^{2}h_{y}^{2}f_{3,4}\\ -h_{x}^{2}h_{y}^{2}f_{3,5}+U_{3,6}h_{y}^{2}\\ U_{5,1}h_{x}^{2}-h_{y}^{2}h_{x}g_{4,1}-h_{x}^{2}h_{y}^{2}f_{4,1}\\ -h_{x}^{2}h_{y}^{2}f_{4,2}+U_{4,1}h_{y}^{2}+U_{5,2}h_{x}^{2}\\ -h_{x}^{2}h_{y}^{2}f_{4,3}+U_{5,3}h_{x}^{2}\\ -h_{x}^{2}h_{y}^{2}f_{4,4}+U_{5,4}h_{x}^{2}\\ -h_{x}^{2}h_{y}^{2}f_{4,5}+U_{4,6}h_{y}^{2}+U_{5,5}h_{x}^{2}\end{pmatrix} \] The above is the matrix for the case of non-homogeneous Neumann boundary conditions on the left edge.

We will now do the same edge, but with homogeneous boundary conditions, to see the difference. Recall that when the edge is insulated (i.e. setting \(g_{i,1}=0\) in the formula above),\[ U_{i,1}=\frac{h_{x}h_{y}}{\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( \frac{h_{x}}{2h_{y}}\left ( U_{i-1,1}+U_{i+1,1}\right ) +\frac{h_{y}}{h_{x}}U_{i,2}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{i,1}\] Using the above, we write the 15 equations starting with \(\left ( 2,1\right ) \) to node \(\left ( 4,5\right ) \)\begin{align*} U_{2,1} & =\frac{h_{x}h_{y}}{\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( \frac{h_{x}}{2h_{y}}\left ( U_{1,1}+U_{3,1}\right ) +\frac{h_{y}}{h_{x}}U_{2,2}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{2,1}\\ U_{2,2} & =\frac{1}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( U_{2,1}h_{y}^{2}+U_{2,3}h_{y}^{2}+U_{1,2}h_{x}^{2}+U_{3,2}h_{x}^{2}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{2,2}\\ U_{2,3} & =\frac{1}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( U_{2,2}h_{y}^{2}+U_{2,4}h_{y}^{2}+U_{1,3}h_{x}^{2}+U_{3,3}h_{x}^{2}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{2,3}\\ U_{2,4} & =\frac{1}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( U_{2,3}h_{y}^{2}+U_{2,5}h_{y}^{2}+U_{1,4}h_{x}^{2}+U_{3,4}h_{x}^{2}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{2,4}\\ U_{2,5} & =\frac{1}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( U_{2,4}h_{y}^{2}+U_{2,6}h_{y}^{2}+U_{1,5}h_{x}^{2}+U_{3,5}h_{x}^{2}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{2,5}\\ U_{3,1} & =\frac{h_{x}h_{y}}{\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( \frac{h_{x}}{2h_{y}}\left ( U_{2,1}+U_{4,1}\right ) +\frac{h_{y}}{h_{x}}U_{3,2}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{3,1}\\ U_{3,2} & =\frac{1}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( U_{3,1}h_{y}^{2}+U_{3,3}h_{y}^{2}+U_{2,2}h_{x}^{2}+U_{4,2}h_{x}^{2}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{3,2}\\ U_{3,3} & =\frac{1}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( U_{3,2}h_{y}^{2}+U_{3,4}h_{y}^{2}+U_{2,3}h_{x}^{2}+U_{4,3}h_{x}^{2}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{3,3}\\ U_{3,4} & =\frac{1}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( U_{3,3}h_{y}^{2}+U_{3,5}h_{y}^{2}+U_{2,4}h_{x}^{2}+U_{4,4}h_{x}^{2}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{3,4}\\ U_{3,5} & =\frac{1}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( U_{3,4}h_{y}^{2}+U_{3,6}h_{y}^{2}+U_{2,5}h_{x}^{2}+U_{4,5}h_{x}^{2}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{3,5}\\ U_{4,1} & =\frac{h_{x}h_{y}}{\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( \frac{h_{x}}{2h_{y}}\left ( U_{3,1}+U_{5,1}\right ) +\frac{h_{y}}{h_{x}}U_{4,2}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{4,1}\\ U_{4,2} & =\frac{1}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( U_{4,1}h_{y}^{2}+U_{4,3}h_{y}^{2}+U_{3,2}h_{x}^{2}+U_{5,2}h_{x}^{2}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{4,2}\\ U_{4,3} & =\frac{1}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( U_{4,2}h_{y}^{2}+U_{4,4}h_{y}^{2}+U_{3,3}h_{x}^{2}+U_{5,3}h_{x}^{2}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{4,3}\\ U_{4,4} & =\frac{1}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( U_{4,3}h_{y}^{2}+U_{4,5}h_{y}^{2}+U_{3,4}h_{x}^{2}+U_{5,4}h_{x}^{2}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{4,4}\\ U_{4,5} & =\frac{1}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }\left ( U_{4,4}h_{y}^{2}+U_{4,6}h_{y}^{2}+U_{3,5}h_{x}^{2}+U_{5,5}h_{x}^{2}\right ) -\frac{h_{x}^{2}h_{y}^{2}}{2\left ( h_{x}^{2}+h_{y}^{2}\right ) }f_{4,5} \end{align*}

Now, moving the knowns to the right, results in\begin{align*} \frac{\left ( h_{x}^{2}+h_{y}^{2}\right ) }{h_{x}h_{y}}U_{2,1}-\frac{h_{y}}{h_{x}}U_{2,2}-\frac{h_{x}}{2h_{y}}U_{3,1} & =\frac{h_{x}}{2h_{y}}U_{1,1}-\frac{h_{x}h_{y}}{2}f_{2,1}\\ 2\left ( h_{x}^{2}+h_{y}^{2}\right ) U_{2,2}-U_{2,3}h_{y}^{2}-U_{3,2}h_{x}^{2} & =-h_{x}^{2}h_{y}^{2}f_{2,2}+U_{2,1}h_{y}^{2}+U_{1,2}h_{x}^{2}\\ 2\left ( h_{x}^{2}+h_{y}^{2}\right ) U_{2,3}-U_{2,2}h_{y}^{2}-U_{2,4}h_{y}^{2}-U_{3,3}h_{x}^{2} & =-h_{x}^{2}h_{y}^{2}f_{2,3}+U_{1,3}h_{x}^{2}\\ 2\left ( h_{x}^{2}+h_{y}^{2}\right ) U_{2,4}-U_{2,3}h_{y}^{2}-U_{2,5}h_{y}^{2}-U_{3,4}h_{x}^{2} & =-h_{x}^{2}h_{y}^{2}f_{2,4}+U_{1,4}h_{x}^{2}\\ 2\left ( h_{x}^{2}+h_{y}^{2}\right ) U_{2,5}-U_{2,4}h_{y}^{2}-U_{3,5}h_{x}^{2} & =-h_{x}^{2}h_{y}^{2}f_{2,5}+U_{2,6}h_{y}^{2}+U_{1,5}h_{x}^{2}\\ \frac{\left ( h_{x}^{2}+h_{y}^{2}\right ) }{h_{x}h_{y}}U_{3,1}-\frac{h_{y}}{h_{x}}U_{3,2}-\frac{h_{x}}{2h_{y}}U_{2,1}-\frac{h_{x}}{2h_{y}}U_{4,1} & =-\frac{h_{x}h_{y}}{2}f_{3,1}\\ 2\left ( h_{x}^{2}+h_{y}^{2}\right ) U_{3,2}-U_{3,3}h_{y}^{2}-U_{2,2}h_{x}^{2}-U_{4,2}h_{x}^{2} & =-h_{x}^{2}h_{y}^{2}f_{3,2}+U_{3,1}h_{y}^{2}\\ 2\left ( h_{x}^{2}+h_{y}^{2}\right ) U_{3,3}-U_{3,2}h_{y}^{2}-U_{3,4}h_{y}^{2}-U_{2,3}h_{x}^{2}-U_{4,3}h_{x}^{2} & =-h_{x}^{2}h_{y}^{2}f_{3,3}\\ 2\left ( h_{x}^{2}+h_{y}^{2}\right ) U_{3,4}-U_{3,3}h_{y}^{2}-U_{3,5}h_{y}^{2}-U_{2,4}h_{x}^{2}-U_{4,4}h_{x}^{2} & =-h_{x}^{2}h_{y}^{2}f_{3,4}\\ 2\left ( h_{x}^{2}+h_{y}^{2}\right ) U_{3,5}-U_{3,4}h_{y}^{2}-U_{2,5}h_{x}^{2}-U_{4,5}h_{x}^{2} & =-h_{x}^{2}h_{y}^{2}f_{3,5}+U_{3,6}h_{y}^{2}\\ \frac{\left ( h_{x}^{2}+h_{y}^{2}\right ) }{h_{x}h_{y}}U_{4,1}-\frac{h_{y}}{h_{x}}U_{4,2}-\frac{h_{x}}{2h_{y}}U_{3,1} & =\frac{h_{x}}{2h_{y}}U_{5,1}-\frac{h_{x}h_{y}}{2}f_{4,1}\\ 2\left ( h_{x}^{2}+h_{y}^{2}\right ) U_{4,2}-U_{4,3}h_{y}^{2}-U_{3,2}h_{x}^{2} & =-h_{x}^{2}h_{y}^{2}f_{4,2}+U_{4,1}h_{y}^{2}+U_{5,2}h_{x}^{2}\\ 2\left ( h_{x}^{2}+h_{y}^{2}\right ) U_{4,3}-U_{4,2}h_{y}^{2}-U_{4,4}h_{y}^{2}-U_{3,3}h_{x}^{2} & =-h_{x}^{2}h_{y}^{2}f_{4,3}+U_{5,3}h_{x}^{2}\\ 2\left ( h_{x}^{2}+h_{y}^{2}\right ) U_{4,4}-U_{4,5}h_{y}^{2}-U_{4,3}h_{y}^{2}-U_{3,4}h_{x}^{2} & =-h_{x}^{2}h_{y}^{2}f_{4,4}+U_{5,4}h_{x}^{2}\\ 2\left ( h_{x}^{2}+h_{y}^{2}\right ) U_{4,5}-U_{4,4}h_{y}^{2}-U_{3,5}h_{x}^{2} & =-h_{x}^{2}h_{y}^{2}f_{4,5}+U_{4,6}h_{y}^{2}+U_{5,5}h_{x}^{2} \end{align*}

Now we can write \(Ax=b\), letting \(\beta =2\left ( h_{x}^{2}+h_{y}^{2}\right ) \) and \(\gamma =\frac{\left ( h_{x}^{2}+h_{y}^{2}\right ) }{h_{x}h_{y}}\), as\[\begin{pmatrix} \gamma & -\frac{h_{y}}{h_{x}} & 0 & 0 & 0 & -\frac{h_{x}}{2h_{y}} & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & \beta & -h_{y}^{2} & 0 & 0 & 0 & -h_{x}^{2} & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & -h_{y}^{2} & \beta & -h_{y}^{2} & 0 & 0 & 0 & -h_{x}^{2} & 0 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & -h_{y}^{2} & \beta & -h_{y}^{2} & 0 & 0 & 0 & -h_{x}^{2} & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & -h_{y}^{2} & \beta & 0 & 0 & 0 & 0 & -h_{x}^{2} & 0 & 0 & 0 & 0 & 0\\ -\frac{h_{x}}{2h_{y}} & 0 & 0 & 0 & 0 & \gamma & -\frac{h_{y}}{h_{x}} & 0 & 0 & 0 & -\frac{h_{x}}{2h_{y}} & 0 & 0 & 0 & 0\\ 0 & -h_{x}^{2} & 0 & 0 & 0 & 0 & \beta & -h_{y}^{2} & 0 & 0 & 0 & -h_{x}^{2} & 0 & 0 & 0\\ 0 & 0 & -h_{x}^{2} & 0 & 0 & 0 & -h_{y}^{2} & \beta & -h_{y}^{2} & 0 & 0 & 0 & -h_{x}^{2} & 0 & 0\\ 0 & 0 & 0 & -h_{x}^{2} & 0 & 0 & 0 & -h_{y}^{2} & \beta & -h_{y}^{2} & 0 & 0 & 0 & -h_{x}^{2} & 0\\ 0 & 0 & 0 & 0 & -h_{x}^{2} & 0 & 0 & 0 & -h_{y}^{2} & \beta & 0 & 0 & 0 & 0 & -h_{x}^{2}\\ 0 & 0 & 0 & 0 & 0 & -\frac{h_{x}}{2h_{y}} & 0 & 0 & 0 & 0 & \gamma & -\frac{h_{y}}{h_{x}} & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & -h_{x}^{2} & 0 & 0 & 0 & 0 & \beta & -h_{y}^{2} & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & -h_{x}^{2} & 0 & 0 & 0 & -h_{y}^{2} & \beta & -h_{y}^{2} & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & -h_{x}^{2} & 0 & 0 & 0 & -h_{y}^{2} & \beta & -h_{y}^{2}\\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & -h_{x}^{2} & 0 & 0 & 0 & -h_{y}^{2} & \beta \end{pmatrix}\begin{pmatrix} U_{21}\\ U_{22}\\ U_{23}\\ U_{24}\\ U_{25}\\ U_{31}\\ U_{32}\\ U_{33}\\ U_{34}\\ U_{35}\\ U_{41}\\ U_{42}\\ U_{43}\\ U_{44}\\ U_{45}\end{pmatrix} =\begin{pmatrix} \frac{h_{x}}{2h_{y}}U_{1,1}-\frac{h_{x}h_{y}}{2}f_{2,1}\\ -h_{x}^{2}h_{y}^{2}f_{2,2}+U_{2,1}h_{y}^{2}+U_{1,2}h_{x}^{2}\\ -h_{x}^{2}h_{y}^{2}f_{2,3}+U_{1,3}h_{x}^{2}\\ -h_{x}^{2}h_{y}^{2}f_{2,4}+U_{1,4}h_{x}^{2}\\ -h_{x}^{2}h_{y}^{2}f_{2,5}+U_{2,6}h_{y}^{2}+U_{1,5}h_{x}^{2}\\ -\frac{h_{x}h_{y}}{2}f_{3,1}\\ -h_{x}^{2}h_{y}^{2}f_{3,2}+U_{3,1}h_{y}^{2}\\ -h_{x}^{2}h_{y}^{2}f_{3,3}\\ -h_{x}^{2}h_{y}^{2}f_{3,4}\\ -h_{x}^{2}h_{y}^{2}f_{3,5}+U_{3,6}h_{y}^{2}\\ \frac{h_{x}}{2h_{y}}U_{5,1}-\frac{h_{x}h_{y}}{2}f_{4,1}\\ -h_{x}^{2}h_{y}^{2}f_{4,2}+U_{4,1}h_{y}^{2}+U_{5,2}h_{x}^{2}\\ -h_{x}^{2}h_{y}^{2}f_{4,3}+U_{5,3}h_{x}^{2}\\ -h_{x}^{2}h_{y}^{2}f_{4,4}+U_{5,4}h_{x}^{2}\\ -h_{x}^{2}h_{y}^{2}f_{4,5}+U_{4,6}h_{y}^{2}+U_{5,5}h_{x}^{2}\end{pmatrix} \]

Storage requirements for the different solvers

The total number of grid points along the \(x\) or \(y\) direction is \(\frac{length}{h}+1\), where the length is always \(1\); hence \(n\), the number of unknowns in one dimension, is \(2\) less than this number (since \(U\) is known at the boundaries).

It is important to note that the matrix \(A\) is not used explicitly to solve for the unknowns in the iterative schemes. Storage is needed only for the internal grid points, i.e. for the \(n^{2}\) unknowns. An auxiliary grid is used to hold the updated values in the Jacobi method. In addition, a grid is required for \(f_{i,j}\), the force function. Hence, in total, \(3n^{2}\) storage is needed.

Compare this to the storage needed by a direct solver, where storage for \(A\) alone is \(n^{4}\). This shows the main advantage of iterative methods over the direct method when \(A\) is a dense matrix. (Use of a sparse matrix becomes necessary if a direct solver is to be used for large \(n\) problems.)

The following table summarizes the above discussion for the \(h\) values given in this problem. In these calculations, double precision (8 bytes per grid point) is assumed; for single precision, half of this storage would be needed.







\(h\) | \(n\) | number of unknowns \(n^{2}\) | storage (\(3n^{2}\) doubles) | size of \(A\) (\(n^{4}\)) | storage for dense \(A\)
\(2^{-5}\) | \(31\) | \(961\) | \(\approx 23\) kB | \(923{,}521\) | \(\approx 7.4\) MB
\(2^{-6}\) | \(63\) | \(3969\) | \(\approx 95\) kB | \(15{,}752{,}961\) | \(\approx 126\) MB
\(2^{-7}\) | \(127\) | \(16{,}129\) | \(\approx 0.39\) MB | \(260{,}144{,}641\) | \(\approx 2.1\) GB






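The entries above are straightforward to reproduce (a small sketch; 8 bytes per double assumed):

    for p = [5 6 7]
        h = 2^(-p);
        n = 1/h - 1;                      % unknowns per dimension
        fprintf('h=2^-%d: n=%d, n^2=%d, iterative %.3g MB, dense A %.3g MB\n', ...
                p, n, n^2, 3*n^2*8/1e6, n^4*8/1e6);
    end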
Loop algorithm for each method

This is a short description of the algorithm used by each method. In the following, \(u_{i,j}^{new}\) represents the new value (residing on the auxiliary grid) of the unknown, and \(u_{i,j}^{current}\) is the current value. Initially \(u^{current}\) is set to zero at each internal grid point (any other initial value could also have been used).

Jacobi method algorithm

\[\begin{array} [c]{l}k:=0\ (\ast counter\ast )\\ \epsilon :=Ch^{2}\ (\ast tolerance\ast )\\ u^{current}:=u^{new}:=0\ (\ast initialize\ storage\ast )\\ f:=f(x,y)\ (\ast initialize\ f\ on\ grid\ points\ast )\\ CONVERGED\ :=false\\ WHILE\ NOT\ CONVERGED\ \\ \ \ \ \ \ \ \ LOOP\ i\\ \ \ \ \ \ \ \ \ \ \ \ \ \ LOOP\ j\\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ u_{i,j}^{new}:=\frac{1}{4}\left ( u_{i-1,j}^{current}+u_{i+1,j}^{current}+u_{i,j-1}^{current}+u_{i,j+1}^{current}-h^{2}f_{i,j}\right ) \\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ R_{i,j}=f_{i,j}-\frac{1}{h^{2}}\left ( u_{i-1,j}^{current}+u_{i+1,j}^{current}+u_{i,j-1}^{current}+u_{i,j+1}^{current}-4u_{i,j}^{current}\right ) \ \ \ \ \ (\ast residual\ast )\\ \ \ \ \ \ \ \ \ \ \ \ \ END\ LOOP\ j\\ \ \ \ \ \ \ \ END\ LOOP\ i\\ \ \ \ \ \ \ \ u^{current}:=u^{new}\ (\ast update\ast )\\ \ \ \ \ \ \ \ k:=k+1\\ \ \ \ \ \ \ \ IF\ \ \ \ \left ( \frac{\left \Vert R\right \Vert }{\left \Vert f\right \Vert }<\epsilon \ \right ) \ THEN\ \ (\ast Norms\ are\ grid\ norms.\ see\ code\ast )\\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ CONVERGED\ :=true\ \ \ \\ \ \ \ \ \ \ \ END\ IF\\ END\ WHILE\ \end{array} \]
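A direct Matlab transcription of this loop might look as follows (a minimal sketch, not the HW listing itself; it assumes zero Dirichlet boundary values, an \((n+2)\times (n+2)\) forcing grid f, and the tolerance constant C):

    function [u, k] = jacobi_solve(f, h, C)
    % Jacobi iteration for the 2D Poisson problem on a uniform mesh
    n   = size(f,1) - 2;                 % interior points per dimension
    u   = zeros(n+2);  unew = u;  R = zeros(n+2);
    fi  = f(2:end-1, 2:end-1);           % interior forcing, used in the norm
    tol = C*h^2;  k = 0;  converged = false;
    while ~converged
        for i = 2:n+1
            for j = 2:n+1
                unew(i,j) = ( u(i-1,j)+u(i+1,j)+u(i,j-1)+u(i,j+1) - h^2*f(i,j) )/4;
                R(i,j)    = f(i,j) - ( u(i-1,j)+u(i+1,j)+u(i,j-1)+u(i,j+1) - 4*u(i,j) )/h^2;
            end
        end
        u = unew;  k = k + 1;                        % update the whole grid at once
        converged = norm(R(:))/norm(fi(:)) < tol;    % grid-norm factors cancel in the ratio
    end
    end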

Gauss-Seidel method

\[\begin{array} [c]{l}k:=0\ (\ast counter\ast )\\ \epsilon :=Ch^{2}\ (\ast tolerance\ast )\\ u^{current}:=u^{new}:=0\ (\ast initialize\ storage\ast )\\ f:=f(x,y)\ (\ast initialize\ f\ on\ grid\ points\ast )\\ CONVERGED\ :=false\\ WHILE\ NOT\ CONVERGED\ \\ \ \ \ \ \ \ \ LOOP\ i\\ \ \ \ \ \ \ \ \ \ \ \ \ \ LOOP\ j\\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ u_{ij}^{new}:=\frac{1}{4}\left ( u_{i-1,j}^{new}+u_{i+1,j}^{current}+u_{i,j-1}^{new}+u_{i,j+1}^{current}-h^{2}f_{i,j}\right ) \\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ R_{i,j}=f_{i,j}-\frac{1}{h^{2}}\left ( u_{i-1,j}^{current}+u_{i+1,j}^{current}+u_{i,j-1}^{current}+u_{i,j+1}^{current}-4u_{i,j}^{current}\right ) \ \ \ \ \ (\ast residual\ast )\\ \ \ \ \ \ \ \ \ \ \ \ \ END\ LOOP\ j\\ \ \ \ \ \ \ \ END\ LOOP\ i\\ \ \ \ \ \ \ \ u^{current}:=u^{new}\ (\ast update\ast )\\ \ \ \ \ \ \ \ k:=k+1\\ \ \ \ \ \ \ \ IF\ \ \ \ \left ( \frac{\left \Vert R\right \Vert }{\left \Vert f\right \Vert }<\epsilon \ \right ) \ THEN\ \ \ (\ast Norms\ are\ grid\ norms.\ see\ code\ast )\\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ CONVERGED\ :=true\ \ \ \\ \ \ \ \ \ \ \ END\ IF\\ END\ WHILE\ \end{array} \]

SOR method

\[\begin{array} [c]{l}k:=0\ (\ast counter\ast )\\ \epsilon :=Ch^{2}\ (\ast tolerance\ast )\\ u^{current}:=u^{new}:=0\ (\ast initialize\ storage\ast )\\ f:=f(x,y)\ (\ast initialize\ f\ on\ grid\ points\ast )\\ CONVERGED\ :=false\\ WHILE\ NOT\ CONVERGED\ \\ \ \ \ \ \ \ \ LOOP\ i\\ \ \ \ \ \ \ \ \ \ \ \ \ \ LOOP\ j\\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ u_{ij}^{new}:=\frac{\omega }{4}\left ( u_{i-1,j}^{new}+u_{i+1,j}^{current}+u_{i,j-1}^{new}+u_{i,j+1}^{current}-h^{2}f_{i,j}\right ) +\left ( 1-\omega \right ) u_{i,j}^{current}\\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ R_{i,j}=f_{i,j}-\frac{1}{h^{2}}\left ( u_{i-1,j}^{current}+u_{i+1,j}^{current}+u_{i,j-1}^{current}+u_{i,j+1}^{current}-4u_{i,j}^{current}\right ) \ \ \ \ (\ast residual\ast )\\ \ \ \ \ \ \ \ \ \ \ \ \ END\ LOOP\ j\\ \ \ \ \ \ \ \ END\ LOOP\ i\\ \ \ \ \ \ \ \ u^{current}:=u^{new}\ (\ast update\ast )\\ \ \ \ \ \ \ \ k:=k+1\\ \ \ \ \ \ \ \ IF\ \ \ \ \left ( \frac{\left \Vert R\right \Vert }{\left \Vert f\right \Vert }<\epsilon \ \right ) \ THEN\ \ \ (\ast Norms\ are\ grid\ norms.\ see\ code\ast )\\ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ CONVERGED\ :=true\ \ \ \\ \ \ \ \ \ \ \ END\ IF\\ END\ WHILE\ \end{array} \] Notice that when \(\omega =1\), SOR becomes the same as \(Gauss-Seidel\). When \(\omega >1\) the method is called overrelaxed and when \(\omega <1\) it is called underrelaxed.
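Since Gauss-Seidel and SOR update in place, their inner loop is even simpler; a sketch of one SOR sweep over the same grids (\(\omega =1\) reduces it to Gauss-Seidel):

    % one SOR sweep, updating u in place
    for i = 2:n+1
        for j = 2:n+1
            ugs    = ( u(i-1,j)+u(i+1,j)+u(i,j-1)+u(i,j+1) - h^2*f(i,j) )/4;  % GS value
            u(i,j) = u(i,j) + w*(ugs - u(i,j));                               % relax by omega
        end
    end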

2.4.4.7 Result of computation

The above 3 algorithms were implemented and run for the 3 values of \(h\) given. The following tables summarize the results obtained. The source code is in the appendix. This is a diagram illustrating the final converged solution from one of the runs.

pict
Figure 2.25: Final converged solution

Number of steps to reach convergence

These tables show the number of iterations needed to converge. The first uses \(\varepsilon =0.1h^{2}\) for the tolerance and the second uses \(\varepsilon =h^{2}\).

Number of iterations to reach convergence using tolerance \(\varepsilon =0.1h^{2}\)

method | \(h=2^{-3}\) (\(n=7\)) | \(h=2^{-4}\) (\(n=15\)) | \(h=2^{-5}\) (\(n=31\)) | \(h=2^{-6}\) (\(n=63\)) | \(h=2^{-7}\) (\(n=127\))
Jacobi | 82 | 400 | 1886 | 8689 | (note 4)
Gauss-Seidel | 43 | 202 | 944 | 3446 | 19674
SOR | 13 | 26 | 53 | 106 | 215






Number of iterations to reach convergence using tolerance \(\varepsilon =h^{2}\)

method | \(h=2^{-3}\) (\(n=7\)) | \(h=2^{-4}\) (\(n=15\)) | \(h=2^{-5}\) (\(n=31\)) | \(h=2^{-6}\) (\(n=63\)) | \(h=2^{-7}\) (\(n=127\))
Jacobi | 52 | 281 | 1409 | 6779 | 31702
Gauss-Seidel | 28 | 142 | 706 | 3391 | 15852
SOR | 10 | 20 | 40 | 80 | 159






Convergence plots for each method

These error plots were generated only for tolerance \(\varepsilon =1\times h^{2}\). They show how the log of the norm of the relative residual changes with the number of iterations, until convergence is achieved.

In these plots, the y-axis is \(\log \left ( \frac{\left \Vert R^{\left [ k\right ] }\right \Vert }{\left \Vert f\right \Vert }\right ) \) or \(\log \left ( \frac{\left \Vert f-Au^{[k]}\right \Vert }{\left \Vert f\right \Vert }\right ) \), and the x-axis is \(k\), the iteration number.
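A sketch of how such a plot can be produced, assuming the relative residual of each iteration was collected in a vector relres during the run:

    semilogy(1:numel(relres), relres)      % log scale on the y-axis
    xlabel('iteration k')
    ylabel('|| R^{[k]} || / || f ||')
    title('Convergence of the relative residual')
    grid on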

pict
Figure 2.26:error plot Jacobi

pict
Figure 2.27:error plot GS

pict
Figure 2.28:error plot SOR

Plots comparing the convergence of the 3 methods with \(\epsilon =1\times h^{2}\) for different \(h\) values

pict
Figure 2.29:problem 1, comparing the 3 methods for \(h=2^{-5}\)

pict
Figure 2.30:problem 1, comparing the 3 methods for \(h=2^{-6}\)

pict
Figure 2.31:problem 1, comparing the 3 methods for \(h=2^{-7}\)
2.4.4.8 Conclusions and summary

1. SOR is the fastest of the three iterative solvers.
2. SOR requires calculating an optimal \(\omega \). For this problem the calculation was not difficult, but in other problems it can be hard to determine the optimal \(\omega \) beforehand. Some SOR variants use an adaptive \(\omega \) that is readjusted as the solution progresses.
3. Using the relative residual as the convergence test required applying the matrix \(A\) without actually storing \(A\) (see the sketch after this list).
4. The Jacobi method required an additional auxiliary grid, hence its memory requirement was twice that of Gauss-Seidel or SOR.
5. The Jacobi method was the simplest to implement, but the slowest to converge.
6. The Jacobi method is the most suitable for parallel processing, since each grid point can be updated independently of its neighbors. This is not possible with Gauss-Seidel or SOR due to the dependency of each update on neighboring grid points; however, with red-black ordering these methods can be parallelized in two stages.
7. All methods are guaranteed to converge eventually, since the spectral radius of the iteration matrix of each method is less than one.
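As an illustration of point 3, the relative residual can be computed by applying the 5-point Laplacian directly on the grid, without ever forming \(A\); a minimal sketch (the vectorized form is an illustrative assumption, not the exact appendix code):

% U is the current iterate on an (n+2) x (n+2) grid including the zero
% boundary ring; F holds f at the n x n interior points.
lapU = ( U(1:end-2,2:end-1) + U(3:end,2:end-1) ...
       + U(2:end-1,1:end-2) + U(2:end-1,3:end) ...
       - 4*U(2:end-1,2:end-1) ) / h^2;
relres = norm(F(:) - lapU(:)) / norm(F(:));   % grid-norm factors h cancel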

2.4.5 Problem 2

   2.4.5.1 Part(a)
   2.4.5.2 Part (b)

pict
Figure 2.32:Problem 2
2.4.5.1 Part(a)

Given\[ u-\delta \Delta u=f \] and using the standard 5-point Laplacian for the approximation of \(\Delta \), the above can be written as\begin{equation} u-\delta Au=f \tag{1} \end{equation} where \(A\) is the standard 2D discrete Laplacian matrix\[ \frac{1}{h^{2}}\begin{pmatrix} -4 & 1 & 0 & 1 & 0 & 0 & 0\\ 1 & -4 & 1 & 0 & 1 & 0 & 0\\ 0 & 1 & -4 & 0 & 0 & 1 & 0\\ 1 & 0 & 0 & \ddots & 0 & 0 & 1\\ 0 & 1 & 0 & 0 & \ddots & 0 & 0\\ 0 & 0 & 1 & 0 & 1 & -4 & 1\\ 0 & 0 & 0 & 1 & 0 & 1 & -4 \end{pmatrix} \] Hence (1) becomes\begin{equation} \left ( I-\delta A\right ) u=f \tag{2} \end{equation} Let \begin{equation} B=\left ( I-\delta A\right ) \tag{3} \end{equation} Then (2) becomes\begin{equation} Bu=f \tag{3A} \end{equation} To obtain the iteration matrix for the above system, the method of matrix splitting is used. Let \(B=M-N.\) Equation (3A) becomes\begin{align*} \left ( M-N\right ) u & =f\\ Mu & =Nu+f \end{align*}

\(M\) is selected so that it is invertible and so that \(M^{-1}\) is easy to compute; the iterative equation follows\[ u^{\left [ k+1\right ] }=\left ( M^{-1}N\right ) u^{\left [ k\right ] }+M^{-1}f \] where the iteration matrix \(T_{j}\) is\begin{equation} T_{j}=\left ( M^{-1}N\right ) \tag{3B} \end{equation} For convergence it is required that the spectral radius of \(T_{j}\) be less than one. \(\rho \left ( T_{j}\right ) \) is the largest eigenvalue of \(T_{j}\) in absolute value. The largest eigenvalue of \(T_{j}\) is now found as follows.

For the Jacobi method let \(M=D\), and \(N=L+U\), where \(D\) is the diagonal of \(B\), \(L\) is the negative of the strictly lower triangle matrix of \(B\) and \(U\) is negative of the strictly upper triangle matrix of \(B\).  (3B) becomes\[ T_{j}=D^{-1}\left ( L+U\right ) \] But \(B=D-L-U\), hence \(L+U=D-B\) and the above becomes\begin{align} T_{j} & =D^{-1}\left ( D-B\right ) \nonumber \\ & =I-D^{-1}B \tag{4} \end{align}

Now the spectral radius of \(T_{j}\) is determined. To find \(D^{-1}\), the matrix \(B\) needs to be written out first. From (3)\begin{align*} B & =\begin{pmatrix} 1 & 0 & 0 & 0 & 0 & 0 & 0\\ 0 & 1 & 0 & 0 & 0 & 0 & 0\\ 0 & 0 & 1 & 0 & 0 & 0 & 0\\ 0 & 0 & 0 & \ddots & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & \ddots & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 1 & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix} -\frac{\delta }{h^{2}}\begin{pmatrix} -4 & 1 & 0 & 1 & 0 & 0 & 0\\ 1 & -4 & 1 & 0 & 1 & 0 & 0\\ 0 & 1 & -4 & 0 & 0 & 1 & 0\\ 1 & 0 & 0 & \ddots & 0 & 0 & 1\\ 0 & 1 & 0 & 0 & \ddots & 0 & 0\\ 0 & 0 & 1 & 0 & 1 & -4 & 1\\ 0 & 0 & 0 & 1 & 0 & 1 & -4 \end{pmatrix} \\ & =\begin{pmatrix} 1+\frac{4\delta }{h^{2}} & -\frac{\delta }{h^{2}} & 0 & -\frac{\delta }{h^{2}} & 0 & 0 & 0\\ -\frac{\delta }{h^{2}} & 1+\frac{4\delta }{h^{2}} & -\frac{\delta }{h^{2}} & 0 & -\frac{\delta }{h^{2}} & 0 & 0\\ 0 & -\frac{\delta }{h^{2}} & 1+\frac{4\delta }{h^{2}} & 0 & 0 & -\frac{\delta }{h^{2}} & 0\\ -\frac{\delta }{h^{2}} & 0 & 0 & \ddots & 0 & 0 & -\frac{\delta }{h^{2}}\\ 0 & -\frac{\delta }{h^{2}} & 0 & 0 & \ddots & 0 & 0\\ 0 & 0 & -\frac{\delta }{h^{2}} & 0 & -\frac{\delta }{h^{2}} & 1+\frac{4\delta }{h^{2}} & -\frac{\delta }{h^{2}}\\ 0 & 0 & 0 & -\frac{\delta }{h^{2}} & 0 & -\frac{\delta }{h^{2}} & 1+\frac{4\delta }{h^{2}}\end{pmatrix} \end{align*}

Therefore, \(D\) the diagonal matrix of \(B\) is easily found\[ D=\begin{pmatrix} 1+\frac{4\delta }{h^{2}} & 0 & 0 & \cdots & 0 & 0 & 0\\ 0 & 1+\frac{4\delta }{h^{2}} & 0 & \cdots & 0 & 0 & 0\\ 0 & 0 & 1+\frac{4\delta }{h^{2}} & \cdots & 0 & 0 & 0\\ 0 & 0 & 0 & \ddots & 0 & 0 & 0\\ 0 & 0 & 0 & 0 & \ddots & 0 & 0\\ 0 & 0 & 0 & 0 & 0 & 1+\frac{4\delta }{h^{2}} & 0\\ 0 & 0 & 0 & 0 & 0 & 0 & 1+\frac{4\delta }{h^{2}}\end{pmatrix} \] Now that \(D\) is known, the eigenvalue \(\mu _{kl}\) of the iteration matrix \(T_{j}\) shown in (4) can be written down as\begin{align} T_{j} & =I-D^{-1}B\nonumber \\ & \Rightarrow \nonumber \\ \mu _{kl} & =1-\left ( 1+\frac{4\delta }{h^{2}}\right ) ^{-1}\lambda _{kl} \tag{5} \end{align}

where \(\lambda _{kl}\) are the eigenvalues of \(B\). But from (3), \(B=\left ( I-\delta A\right ) \), hence (5) becomes\begin{equation} \mu _{kl}=1-\left ( 1+\frac{4\delta }{h^{2}}\right ) ^{-1}\left ( 1-\delta \nu _{kl}\right ) \tag{6} \end{equation} where \(\nu _{kl}\) are the eigenvalues of \(A\), the standard 2D discrete Laplacian matrix, with eigenvalues given in the textbook (page 63, equation 3.15)\[ \nu _{kl}=\frac{2}{h^{2}}\left ( \cos \left ( k\pi h\right ) +\cos \left ( l\pi h\right ) -2\right ) \] Using this in (6) results in\begin{align*} \mu _{kl} & =1-\left ( 1+\frac{4\delta }{h^{2}}\right ) ^{-1}\left ( 1-\frac{2\delta }{h^{2}}\cos \left ( k\pi h\right ) -\frac{2\delta }{h^{2}}\cos \left ( l\pi h\right ) +\frac{4\delta }{h^{2}}\right ) \\ & =1-\left ( \frac{h^{2}}{h^{2}+4\delta }\right ) \left ( 1-\frac{2\delta }{h^{2}}\cos \left ( k\pi h\right ) -\frac{2\delta }{h^{2}}\cos \left ( l\pi h\right ) +\frac{4\delta }{h^{2}}\right ) \\ & =1-\left ( \frac{h^{2}}{h^{2}+4\delta }\right ) -\left ( \frac{4\delta }{h^{2}+4\delta }\right ) +\frac{2\delta }{h^{2}}\left ( \frac{h^{2}}{h^{2}+4\delta }\right ) \left \{ \cos \left ( k\pi h\right ) +\cos \left ( l\pi h\right ) \right \} \\ & =\left ( \frac{\overbrace{h^{2}+4\delta -h^{2}-4\delta }^{=0}}{h^{2}+4\delta }\right ) +\left ( \frac{2\delta }{h^{2}+4\delta }\right ) \left \{ \cos \left ( k\pi h\right ) +\cos \left ( l\pi h\right ) \right \} \\ & =\left ( \frac{2\delta }{h^{2}+4\delta }\right ) \left \{ \cos \left ( k\pi h\right ) +\cos \left ( l\pi h\right ) \right \} \end{align*}

The above is largest when \(\cos \left ( k\pi h\right ) +\cos \left ( l\pi h\right ) \) is at its maximum, which is bounded by \(2\). Therefore\[ \rho \left ( T_{j}\right ) \leq \frac{4\delta }{h^{2}+4\delta }\] which is less than one for any \(\delta >0\).

Hence the Jacobi iteration converges for this system.

2.4.5.2 Part (b)

Reducing the error by a factor of \(10^{-6}\) means\begin{equation} \left \Vert e^{k}\right \Vert <10^{-6}\left \Vert e^{0}\right \Vert \tag{1} \end{equation} and, to leading order,\[ \left \Vert e^{k}\right \Vert \approx \rho _{sor}^{k}\left \Vert e^{0}\right \Vert \] Hence (1) becomes\[ \rho _{sor}^{k}<10^{-6}\] where \(k\) is solved for at equality (and rounded up to the nearest integer if needed). Taking base-10 logarithms of both sides gives\begin{align} k\log \rho _{sor} & =-6\nonumber \\ k & =\frac{-6}{\log \rho _{sor}} \tag{2} \end{align}

But \[ \rho _{sor}=\omega _{opt}-1 \] Where\[ \omega _{opt}=\frac{2}{1+\sqrt{1-\rho _{j}^{2}}}\] And \(\rho _{j}=\left ( \frac{4\delta }{h^{2}+4\delta }\right ) \) from part(a), hence the above becomes\[ \omega _{opt}=\frac{2}{1+\sqrt{1-\left ( \frac{4\delta }{h^{2}+4\delta }\right ) ^{2}}}\] Hence\[ \rho _{sor}=\frac{2}{1+\sqrt{1-\left ( \frac{4\delta }{h^{2}+4\delta }\right ) ^{2}}}-1 \] Substituting the numerical values \(h=10^{-2},\delta =10^{-3}\) in the above results in\begin{align*} \rho _{sor} & =\frac{2}{1+\sqrt{1-\left ( \frac{4\left ( 10^{-3}\right ) }{\left ( 10^{-2}\right ) ^{2}+4\left ( 10^{-3}\right ) }\right ) ^{2}}}-1\\ & =0.64 \end{align*}

Therefore, from (2)\begin{align*} k & =\frac{-6}{\log (0.64)}\\ & =30.95 \end{align*}

rounding up gives\[ k=31 \]
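These numbers are easy to reproduce; a quick Matlab check (variable names are illustrative):

h = 1e-2; delta = 1e-3;
rho_j   = 4*delta/(h^2 + 4*delta);     % spectral radius of Jacobi, ~0.9756
w_opt   = 2/(1 + sqrt(1 - rho_j^2));   % optimal SOR parameter, ~1.64
rho_sor = w_opt - 1;                   % ~0.64
k = ceil(-6/log10(rho_sor))            % 31 iterations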

2.4.6 Problem 3

   2.4.6.1 Part(a)
   2.4.6.2 Result of computation
   2.4.6.3 Part(b)

pict
Figure 2.33:Problem 3
2.4.6.1 Part(a)

To solve the problem using a direct solver, the sparse matrix \(A\) is constructed and the vector \(f\) is evaluated from the given function \(f\left ( x,y\right ) \); this results in an \(Au=f\) system, which is then solved directly.

Recall from problem (1) that the \(A\) matrix has the following form (as an example, for a \(3\times 3\) internal grid, or for \(h=0.2\))\[\begin{pmatrix} -4 & 1 & 0 & 1 & 0 & 0 & 0 & 0 & 0\\ 1 & -4 & 1 & 0 & 1 & 0 & 0 & 0 & 0\\ 0 & 1 & -4 & 0 & 0 & 1 & 0 & 0 & 0\\ 1 & 0 & 0 & -4 & 1 & 0 & 1 & 0 & 0\\ 0 & 1 & 0 & 1 & -4 & 1 & 0 & 1 & 0\\ 0 & 0 & 1 & 0 & 1 & -4 & 0 & 0 & 1\\ 0 & 0 & 0 & 1 & 0 & 0 & -4 & 1 & 0\\ 0 & 0 & 0 & 0 & 1 & 0 & 1 & -4 & 1\\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & 1 & -4 \end{pmatrix}\begin{pmatrix} U_{1,1}\\ U_{2,1}\\ U_{3,1}\\ U_{1,2}\\ U_{2,2}\\ U_{3,2}\\ U_{1,3}\\ U_{2,3}\\ U_{3,3}\end{pmatrix} =h^{2}\begin{pmatrix} f_{1,1}\\ f_{2,1}\\ f_{3,1}\\ f_{1,2}\\ f_{2,2}\\ f_{3,2}\\ f_{1,3}\\ f_{2,3}\\ f_{3,3}\end{pmatrix} \] The matrix \(A\) is set up as a sparse matrix for the following set of values



\(h\) | internal grid size (\(n=\frac{1}{h}-1\))
\(2^{-5}\) | \(31\times 31\)
\(2^{-6}\) | \(63\times 63\)
\(2^{-7}\) | \(127\times 127\)
\(2^{-8}\) | \(255\times 255\)
\(2^{-9}\) | \(511\times 511\)
\(2^{-10}\) | \(1023\times 1023\)


A function is written to evaluate \(f\left ( x,y\right ) \) at each internal grid point, reshaped into a column vector in the order shown above; then the direct solver is called.
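One common way to build this sparse \(A\) in Matlab is with Kronecker products; the following is a minimal sketch consistent with the ordering above (the function name lap2d_direct and the placeholder right-hand side are illustrative assumptions, not the appendix code):

function u = lap2d_direct(n)
% Solve A u = h^2 f on an n x n interior grid with zero Dirichlet BC,
% where A is the 5-point Laplacian with entries -4 and 1.
h = 1/(n+1);
e = ones(n,1);
T = spdiags([e -2*e e], -1:1, n, n);        % 1D second-difference block
A = kron(speye(n), T) + kron(T, speye(n));  % 2D 5-point Laplacian
[X,Y] = ndgrid(h*(1:n));                    % first index varies fastest,
                                            % matching the ordering above
f = X.*Y;                                   % placeholder; use the given f(x,y)
u = reshape(A \ (h^2*f(:)), n, n);          % sparse direct solve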

Next, the SOR solver is used for each of the above spacings. First \(\omega \) was calculated for each \(h\) using \(\omega _{opt}\approx 2\left ( 1-\pi h\right ) \) resulting in



\(h\) | \(\omega _{opt}\)
\(2^{-5}\) | \(1.8037\)
\(2^{-6}\) | \(1.9018\)
\(2^{-7}\) | \(1.9509\)
\(2^{-8}\) | \(1.9755\)
\(2^{-9}\) | \(1.9877\)
\(2^{-10}\) | \(1.9939\)
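These \(\omega \) values follow directly from the approximation; for instance:

h = 2.^-(5:10);
w_opt = 2*(1 - pi*h)   % 1.8037  1.9018  1.9509  1.9755  1.9877  1.9939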


Then the SOR solver which was written in problem (1) was called for each of the above cases. The next section shows the results obtained. The source code is in the appendix.

2.4.6.2 Result of computation

The following is an image of f(x,y) on the grid

pict
Figure 2.34:image of \(f(x,y)\)

And the solution obtained by direct solver on 2D is the following (implemented in Matlab and in Mathematica)

pict
Figure 2.35:Solution by direct solver, Matlab

pict
Figure 2.36:Solution by direct solver using Mathematica

The CPU results below are in seconds. In Matlab, cputime was used to measure the CPU time; for SOR this covered the whole iterative loop until convergence was achieved. In Mathematica, the Timing[] command, which measures CPU time, was used. These are the results obtained with Matlab 2010a and Mathematica 7.05.

In this table, the grid size \(n\) represents the number of internal grid points in one dimension. For example, for \(n=31\) the grid is \(31\times 31\). The number of nonzero elements shown relates to the storage used by the sparse matrix and was obtained in Matlab by calling nnz(A).









\(h\) | \(n\) | \(N\) number of unknowns | number of nonzero elements | Direct Solver CPU, Matlab | Direct Solver CPU, Mathematica | SOR Solver CPU | \(k\), number of SOR iterations
\(2^{-5}\) | \(31\) | \(961\) | \(4,681\) | \(0.015\) | \(0.016\) | \(0\) | \(68\)
\(2^{-6}\) | \(63\) | \(3,969\) | \(19,593\) | \(0.125\) | \(0.016\) | \(0.094\) | \(143\)
\(2^{-7}\) | \(127\) | \(16,129\) | \(80,137\) | \(0.250\) | \(0.063\) | \(0.6\) | \(306\)
\(2^{-8}\) | \(255\) | \(65,025\) | \(324,105\) | \(1.544\) | \(0.344\) | \(5.2\) | \(661\)
\(2^{-9}\) | \(511\) | \(261,121\) | \(1,303,561\) | \(5.538\) | \(1.90\) | \(48.9\) | \(1427\)
\(2^{-10}\) | \(1023\) | \(1,046,529\) | \(5,228,553\) | \(27.113\) | \(14.57\) | \(532\) | \(3088\)








These two plots illustrate the CPU time difference, on a normal scale and on a log scale (using the Matlab results only).

pict
Figure 2.37:problem 3 part (a), comparing CPU times on a normal scale

pict
Figure 2.38:problem 3 part (a), comparing CPU times on a log scale

Comments on results obtained

CPU performance for SOR is given by\[ \text{work}=\text{number of iterations}\times \text{work per iteration}\] The number of iterations depends on the constant used for the tolerance. Let \(k\) be the number of iterations, and let the tolerance be \(Ch^{2}\) where \(h\) is the spacing. Hence\[ k=\frac{\log \left ( Ch^{2}\right ) }{\log \rho _{sor}}=\frac{\log C+2\log h}{\log \left ( 1-2\pi h\right ) }\approx \frac{\log C+2\log h}{-2\pi h}=O\left ( h^{-1}\log h\right ) \] But \(h=O\left ( \frac{1}{n}\right ) \) where \(n\) is the number of grid points in one dimension. Therefore\[ k=O\left ( n\log n\right ) \] Since there are \(n^{2}\) unknowns, the work per iteration is \(O\left ( n^{2}\right ) \), hence the SOR work becomes\[ CPU_{sor}=O\left ( n^{3}\log n\right ) \] Expressing this in terms of the number of unknowns \(N=n^{2}\) gives\[ CPU_{sor}=O\left ( N^{\frac{3}{2}}\log N\right ) \] For the direct solver, the work is proportional to \(Nb\) where \(b\) is the bandwidth (when using the nested dissection method)6. The bandwidth is \(n\), hence for a direct solver in 2D using sparse matrices the cost is \(n^{3}\):\[ CPU_{direct}=O\left ( N^{\frac{3}{2}}\right ) \] In summary




method | CPU (in terms of number of unknowns \(N\)) | CPU in terms of \(n\)
SOR 2D | \(O\left ( N^{\frac{3}{2}}\log N\right ) \) | \(O\left ( n^{3}\log n\right ) \)
direct solver 2D | \(O\left ( N^{\frac{3}{2}}\right ) \) | \(O\left ( n^{3}\right ) \)



For a small number of unknowns, SOR was very competitive with the direct solver, but once the number of unknowns grew beyond roughly \(N\approx 100\), the direct solver became faster as the \(\log n\) factor started to weigh on SOR's performance. The results shown in the plots above confirm this performance analysis.

2.4.6.3 Part(b)

The goal is to solve\[ \frac{\partial ^{2}u}{\partial x^{2}}+\frac{\partial ^{2}u}{\partial y^{2}}+\frac{\partial ^{2}u}{\partial z^{2}}=-\exp (-(x-0.25)^{2}-(y-0.6)^{2}-z^{2}) \] on the unit cube. The following diagram was made to help set up the 3D scheme approximating the above PDE

pict
Figure 2.39:3D axis

The discrete approximation to the PDE can be written as\[ \frac{\partial ^{2}u}{\partial x^{2}}+\frac{\partial ^{2}u}{\partial y^{2}}+\frac{\partial ^{2}u}{\partial z^{2}}\approx \frac{1}{h^{2}}\left ( U_{i-1,j,k}+U_{i+1,j,k}+U_{i,j-1,k}+U_{i,j+1,k}+U_{i,j,k-1}+U_{i,j,k+1}-6U_{i,j,k}\right ) \] Hence the SOR scheme becomes\[ U_{i,j,k}^{\left [ m+1\right ] }=\frac{\omega }{6}\left ( U_{i-1,j,k}^{\left [ m+1\right ] }+U_{i+1,j,k}^{\left [ m\right ] }+U_{i,j-1,k}^{\left [ m+1\right ] }+U_{i,j+1,k}^{\left [ m\right ] }+U_{i,j,k-1}^{\left [ m+1\right ] }+U_{i,j,k+1}^{\left [ m\right ] }-h^{2}f_{i,j,k}\right ) +\left ( 1-\omega \right ) U_{i,j,k}^{\left [ m\right ] }\] (the iteration counter is written \(m\) here to avoid clashing with the grid index \(k\)). For the direct solver, the \(A\) matrix needs to be formulated. From\[ \frac{1}{h^{2}}\left ( U_{i-1,j,k}+U_{i+1,j,k}+U_{i,j-1,k}+U_{i,j+1,k}+U_{i,j,k-1}+U_{i,j,k+1}-6U_{i,j,k}\right ) =f_{i,j,k}\] solving for \(U_{i,j,k}\) results in\[ U_{i,j,k}=\frac{1}{6}\left ( U_{i-1,j,k}+U_{i+1,j,k}+U_{i,j-1,k}+U_{i,j+1,k}+U_{i,j,k-1}+U_{i,j,k+1}-h^{2}f_{i,j,k}\right ) \] To help construct the \(A\) matrix, an example with \(n=2\) is used; the following diagram shows the standard numbering on each node
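A minimal Matlab sketch of the corresponding SOR sweep (in-place updates supply the \([m+1]\) values; array names are illustrative):

% One SOR sweep in 3D. U is (n+2)^3 with a zero boundary shell;
% F holds f at the n^3 interior points.
for i = 2:n+1
  for j = 2:n+1
    for k = 2:n+1
      U(i,j,k) = (w/6)*( U(i-1,j,k)+U(i+1,j,k) ...
                       + U(i,j-1,k)+U(i,j+1,k) ...
                       + U(i,j,k-1)+U(i,j,k+1) ...
                       - h^2*F(i-1,j-1,k-1) ) + (1-w)*U(i,j,k);
    end
  end
end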

pict
Figure 2.40:3d grid example

By traversing the grid, left to right, then inwards into the paper, then upwards, the following \(A\) matrix results

pict
Figure 2.41:repeating A structure, n=2

One can see the recursive pattern in these \(A\) matrices. Each \(A\) matrix contains on its diagonal a block which repeats \(n\) times. Each block in turn contains, on its own diagonal, a smaller block which also repeats \(n\) times.

It is easier to see the pattern of building \(A\) by numbering the grid points in the order they are visited; this makes the connection between each grid point and the others much easier to see. For example, for \(n=2\),

pict
Figure 2.42:3d grid n2 numbers

One can now see the connections more easily. Grid point \(1\) is connected only to points \(2,3,5\). This means that in the first row of the \(A\) matrix there will be a \(1\) at columns \(2,3,5\). Similarly, point \(2\) is connected only to \(1,4,6\), so in the second row there will be a \(1\) at columns \(1,4,6\). Extending the number of points to \(n=3\) to better see the pattern of \(A\) results in

pict
Figure 2.43:3d grid n3 numbers

From the above, one can see clearly that, for example, point \(1\) is connected only to \(2,4,10\) and \(2\) is connected to \(1,3,5,11\) and so on. The above shows that each point will have a connection to a point which is numbered \(n^{2}\) higher than the grid point itself. \(n^{2}\) is the size of the grid in each surface. Hence, the general \(A\) matrix, for the above example, can now be written as

pict
Figure 2.44:A structure, 3D

One can see the recursive structure again. There are \(n=3\) main repeating blocks on the diagonal, and each one of them in turn has \(n=3\) repeating blocks on its own diagonal. Here \(n=3\), the number of grid points along one dimension.

Now that the \(A\) structure is understood, the Matlab code for filling the sparse matrix is modified for the 3D case as follows
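The following is a minimal Kronecker-product sketch of such a construction (the function name lap3d is an illustrative assumption, not necessarily the code in the appendix):

function A = lap3d(n)
% Sparse 7-point Laplacian (entries -6 and 1) on an n x n x n interior
% grid with zero Dirichlet BC, traversing x first, then y, then z.
I = speye(n);
e = ones(n,1);
T = spdiags([e -2*e e], -1:1, n, n);   % 1D second-difference block
A = kron(I, kron(I, T)) ...            % x-direction couplings
  + kron(I, kron(T, I)) ...            % y-direction couplings
  + kron(T, kron(I, I));               % z-direction couplings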

To test, for example, for \(n=2\)
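For instance, with the sketch above:

full(lap3d(2))   % 8 x 8: -6 on the diagonal, and 1s linking grid point 1
                 % to points 2, 3 and 5, matching the diagram above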

Using the above function, the solution was found with the direct solver.

Result of computation

The results for the 3D solver are as follows. In this table, \(n\) represents the number of grid points in one dimension; hence \(n=10\) represents a 3D grid of \(10\times 10\times 10\) points. The number of nonzero elements relates to the sparse matrix used by the direct solver and was obtained with the Matlab call nnz(A).









\(h\) | \(n\) | \(N\) total unknowns (\(n^{3}\)) | make sparse CPU | A\(\backslash \)f CPU | total Direct Solver CPU | total SOR Solver CPU | SOR iterations
\(0.090909\) | \(10\) | \(1,000\) | \(0.047\) | \(0\) | \(0.047\) | \(0\) | \(23\)
\(0.047619\) | \(20\) | \(8,000\) | \(0.062\) | \(0.44\) | \(0.502\) | \(0.078\) | \(44\)
\(0.032258\) | \(30\) | \(27,000\) | \(0.016\) | \(3.90\) | \(3.90\) | \(0.405\) | \(65\)
\(0.027778\) | \(35\) | \(42,875\) | \(0.359\) | \(8.70\) | \(8.80\) | \(0.75\) | \(77\)
\(0.024390\) | \(40\) | \(64,000\) | \(0.328\) | \(21.20\) | \(21.50\) | \(1.29\) | \(88\)
\(0.021739\) | \(45\) | \(91,125\) | \(0.296\) | \(39.80\) | \(40.00\) | \(2.11\) | \(100\)
\(0.019608\) | \(50\) | \(125,000\) | \(0.624\) | \(84.20\) | \(84.80\) | \(3.24\) | \(112\)
\(0.017857\) | \(55\) | \(166,375\) | \(0.421\) | \(157.30\) | \(157.70\) | \(4.9\) | \(125\)
\(0.016393\) | \(60\) | \(216,000\) | \(0.889\) | \(244.10\) | \(244.20\) | \(7.17\) | \(138\)








For the direct solver, Matlab ran out of memory at \(n=65\).

This plot illustrates the CPU times from the table above on a log scale

pict
Figure 2.45:problem 3 part (b), comparing CPU times on a log scale

Comments on results obtained

CPU performance for SOR is given by\[ \text{work}=\text{number of iterations}\times \text{work per iteration}\] The number of iterations depends on the constant used for the tolerance. Let \(k\) be the number of iterations, and let the tolerance be \(Ch^{2}\) where \(h\) is the spacing. Hence\[ k=\frac{\log \left ( Ch^{2}\right ) }{\log \rho _{sor}}=\frac{\log C+2\log h}{\log \left ( 1-2\pi h\right ) }\approx \frac{\log C+2\log h}{-2\pi h}=O\left ( h^{-1}\log h\right ) \] But \(h=O\left ( \frac{1}{n}\right ) \) where \(n\) is the number of grid points in one dimension. Hence\[ k=O\left ( n\log n\right ) \] Since there are \(n^{3}\) unknowns (compared to \(n^{2}\) in 2D), the work per iteration is \(O\left ( n^{3}\right ) \), hence the SOR performance becomes\[ CPU_{sor}=O\left ( n^{4}\log n\right ) \] Expressing this in terms of the number of unknowns \(N=n^{3}\) gives\[ CPU_{sor}=O\left ( N^{\frac{4}{3}}\log N\right ) \] For the direct solver, the work is proportional to \(Nb\) where \(b\) is the bandwidth (when using the nested dissection method)7. The bandwidth is \(n^{2}\) in this case, not \(n\) as in 2D. Hence the total cost is\begin{align*} CPU_{direct} & =O\left ( N\times n^{2}\right ) \\ & =O\left ( n^{5}\right ) \\ & =O\left ( N^{\frac{5}{3}}\right ) \end{align*}

Hence




method | CPU (in terms of \(N\)) | CPU in terms of \(n\)
SOR 3D | \(O\left ( N^{\frac{4}{3}}\log N\right ) \) | \(O\left ( n^{4}\log n\right ) \)
direct solver 3D | \(O\left ( N^{\frac{5}{3}}\right ) \) | \(O\left ( n^{5}\right ) \)



The above shows that SOR has the better asymptotic performance in 3D. The results shown in the plots above confirm this analytical prediction, with SOR faster in the actual runs. To verify this, the complexity cost estimates above were evaluated over a sequence of \(n\) values to see whether the resulting shape matches the curves obtained from the actual runs. The following plot shows the result

pict
Figure 2.46:verify cost of 3D

The following Matlab code was used in part to generate the above

n = [10 20 30 35 40 45 50 55 60 65];       % grid sizes used in the runs
plot(n, log10(n.^5), 'ro', ...             % direct solver cost, O(n^5)
     n, log10(n.^4.*log10(n)), '*');       % SOR cost, O(n^4 log n)

It can be seen that the cost curves match those produced by the actual runs, up to a scaling factor, as expected.

Therefore one can conclude that in 3D, SOR is faster than the direct solver. This result was surprising, as the expectation was that the direct solver would be faster than SOR in 3D, as it was in 2D. Attempts were made to find any errors in the code that could explain this, and none were found.

2.4.7 Problem 4

   2.4.7.1 part(a)
   2.4.7.2 part (b)
   2.4.7.3 Part(c)

pict
Figure 2.47:Problem 4

Periodic boundary conditions mean that the solution must satisfy \(u^{\prime }\left ( 0\right ) =u^{\prime }\left ( 1\right ) \) and \(u\left ( 0\right ) =u\left ( 1\right ) \). As an example, the following is a solution to \(u^{\prime \prime }\left ( x\right ) =f\left ( x\right ) \) with periodic boundary conditions, shown for illustration

pict
Figure 2.48:example periodic BC
2.4.7.1 part(a)

Using the standard numbering system

pict
Figure 2.49:problem 4 part a scheme

In the above diagram, \(u_{0}\) represents \(u\) at \(x=0\) and \(u_{N+1}\) represents \(u\) at \(x=1\). The 3 point discrete Laplacian for 1-D at \(x_{0}\) is given by \begin{equation} u_{0}^{\prime \prime }=\frac{u_{-1}-2u_{0}+u_{1}}{h^{2}} \tag{1} \end{equation} where \(x_{-1}\) is an imaginary grid point to the left of \(x_{0}\) in the diagram above.

Expanding \(u_{-1}\) about \(u_{0}\) by Taylor results in \(u_{-1}=u_{0}-hu_{0}^{\prime }\,,\) hence \begin{equation} u_{0}^{\prime }=\frac{u_{0}-u_{-1}}{h}\tag{2} \end{equation} Similarly, Taylor expansion of \(u_{N}\) about \(u_{N+1}\) results in \[ u_{N}=u_{N+1}-hu_{N+1}^{\prime }\] Hence \begin{equation} u_{N+1}^{\prime }=\frac{u_{N+1}-u_{N}}{h}\tag{3} \end{equation} But \(u_{0}^{\prime }=u_{N+1}^{\prime }\) from the boundary conditions, hence (2)=(3), which results in\[ \frac{u_{0}-u_{-1}}{h}=\frac{u_{N+1}-u_{N}}{h}\] Solving for \(u_{-1}\) from the above gives\[ u_{-1}=u_{0}+u_{N}-u_{N+1}\] But \(u_{N+1}=u_{0},\) also from the boundary conditions, hence the above results in\[ u_{-1}=u_{N}\] Using the above value of \(u_{-1}\) in (1) gives\[ u_{0}^{\prime \prime }=\frac{u_{N}-2u_{0}+u_{1}}{h^{2}}\] Similarly, the derivation for \(u_{N+1}^{\prime \prime }\) results in\[ u_{N+1}^{\prime \prime }=\frac{u_{N}-2u_{N+1}+u_{1}}{h^{2}}\] For every other internal grid point \(i=1\cdots N\), the standard 3-point central difference is used\[ u_{i}^{\prime \prime }=\frac{1}{h^{2}}\left ( u_{i-1}-2u_{i}+u_{i+1}\right ) \] Therefore, the following set of equations is obtained\[\begin{array} [c]{lll}\frac{1}{h^{2}}\left ( u_{N}-2u_{0}+u_{1}\right ) =f_{0} & & i=0\\ \frac{1}{h^{2}}\left ( u_{i-1}-2u_{i}+u_{i+1}\right ) =f_{i} & & i=1\cdots N\\ \frac{1}{h^{2}}\left ( u_{N}-2u_{N+1}+u_{1}\right ) =f_{N+1} & & i=N+1 \end{array} \] And the system can now be put in the form \(Au=f\), resulting in\begin{equation} \frac{1}{h^{2}}\begin{pmatrix} -2 & 1 & 0 & 0 & \cdots & 1 & 0\\ 1 & -2 & 1 & 0 & \cdots & 0 & 0\\ 0 & 1 & -2 & 1 & 0 & \cdots & 0\\ 0 & 0 & 1 & -2 & 1 & 0 & \vdots \\ 0 & 0 & 0 & \cdots & \ddots & \cdots & 0\\ 0 & 0 & 0 & \cdots & 1 & -2 & 1\\ 0 & 1 & 0 & \cdots & 0 & 1 & -2 \end{pmatrix}\begin{pmatrix} u_{0}\\ u_{1}\\ u_{2}\\ u_{3}\\ \vdots \\ u_{N}\\ u_{N+1}\end{pmatrix} =\begin{pmatrix} f_{0}\\ f_{1}\\ f_{2}\\ f_{3}\\ \vdots \\ f_{N}\\ f_{N+1}\end{pmatrix} \tag{4} \end{equation} The above \(A\) matrix is singular, since \(A\mathbf{1}=\mathbf{0}\) where \(\mathbf{1}\) is the vector of all ones. Hence the null space of \(A\) contains a nonzero vector, meaning that \(A\) is singular.
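A quick numeric check of the matrix in (4) and of the claim that the vector of all ones lies in its null space (a sketch; the function name lap1d_periodic is illustrative):

function A = lap1d_periodic(m)
% The m x m matrix of (4) scaled by h^2: -2 on the diagonal, 1 on the
% off-diagonals, plus corner entries coupling u_0 to u_N and u_{N+1} to u_1.
e = ones(m,1);
A = spdiags([e -2*e e], -1:1, m, m);
A(1, m-1) = 1;   % row for u_0 picks up u_N
A(m, 2)   = 1;   % row for u_{N+1} picks up u_1

% usage: norm(full(lap1d_periodic(8))*ones(8,1)) returns 0,
% confirming that the all-ones vector is in the null space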

To determine the dimension of the null space, the rank of \(A\) is determined. Removing the last column and the last row of \(A\) results in an \(n-1\) by \(n-1\) matrix\[ A_{n-1}=\begin{pmatrix} -2 & 1 & 0 & 0 & \cdots & 1\\ 1 & -2 & 1 & 0 & \cdots & 0\\ 0 & 1 & \ddots & 1 & 0 & \cdots \\ 0 & 0 & 1 & -2 & 1 & 0\\ 0 & 0 & 0 & 1 & -2 & 1\\ 0 & 0 & 0 & \cdots & 1 & -2 \end{pmatrix} \] The square matrix inside \(A_{n-1}\) that extends from the first row to one row before the last is of size \(n-2\)\[ A_{n-2}=\begin{pmatrix} -2 & 1 & 0 & 0 & \cdots \\ 1 & -2 & 1 & 0 & \cdots \\ 0 & 1 & \ddots & 1 & 0\\ 0 & 0 & 1 & -2 & 1\\ 0 & 0 & 0 & 1 & -2 \end{pmatrix} \] And this matrix is of full rank, as it is the \(A\) matrix used for 1-D with Dirichlet boundary conditions, which is known to be invertible (the same one used in HW2).

Therefore, the rank of \(A\) cannot be less than \(n-2\), where \(n\) is the size of \(A\).

In other words, the dimension of the null space of \(A\) is at most \(2\). For the null space to have dimension exactly one, the matrix \(A_{n-1}\) shown above must be invertible as well.

One way to show that \(A_{n-1}\) is invertible is to show that the last column of \(A_{n-1}\) is linearly independent of the remaining columns of \(A_{n-1}\).

The last column of \(A_{n-1}\) is \(c_{n-1}=\begin{pmatrix} 1\\ 0\\ \vdots \\ 0\\ 1\\ -2 \end{pmatrix} \), and this column is linearly independent of the first column \(c_{1}=\begin{pmatrix} -2\\ 1\\ 0\\ \vdots \\ 0\\ 0 \end{pmatrix} \), since \(a\,c_{n-1}+b\,c_{1}=0\) only when \(a,b\) are both zero. The same can be shown for all the other columns of \(A_{n-1}\): each is linearly independent of the last column \(c_{n-1}\). Since the other \(n-2\) columns of \(A_{n-1}\) are linearly independent of each other (they make up the Dirichlet matrix, known to be invertible), \(A_{n-1}\) is invertible. This shows that the rank of \(A\) is \(n-1\); hence the null space of \(A\) has dimension one, spanned by the vector of all ones. (Since \(A\) is symmetric, the null space of the adjoint of \(A\) is the same.)

2.4.7.2 part (b)

To examine the conditions of solvability, the continuous case is considered first and the conditions are then translated to the discrete case.

The PDE \(u^{\prime \prime }\left ( x\right ) =f\left ( x\right ) \) with periodic boundary conditions has a zero eigenvalue (the boundary condition \(u^{\prime }\left ( 1\right ) =u^{\prime }\left ( 0\right ) \) produces it), hence\[ 0=\int _{0}^{1}f\left ( x\right ) dx \] is the solvability condition that results from the \(u^{\prime }\left ( 1\right ) =u^{\prime }\left ( 0\right ) \) boundary condition (the same argument carried out in part (d) of problem 1 in HW1, which had Neumann boundary conditions). Now, what solvability conditions does \(u\left ( 0\right ) =u\left ( 1\right ) \) add, if any?

Since\[ u^{\prime \prime }\left ( x\right ) =f\left ( x\right ) \] integrating once over the interval gives\[ u^{\prime }\left ( 1\right ) -u^{\prime }\left ( 0\right ) =\int _{0}^{1}f\left ( x\right ) dx \] and since \(u^{\prime }\left ( 1\right ) =u^{\prime }\left ( 0\right ) \), this forces \(\int _{0}^{1}f\left ( x\right ) dx=0\). Integrating the PDE twice produces a condition on \(u\left ( 1\right ) -u\left ( 0\right ) \), which is satisfied automatically since \(u\left ( 1\right ) -u\left ( 0\right ) =0\); no new condition is added. So the only solvability condition, resulting from the zero eigenvalue, is\[ \int _{0}^{1}f\left ( x\right ) dx=0 \] This is the same as in the case with Neumann boundary conditions. In the discrete case, the solvability condition becomes the discrete integral (a Riemann sum)\[ h{\displaystyle \sum \limits _{j=0}^{j=N+1}} f\left ( jh\right ) =0 \] For 2D, by extension of the above, there will be two zero eigenvalues, hence the discrete solvability condition becomes\[ h^{2}{\displaystyle \sum \limits _{i=0}^{i=N+1}} \left ({\displaystyle \sum \limits _{j=0}^{j=N+1}} f\left ( ih,jh\right ) \right ) =0 \]
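As a sketch of the discrete check in code (assuming \(f\) is sampled on the full grid, endpoints included):

N = 100; h = 1/(N+1);
x = (0:N+1)'*h;                    % grid points 0, h, ..., 1
f = sin(2*pi*x);                   % a zero-mean periodic example
solvable = abs(h*sum(f)) < 1e-12   % discrete integral must vanish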

2.4.7.3 Part(c)

Since this is an iff statement, two directions need to be shown:

1. If \(v\) is in the null space of \(A\), then \(v\) is an eigenvector of \(T\) with eigenvalue \(1\).
2. If \(v\) is an eigenvector of \(T\) with eigenvalue \(1\), then \(v\) is in the null space of \(A\).

Solving part(1)

Since \(v\) is in the null space of \(A\), by definition\[ A\mathbf{v}=\mathbf{0}\] But \(A=M-N\), hence the above becomes\begin{align*} \left ( M-N\right ) \mathbf{v} & =\mathbf{0}\\ M\mathbf{v}-N\mathbf{v} & =\mathbf{0} \end{align*}

Since \(M\) is invertible by definition, \(M^{-1}\) exists. Premultiplying both sides by \(M^{-1}\) gives\[ M^{-1}M\mathbf{v}-M^{-1}N\mathbf{v}=M^{-1}\mathbf{0}\] But \(M^{-1}\mathbf{0}=\mathbf{0}\), so the above becomes\begin{align*} I\mathbf{v}-M^{-1}N\mathbf{v} & =0\\ M^{-1}N\mathbf{v} & =\mathbf{v}\\ T\mathbf{v} & =\mathbf{v} \end{align*}

Therefore \(\mathbf{v}\) is an eigenvector of \(T\) with an eigenvalue of 1.

Solving part(2)

Since \(\mathbf{v}\) is an eigenvector of \(T\,\) with eigenvalue 1 then\[ T\mathbf{v}=\lambda \mathbf{v}\] With \(\lambda =1\), and since \(T=M^{-1}N\), then the above becomes\[ M^{-1}N\mathbf{v}=\mathbf{v}\] Multiply both sides by \(M\)\[ N\mathbf{v}=M\mathbf{v}\] Therefore\begin{align*} M\mathbf{v-}N\mathbf{v} & \mathbf{=0}\\ \left ( M-N\right ) \mathbf{v} & =\mathbf{0} \end{align*}

Hence\[ A\mathbf{v}=\mathbf{0}\] Therefore \(\mathbf{v}\) is in the null space of \(A.\)
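This can also be verified numerically using the periodic matrix from part (a) with a Jacobi splitting (a sketch; lap1d_periodic as defined in the earlier sketch):

A = full(lap1d_periodic(8));
M = diag(diag(A));       % Jacobi splitting: M = D, so N = M - A
N = M - A;
T = M \ N;               % iteration matrix T = M^{-1} N
v = ones(8,1);           % null vector of A from part (a)
norm(T*v - v)            % 0, so v is an eigenvector of T with eigenvalue 1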

2.4.8 References

1. Applied Mathematics by David Logan, chapter 8

2.4.9 Source code