May 3, 2007. Compiled on September 9, 2023 at 7:22pm

This note compares the results of computing the numerical derivative of \(\arctan \left ( x\right ) \) at \(x=\sqrt {2}\) using a
Taylor (forward difference) approximation in single precision and in double precision floating point. This was
done using Matlab. In Matlab, single precision computation is done using
the single command. By default, Matlab does all computations in double
precision.

The approximation used is \(f^{\prime }\left ( x\right ) \approx \frac {1}{h}\left ( f\left ( x+h\right ) -f\left ( x\right ) \right ) \) with \(h\) starting at \(1\) and halved at each iteration.
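
The approximation follows from Taylor's theorem. A brief sketch, assuming \(f\) is twice continuously differentiable near \(x\) (the intermediate point \(\xi\) is introduced here for illustration only):

\[ f\left ( x+h\right ) =f\left ( x\right ) +hf^{\prime }\left ( x\right ) +\frac {h^{2}}{2}f^{\prime \prime }\left ( \xi \right ) ,\qquad \xi \in \left ( x,x+h\right ) \]

\[ \frac {f\left ( x+h\right ) -f\left ( x\right ) }{h}-f^{\prime }\left ( x\right ) =\frac {h}{2}f^{\prime \prime }\left ( \xi \right ) \]

In exact arithmetic the error is therefore proportional to \(h\), so it should roughly halve each time \(h\) is halved.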

The exact value of \(\frac {d\arctan \left ( x\right ) }{dx}=\frac {1}{1+x^{2}}\) evaluated at \(x=\sqrt {2}\) is \(1/3.\) The results below show that, using single precision, the
numerical derivative keeps getting closer to the exact answer up to iteration 12. The best
answer is accurate to 4 decimal places. After iteration 12, subtractive cancellation (loss
of significance, L.O.S.) becomes dominant, and the result starts to become less
accurate.
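
The cancellation can be seen directly in a short Matlab sketch. It assumes a step of h = 2^-20, chosen here purely for illustration and not taken from the results tables. The variable names ulp and n are also introduced only for this example.

```matlab
% Illustration of subtractive cancellation in single precision.
% Assumption: h = 2^-20 is an arbitrary small step picked for this sketch.
X  = single(sqrt(2));
h  = single(2^-20);
F1 = atan(X);               % single-precision function value at X
F2 = atan(X + h);           % single-precision function value at X+h
d  = F2 - F1;               % difference of two nearly equal numbers
ulp = eps(F1);              % spacing between single values near F1
n   = double(d) / double(ulp);   % the difference measured in spacings
r   = d / h;                % forward difference estimate of the derivative
fprintf('difference = %d spacings, estimate = %.8f, exact = %.8f\n', ...
        n, r, 1/3);
```

The true difference is about h/3, which is only around five spacings of single precision near atan(sqrt(2)). The computed d is a whole number of spacings, so it carries just a few significant bits, and dividing by h cannot restore the lost digits.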

Using double precision, we can go up to iteration 27 before loss of significance kicks
in. The best numerical result at this point is accurate to 8 decimal places. Hence double precision gives
twice as many correct decimal places as single precision.
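
The two turning points are consistent with a common rule of thumb that the forward difference is most accurate when h is of the order of the square root of the machine epsilon. The sketch below applies that rule as an assumption: it ignores constant factors and is not derived from the tables. The names es, ed, ks and kd are hypothetical.

```matlab
% Rule-of-thumb estimate of the best step size (an assumption, constants ignored):
% h is taken to be about sqrt(machine epsilon).
es = double(eps('single'));   % machine epsilon in single precision, 2^-23
ed = eps('double');           % machine epsilon in double precision, 2^-52
hs = sqrt(es);                % estimated best h in single precision
hd = sqrt(ed);                % estimated best h in double precision
% Iteration k uses h = 2^-(k-1), so k = 1 - log2(h).
ks = 1 - log2(hs);
kd = 1 - log2(hd);
fprintf('single: h ~ %.3g, iteration ~ %.1f\n', hs, ks);
fprintf('double: h ~ %.3g, iteration ~ %.1f\n', hd, kd);
```

This gives iterations of about 12.5 and 27, in line with the turning points reported above. Since the error at the best step is also of the order of the square root of epsilon, the estimate agrees with double precision giving about twice as many correct digits.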

The following diagram displays the results table for single precision, together with the Matlab code
used, with a red box around the line where the numerical results start to be affected by
L.O.S.

The following diagram displays the results table for double precision, with a red box around the
line where the numerical results start to be affected by L.O.S. The Matlab code is
the same as before, except that we simply remove the command single wherever it was
used.
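
As an alternative to keeping two copies of the code, the precision can be passed as a class name. The following is a minimal sketch of that idea, not the code used to produce the tables. The iteration count M = 40 and the names precs, err and best are assumptions made for illustration.

```matlab
% Run the same forward difference loop in both precisions.
% Assumptions: M = 40 iterations; precs, err and best are illustrative names.
M = 40;
precs = {'single', 'double'};
best = zeros(1, numel(precs));
for p = 1:numel(precs)
    prec = precs{p};
    h  = cast(1, prec);            % step size, starts at 1
    X  = cast(sqrt(2), prec);      % point of evaluation
    F1 = atan(X);                  % f(X), computed in the chosen precision
    err = zeros(M, 1);
    for k = 1:M
        r = (atan(X + h) - F1) / h;        % forward difference estimate
        err(k) = abs(double(r) - 1/3);     % absolute error against 1/3
        h = h / 2;                         % halve the step
    end
    [~, best(p)] = min(err);       % iteration with the smallest error
    fprintf('%s: smallest error %.3g at iteration %d\n', ...
            prec, err(best(p)), best(p));
end
```

Because the inputs are created with cast, every intermediate value stays in the chosen class, which has the same effect as adding or removing the single calls by hand.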

Source code listing

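%Nasser M. Abbasi. Do computation using 32 bit
%Computing derivative of arctan(x) at x=sqrt(2) as a function
%of changing h in Taylor approximation.
h = single(1);                 % initial step size
M = 26;                        % number of iterations
X = single(sqrt(2));           % point of evaluation
f = @(x) single(atan(x));
F1 = f(X);
S = zeros(M,6,'single');       % one row per iteration
for k = 1:M
    F2 = f(X+h);
    d = single(F2-F1);         % difference, subject to cancellation
    r = single(d/h);           % forward difference estimate
    S(k,1) = k;
    S(k,2) = h;
    S(k,3) = F2;
    S(k,4) = F1;
    S(k,5) = d;
    S(k,6) = r;
    h = single(h/2);           % halve the step
end
format long g
S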