summaryrefslogtreecommitdiffstats
path: root/scilab_doc
diff options
context:
space:
mode:
authorMichaël Baudin <michael.baudin@scilab.org>2009-02-16 15:32:53 +0100
committerMichaël Baudin <michael.baudin@scilab.org>2009-02-16 15:32:53 +0100
commita4169cafbc9a861e5b309249fc37dab0aa1277f2 (patch)
tree91224e67cdc11f816b8860b9b0a0616926d5cfb6 /scilab_doc
parenta2ad12e7c446410387c70bf418f098cdd0aae17a (diff)
downloadscilab-a4169cafbc9a861e5b309249fc37dab0aa1277f2.zip
scilab-a4169cafbc9a861e5b309249fc37dab0aa1277f2.tar.gz
First draft of -Scilab is not naive-
Diffstat (limited to 'scilab_doc')
-rw-r--r--scilab_doc/scilabisnotnaive/complexdivision.tex882
-rw-r--r--scilab_doc/scilabisnotnaive/introduction.tex25
-rw-r--r--scilab_doc/scilabisnotnaive/numericalderivative.tex44
-rw-r--r--scilab_doc/scilabisnotnaive/quadratic.tex6
-rw-r--r--scilab_doc/scilabisnotnaive/scilabisnotnaive.bib12
-rw-r--r--scilab_doc/scilabisnotnaive/scilabisnotnaive.pdfbin264460 -> 294290 bytes
6 files changed, 464 insertions, 505 deletions
diff --git a/scilab_doc/scilabisnotnaive/complexdivision.tex b/scilab_doc/scilabisnotnaive/complexdivision.tex
index 0d726d1..a38738f 100644
--- a/scilab_doc/scilabisnotnaive/complexdivision.tex
+++ b/scilab_doc/scilabisnotnaive/complexdivision.tex
@@ -1,33 +1,26 @@
1\section{Complex division} 1\section{Complex division}
2 2
3Dans cette partie, nous analysons le problème de la division complexe 3In that section, we analyse the problem of the complex division in Scilab.
4dans Scilab. Nous mettons en lumière pourquoi le traitement numérique interne 4We especially detail the difference between the mathematical, straitforward
5de Scilab est parfois inutilisé et pourquoi il est parfois inutile, voire 5formula and the floating point implementation. In the first part, we briefly report
6nuisible (rappelons que le but d'une introduction est d'attirer le lecteur 6the formulas which allow to
7vers la suite du texte, raison pour laquelle nous avons rédigé ces mots 7compute the real and imaginary parts of the division of two complex numbers.
8les plus provocants possible !). 8We then present the naïve algorithm based on these mathematical formulas.
9 9In the second part, we make some experiments in Scilab and compare our
10Nous détaillons en particulier la différence entre 10naïve algorithm with the \emph{/} Scilab operator.
11définition mathématique et implémentation en nombres flottants. 11In the third part, we analyse
12Nous montrons comment la division de nombres complexes est 12why and how floating point numbers must be taken into account when the
13effectué dans Scilab lorsque l'opérateur "/" est utilisé. 13implementation of such division is required.
14Nous montrons également que l'implémentation n'est pas
15utilisé de manière consistante dans Scilab.
16Nous montrerons enfin que les librairies utilisées dans
17les compilateurs Intel et gfortran traitent le problème.
18 14
19\subsection{Theory} 15\subsection{Theory}
20 16
21\subsubsection{Algebraic computations} 17The formula which allows to compute the
22 18real and imaginary parts of the division of two
23Il est de notoriété publique que la formule mathématique 19complex numbers is
24qui permet de calculer la division entre deux nombres complexes
25\begin{eqnarray} 20\begin{eqnarray}
26\frac{a + ib}{c + id} = \frac{ac + bd}{c^2 + d^2} + i \frac{bc - ad}{c^2 + d^2} 21\frac{a + ib}{c + id} = \frac{ac + bd}{c^2 + d^2} + i \frac{bc - ad}{c^2 + d^2}
27\end{eqnarray} 22\end{eqnarray}
28 23
29\subsubsection{Naive algorithm}
30
31The naive algorithm for the computation of the complex division 24The naive algorithm for the computation of the complex division
32is presented in figure \ref{naive-complexdivision}. 25is presented in figure \ref{naive-complexdivision}.
33 26
@@ -43,433 +36,321 @@ is presented in figure \ref{naive-complexdivision}.
43 36
44 37
45\subsection{Experiments} 38\subsection{Experiments}
46n'est pas robuste quand on utilise des nombres flottants \cite{368661}.
47En résumé, le problème arrive lorsque les nombres a, b, c ou d s'approchent
48du domaine de définition des nombres flottants, ce qui peut provoquer
49un overflow ou un underflow.
50
51Supposons que l'on traite la division suivante
52
53\begin{eqnarray}
54\frac{1 + I * 1}{1 + I * 1e308} = 1e308 - I * 1e-308
55\end{eqnarray}
56
57Utilisons la formule mathématique, naïve, pour vérifier cette division.
58
59\begin{eqnarray}
60den &=& c² + d² = 1² + (1e308)² = 1 + 1e616 ~1e616 \\
61e &=& (ac + bd)/ den = (1*1 + 1*1e308)/1e616 ~ 1e308/1e616 ~ 1e-308\\
62f &=& (bc - ad)/ den = (1*1 - 1*1e308)/1e616 ~ -1e308/1e616 ~ -1e-308
63\end{eqnarray}
64
65Numériquement, les choses ne se passent pas ainsi, car vmax = 1e308 est la
66valeur maximale avant overflow. Plus précisément, vmax est tel que
67tout nombre v > vmax satisfait
68\begin{eqnarray}
69(v * 2) / 2 != v
70\end{eqnarray}
71Autrement dit, la multiplication et la division ne sont plus précises pour
72des nombres v > vmax.
73
74Si on utilise la formule naïve avec des nombres double precision, alors
75\begin{eqnarray}
76den = c² + d² = 1² + (1e308)² = Inf
77\end{eqnarray}
78c'est à dire un overflow. Les termes e et f sont alors calculés par
79\begin{eqnarray}
80e = (ac + bd)/ den = (1*1 + 1*1e308)/Inf = 1e308/Inf = 0\\
81f = (bc - ad)/ den = (1*1 - 1*1e308)/Inf = -1e308/Inf = 0
82\end{eqnarray}
83Le résultat est alors faux sur le plan mathématique, mais surtout, il
84est imprécis sur le plan numérique, dans la mesure où aucun des
85nombres initiaux n'était supérieur à vmax.
86
87On peut également montrer que la formule naïve peut générer
88des underflow.
89
90Supposons que l'on veuille calculer la division suivante :
91\begin{eqnarray}
92\frac{1 + I * 1}{1e-308 + I * 1e-308}= 1e308
93\end{eqnarray}
94
95Utilisons la formule mathématique, naïve, pour vérifier cette division.
96
97\begin{eqnarray}
98den &=& c² + d² = (1e-308)² + (1e-308)² = 1e-616 + 1e-616 = 2e-616 \\
99e &=& (ac + bd)/ den = (1*1e-308 + 1*1e-308)/2e-616 ~ 2e-308/2e-616 ~ 1e308 \\
100f &=& (bc - ad)/ den = (1*1e-308 - 1*1e-308)/2e-616 ~ 0/1e-616 ~ 0
101\end{eqnarray}
102 39
103Avec des nombres double precision, le calcul ne se passe pas ainsi. 40The following Scilab function is a straitforward implementation
104Le dénominateur va provoquer un underflow et va être numériquement 41of the previous formulas.
105mis à zéro, de telle sorte que
106 42
107\begin{eqnarray} 43\lstset{language=Scilab}
108den &=& c² + d² = (1e-308)² + (1e-308)² = 1e-616 + 1e-616 = 0 \\ 44\lstset{numbers=left}
109e &=& (ac + bd)/ den = (1*1e-308 + 1*1e-308)/0 ~ 2e-308/0 ~ Inf \\ 45\lstset{basicstyle=\footnotesize}
110f &=& (bc - ad)/ den = (1*1e-308 - 1*1e-308)/0 ~ 0/0 ~ NaN \\ 46\lstset{keywordstyle=\bfseries}
111\end{eqnarray} 47\begin{lstlisting}
112 48//
113\subsubsection{Scilab implementation} 49// naive --
114 50// Compute the complex division with a naive method.
115On peut expérimenter facilement dans la console Scilab v5.0.2, prenant soin 51//
116de choisir un cas test qui pose problème. 52function [cr,ci] = naive (ar , ai , br , bi )
53 den = br * br + bi * bi;
54 cr = (ar * br + ai * bi) / den;
55 ci = (ai * br - ar * bi) / den;
56endfunction
57\end{lstlisting}
58
59In the following script, one compares the naive implementation
60against the Scilab implementation with two cases.
61
62\lstset{language=Scilab}
63\lstset{numbers=left}
64\lstset{basicstyle=\footnotesize}
65\lstset{keywordstyle=\bfseries}
66\begin{lstlisting}
67 // Check that no obvious bug is in mathematical formula.
68 [cr ci] = naive ( 1.0 , 2.0 , 3.0 , 4.0 )
69 (1.0 + 2.0 * %i)/(3.0 + 4.0 * %i)
70 // Check that mathematical formula does not perform well
71 // when large number are used.
72 [cr ci] = naive ( 1.0 , 1.0 , 1.0 , 1.e307 )
73 (1.0 + 1.0 * %i)/(1.0 + 1.e307 * %i)
74\end{lstlisting}
75
76That prints out the following messages.
117 77
118\begin{verbatim} 78\begin{verbatim}
119a=1+%i*1 79--> // Check that no obvious bug is in mathematical formula.
120b=1+%i*1e308 80--> [cr ci] = naive ( 1.0 , 2.0 , 3.0 , 4.0 )
121a/b 81 ci =
82 0.08
83 cr =
84 0.44
85--> (1.0 + 2.0 * %i)/(3.0 + 4.0 * %i)
86 ans =
87 0.44 + 0.08i
88--> // Check that mathematical formula does not perform well
89--> // when large number are used.
90--> [cr ci] = naive ( 1.0 , 1.0 , 1.0 , 1.e307 )
91 ci =
92 0.
93 cr =
94 0.
95--> (1.0 + 1.0 * %i)/(1.0 + 1.e307 * %i)
96 ans =
97 1.000-307 - 1.000-307i
122\end{verbatim} 98\end{verbatim}
123 99
124Le test montre que l'implémentation dans Scilab correspond à celle 100The simple calculation confirms that there is no bug in the
125de Smith, puisque, dans ce cas, le résultat est correct : 101naive implementation. But differences are apprearing when
102large numbers are used. In the second case, the naive
103implementation does not give a single exact digit !
104
105To make more complete tests, the following script allows to
106compare the results of the naive and the Scilab methods.
107We use three kinds of relative errors
108\begin{enumerate}
109\item the relative error on the complex numbers, as a whole $e=\frac{|e - c|}{|e|}$,
110\item the relative error on the real part $e=\frac{|e_r - e_r|}{e_r}$,
111\item the relative error on the imaginary part $e=\frac{|e_i - e_i|}{e_i}$.
112\end{enumerate}
113
114\lstset{language=Scilab}
115\lstset{numbers=left}
116\lstset{basicstyle=\footnotesize}
117\lstset{keywordstyle=\bfseries}
118\begin{lstlisting}
119//
120// compare --
121// Compare 3 methods for complex division:
122// * naive method
123// * Smith method
124// * C99 method
125//
126function compare (ar, ai, br, bi, rr, ri)
127 printf("****************\n");
128 printf(" a = %10.5e + %10.5e * I\n" , ar , ai );
129 printf(" b = %10.5e + %10.5e * I\n" , br , bi );
130 [cr ci] = naive ( ar, ai, br, bi);
131 printf("Naive --> c = %10.5e + %10.5e * I\n" , cr , ci );
132 c = cr + %i * ci
133 r = rr + %i * ri;
134 error1 = abs(r - c)/abs(r);
135 if (rr==0.0) then
136 error2 = abs(rr - cr);
137 else
138 error2 = abs(rr - cr)/abs(rr);
139 end
140 if (ri==0.0) then
141 error3 = abs(ri - ci);
142 else
143 error3 = abs(ri - ci)/abs(ri);
144 end
145 printf(" e1 = %10.5e, e2 = %10.5e, e3 = %10.5e\n", error1, error2, error3);
146
147 a = ar + ai * %i;
148 b = br + bi * %i;
149 c = a/b;
150 cr = real(c);
151 ci = imag(c);
152 printf("Scilab --> c = %10.5e + %10.5e * I\n" , cr , ci );
153 c = cr + %i * ci
154 error1 = abs(r - c)/abs(r);
155 if (rr==0.0) then
156 error2 = abs(rr - cr);
157 else
158 error2 = abs(rr - cr)/abs(rr);
159 end
160 if (ri==0.0) then
161 error3 = abs(ri - ci);
162 else
163 error3 = abs(ri - ci)/abs(ri);
164 end
165 printf(" e1 = %10.5e, e2 = %10.5e, e3 = %10.5e\n", error1, error2, error3);
166endfunction
167\end{lstlisting}
168
169In the following script, we compare the naive and the Scilab
170implementations of the complex division with 4 couples of
171complex numbers. The first instruction "ieee(2)" configures the
172IEEE system so that Inf and Nan numbers are generated instead
173of Scilab error messages.
174
175\lstset{language=Scilab}
176\lstset{numbers=left}
177\lstset{basicstyle=\footnotesize}
178\lstset{keywordstyle=\bfseries}
179\begin{lstlisting}
180ieee(2);
181// Check that naive implementation does not have a bug
182ar = 1;
183ai = 2;
184br = 3;
185bi = 4;
186rr = 11/25;
187ri = 2/25;
188compare (ar, ai, br, bi, rr, ri);
189
190// Check that naive implementation is not robust with respect to overflow
191ar = 1;
192ai = 1;
193br = 1;
194bi = 1e307;
195rr = 1e-307;
196ri = -1e-307;
197compare (ar, ai, br, bi, rr, ri);
198
199// Check that naive implementation is not robust with respect to underflow
200ar = 1;
201ai = 1;
202br = 1e-308;
203bi = 1e-308;
204rr = 1e308;
205ri = 0.0;
206compare (ar, ai, br, bi, rr, ri);
207
208\end{lstlisting}
209
210The script then prints out the following messages.
126 211
127\begin{verbatim} 212\begin{verbatim}
128-->a=1+%i*1 213****************
129 a = 214 a = 1.00000e+000 + 2.00000e+000 * I
130 215 b = 3.00000e+000 + 4.00000e+000 * I
131 1. + i 216Naive --> c = 4.40000e-001 + 8.00000e-002 * I
132 217 e1 = 0.00000e+000, e2 = 0.00000e+000, e3 = 0.00000e+000
133-->b=1+%i*1e308 218Scilab --> c = 4.40000e-001 + 8.00000e-002 * I
134 b = 219 e1 = 0.00000e+000, e2 = 0.00000e+000, e3 = 0.00000e+000
135 220****************
136 1. + 1.000+308i 221 a = 1.00000e+000 + 1.00000e+000 * I
137 222 b = 1.00000e+000 + 1.00000e+307 * I
138-->a/b 223Naive --> c = 0.00000e+000 + -0.00000e+000 * I
139 ans = 224 e1 = 1.00000e+000, e2 = 1.00000e+000, e3 = 1.00000e+000
140 225Scilab --> c = 1.00000e-307 + -1.00000e-307 * I
141 1.000-308 - 1.000-308i 226 e1 = 2.09614e-016, e2 = 1.97626e-016, e3 = 1.97626e-016
227****************
228 a = 1.00000e+000 + 1.00000e+000 * I
229 b = 1.00000e-308 + 1.00000e-308 * I
230Naive --> c = Inf + Nan * I
231 e1 = Nan, e2 = Inf, e3 = Nan
232Scilab --> c = 1.00000e+308 + 0.00000e+000 * I
233 e1 = 0.00000e+000, e2 = 0.00000e+000, e3 = 0.00000e+000
142\end{verbatim} 234\end{verbatim}
143 235
144Si on effectue le calcul pas à pas, on trouve que le 236The case \#2 and \#3 shows very surprising results.
145code utilisé correspond au code source de "wwdiv" du module 237With case \#2, the relative errors shows that the naive
146elementary\_functions : 238implementation does not give any correct digits.
239In case \#3, the naive implementation produces Nan and Inf results.
240In both cases, the Scilab command "/" gives accurate results, i.e.,
241with at least 16 significant digits.
147 242
148\begin{verbatim} 243\subsection{Explanations}
149subroutine wwdiv(ar, ai, br, bi, cr, ci, ierr)
150\end{verbatim}
151 244
152qui implémente la formule de Smith. 245In this section, we analyse the reason why the naive implementation
153Or, comme on l'a vu, l'algorithme de Smith possède une robustesse limitée. 246of the complex division leads to unaccurate results.
247In the first section, we perform algebraic computations
248and shows the problems of the naive formulas.
249In the second section, we present the Smith's method.
154 250
155On peut également tester les limites de la méthode de Smith, en 251\subsubsection{Algebraic computations}
156expérimentant de la manière suivante :
157 252
158\begin{verbatim} 253Let's analyse the second test and check the division of test \#2 :
159-->a = 1e307 + %i * 1e-307
160 a =
161 1.000+307 + 1.000-307i
162-->b = 1e205 + %i * 1e-205
163 b =
164 1.000+205 + 1.000-205i
165-->a/b
166 ans =
167 1.000+102
168\end{verbatim}
169 254
170Cette réponse est fausse, mais correspond bien au résultat 255\begin{eqnarray}
171de la méthode de Smith. 256\frac{1 + I}{1 + 10^{307} I } = 10^{307} - I * 10^{-307}
257\end{eqnarray}
172 258
173Par ailleurs, il est intéressant de constater que la procédure wwdiv n'est 259The naive formulas leads to the following results.
174pas utilisée systématiquement dans Scilab.
175En effet, on peut trouver des sections de code utilisant l'implémentation
176de la division complexe fournie par le compilateur.
177 260
178Par exemple, la fonction lambda = spec(A,B) calcule les valeurs propres 261\begin{eqnarray}
179généralisées des matrices complexes A et B. L'interface vers la fonction 262den &=& c^2 + d^2 = 1^2 + (10^{307})^2 = 1 + 10^{614} \approx 10^{614} \\
180zggev de Lapack est implémentée dans intzggev, dans laquelle on peut trouver 263e &=& (ac + bd)/ den = (1*1 + 1*10^{307})/1e614 \approx 10^{307}/10^{614} \approx 10^{-307}\\
181les lignes suivantes 264f &=& (bc - ad)/ den = (1*1 - 1*10^{307})/1e614 \approx -10^{307}/10^{614} \approx -10^{-307}
265\end{eqnarray}
182 266
183\begin{verbatim} 267To understand what happens with the naive implementation, one should
184 do 15 i = 1, N 268focus on the intermediate numbers.
185 zstk(lALPHA-1+i)=zstk(lALPHA-1+i)/zstk(lBETA-1+i) 269If one uses the naive formula with double precision numbers, then
186 15 continue
187\end{verbatim}
188 270
189Ce morceau de code permet de stocker dans le tableau complex 271\begin{eqnarray}
190associé à la variable alpha le résultat de la division de alpha/beta. 272den = c^2 + d^2 = 1^2 + (10^{307})^2 = Inf
191Comme les deux opérandes sont de type complexe, c'est le compilateur 273\end{eqnarray}
192fortran qui implémente la division (et, bien sûr, sans utiliser wwdiv).
193 274
275This generates an overflow, because $(10^{307})^2 = 10^{614}$ is not representable
276as a double precision number.
194 277
195\subsubsection{Fortran code} 278The $e$ and $f$ terms are then computed as
196 279
197Le code fortran suivant permet d'illustrer l'ensemble des points 280\begin{eqnarray}
198présentés précédement. On le test avec le compilateur Intel Fortran 10.1. 281e = (ac + bd)/ den = (1*1 + 1*10^{307})/Inf = 10^{307}/Inf = 0\\
282f = (bc - ad)/ den = (1*1 - 1*10^{307})/Inf = -10^{307}/Inf = 0
283\end{eqnarray}
199 284
200\begin{verbatim} 285The result is then computed without any single correct digit,
286even though the initial numbers are all representable as double precision
287numbers.
201 288
202 program rndof 289Let us check that the case \#3 is associated with an underflow.
203 complex*16 a 290We want to compute the following complex division :
204 complex*16 b
205 complex*16 c
206 double precision ar
207 double precision ai
208 double precision br
209 double precision bi
210 double precision cr
211 double precision ci
212c Check that naive implementation does not have a bug
213 ar = 1.d0
214 ai = 2.d0
215 br = 3.d0
216 bi = 4.d0
217 call compare(ar, ai, br, bi)
218c Check that naive implementation is not robust with respect to overflow
219 ar = 1.d0
220 ai = 1.d0
221 br = 1.d0
222 bi = 1.d308
223 call compare(ar, ai, br, bi)
224c Check that naive implementation is not robust with respect to underflow
225 ar = 1.d0
226 ai = 1.d0
227 br = 1.d-308
228 bi = 1.d-308
229 call compare(ar, ai, br, bi)
230c Check that Smith implementation is not robust with respect to complicated underflow
231 ar = 1.d307
232 ai = 1.d-307
233 br = 1.d205
234 bi = 1.d-205
235 call compare(ar, ai, br, bi)
236 end
237
238 subroutine naive(ar, ai, br, bi, cr, ci)
239 double precision, intent(in) :: ar
240 double precision, intent(in) :: ai
241 double precision, intent(in) :: br
242 double precision, intent(in) :: bi
243 double precision, intent(out) :: cr
244 double precision, intent(out) :: ci
245 double precision den
246 den = br * br + bi * bi
247 cr = (ar * br + ai * bi) / den
248 ci = (ai * br - ar * bi) / den
249 end
250
251 subroutine smith(ar, ai, br, bi, cr, ci)
252 double precision, intent(in) :: ar
253 double precision, intent(in) :: ai
254 double precision, intent(in) :: br
255 double precision, intent(in) :: bi
256 double precision, intent(out) :: cr
257 double precision, intent(out) :: ci
258 double precision den
259 if (abs(br) .ge. abs(bi)) then
260 r = bi / br
261 den = br + r*bi
262 cr = (ar + ai*r) / den
263 ci = (ai - ar*r) / den
264 else
265 r = br / bi
266 den = bi + r*br
267 cr = (ar*r + ai) / den
268 ci = (ai*r - ar) / den
269 endif
270 end
271
272 subroutine compare(ar, ai, br, bi)
273 double precision, intent(in) :: ar
274 double precision, intent(in) :: ai
275 double precision, intent(in) :: br
276 double precision, intent(in) :: bi
277 complex*16 a
278 complex*16 b
279 complex*16 c
280 double precision cr
281 double precision ci
282 print *, "****************"
283 call naive(ar, ai, br, bi, cr, ci)
284 print *, "Naive :", cr, ci
285 call smith(ar, ai, br, bi, cr, ci)
286 print *, "Smith :", cr, ci
287 a= dcmplx(ar, ai)
288 b = dcmplx(br, bi)
289 c = a/b
290 print * , "Fortran:", c
291 end
292
293\end{verbatim}
294 291
295Si on compile ce code avec Intel Fortran 10.1, on obtient 292\begin{eqnarray}
296l'affichage suivant dans la console. 293\frac{1 + I}{10^{-308} + 10^{-308} I}= 10^{308}
294\end{eqnarray}
297 295
298\begin{verbatim} 296The naive mathematical formula gives
299 ****************
300 c naive: 0.440000000000000 8.000000000000000E-002
301 c Smith: 0.440000000000000 8.000000000000000E-002
302 c Fortran: (0.440000000000000,8.000000000000000E-002)
303 ****************
304 c naive: 0.000000000000000E+000 0.000000000000000E+000
305 c Smith: 9.999999999999999E-309 -9.999999999999999E-309
306 c Fortran: (9.999999999999999E-309,-9.999999999999999E-309)
307 ****************
308 c naive: Infinity NaN
309 c Smith: 1.000000000000000E+308 0.000000000000000E+000
310 c Fortran: (1.000000000000000E+308,0.000000000000000E+000)
311 ****************
312 c naive: 0.000000000000000E+000 0.000000000000000E+000
313 c Smith: 1.000000000000000E+102 0.000000000000000E+000
314 c Fortran: (9.999999999999999E+101,-9.999999999999999E-309)
315\end{verbatim}
316 297
317Le quatrième test montre que l'implémentation fournie par 298\begin{eqnarray}
318le compilateur Intel donne un résultat correct, bien que la 299den &=& c^2 + d^2 = (10^{-308})^2 + (10^{-308})^2 = 10^{-616}10^{-616} + 10^{-616} = 2 \times 10^{-616} \\
319méthode de Smith donne de mauvais résultats. 300e &=& (ac + bd)/ den = (1*10^{-308} + 1*10^{-308})/(2 \times 10^{-616}) \\
301 &\approx& (2 \times 10^{-308})/(2 \times 10^{-616}) \approx 10^{-308} \\
302f &=& (bc - ad)/ den = (1*10^{-308} - 1*10^{-308})/(2 \times 10^{-616}) \approx 0/10^{-616} \approx 0
303\end{eqnarray}
320 304
321\subsubsection{C Code} 305With double precision numbers, the computation is not performed this way.
306Terms which are lower than $10^{-308}$ are too small to be representable
307in double precision and will be reduced to 0 so that an underflow occurs.
322 308
323Le code C++ suivant illustre le traitement du problème. 309\begin{eqnarray}
324Il est fondé sur le type "double complex" et fonctionne avec le compilateur 310den &=& c^2 + d^2 = (10^{-308})^2 + (10^{-308})^2 = 10^{-616} + 10^{-616} = 0 \\
325Intel C 11.0 avec la configuration du standard C99 dans l'environnement Visual Studio. 311e &=& (ac + bd)/ den = (1*10^{-308} + 1*10^{-308})/0 \approx 2\times 10^{-308}/0 \approx Inf \\
312f &=& (bc - ad)/ den = (1*10^{-308} - 1*10^{-308})/0 \approx 0/0 \approx NaN \\
313\end{eqnarray}
326 314
327\begin{verbatim} 315\subsubsection{The Smith's method}
328#include <stdio.h>
329#include <math.h>
330#include <complex.h>
331 316
332// 317In this section, we analyse the Smith's method and present the detailed
333// naive -- 318steps of this algorithm in the cases \#2 and \#3.
334// Compute the complex division with a naive method.
335//
336void naive (double ar, double ai, double br, double bi, double * cr, double * ci)
337{
338 double den;
339 den = br * br + bi * bi;
340 *cr = (ar * br + ai * bi) / den;
341 *ci = (ai * br - ar * bi) / den;
342}
343 319
344// 320In Scilab, the algorithm which allows to perform the complex
345// smith -- 321division is done by the the \emph{wwdiv} routine, which implements the
346// Compute the complex division with Smith's method. 322Smith's method \cite{368661}, \cite{WhatEveryComputerScientist}.
347// 323The Smith's algorithm is based on normalization, which allow to
348void smith (double ar, double ai, double br, double bi, double * cr, double * ci) 324perform the division even if the terms are large.
349{
350 double den;
351 double r;
352 double abr;
353 double abi;
354 abr = fabs(br);
355 abi = fabs(bi);
356 if ( abr >= abi)
357 {
358 r = bi / br;
359 den = br + r*bi;
360 *cr = (ar + ai*r) / den;
361 *ci = (ai - ar*r) / den;
362 }
363 else
364 {
365 r = br / bi;
366 den = bi + r*br;
367 *cr = (ar*r + ai) / den;
368 *ci = (ai*r - ar) / den;
369 }
370}
371 325
372// 326The starting point of the method is the mathematical definition
373// compare --
374// Compare 3 methods for complex division:
375// * naive method
376// * Smith method
377// * C99 method
378//
379void compare (double ar, double ai, double br, double bi)
380{
381 double complex a;
382 double complex b;
383 double complex c;
384
385 double cr;
386 double ci;
387
388 printf("****************\n");
389
390 naive(ar, ai, br, bi, &cr, &ci);
391 printf("Naif --> c = %e + %e * I\n" , cr , ci );
392
393 smith(ar, ai, br, bi, &cr, &ci);
394 printf("Smith --> c = %e + %e * I\n" , cr , ci );
395
396 a = ar + ai*I;
397 b = br + bi*I;
398 c = a / b;
399 printf("C --> c = %e + %e * I\n" , creal(c) , cimag(c) );
400}
401
402
403int main(void)
404{
405 double ar;
406 double ai;
407 double br;
408 double bi;
409
410
411 // Check that naive implementation does not have a bug
412 ar = 1;
413 ai = 2;
414 br = 3;
415 bi = 4;
416 compare (ar, ai, br, bi);
417
418 // Check that naive implementation is not robust with respect to overflow
419 ar = 1;
420 ai = 1;
421 br = 1;
422 bi = 1e307;
423 compare (ar, ai, br, bi);
424
425 // Check that naive implementation is not robust with respect to underflow
426 ar = 1;
427 ai = 1;
428 br = 1e-308;
429 bi = 1e-308;
430 compare (ar, ai, br, bi);
431
432 // Check that Smith method is not robust in complicated cases
433 ar = 1e307;
434 ai = 1e-307;
435 br = 1e205;
436 bi = 1e-205;
437 compare (ar, ai, br, bi);
438
439 return 0;
440}
441\end{verbatim}
442 327
443Voici le résultat qui apparaît dans la console : 328\begin{eqnarray}
329\frac{a + ib}{c + id} = \frac{ac + bd}{c^2 + d^2} + i \frac{bc - ad}{c^2 + d^2}
330\end{eqnarray}
444 331
445\begin{verbatim} 332The method of Smith is based on the rewriting of this formula in
446**************** 333two different, but mathematically equivalent formulas. The basic
447Naif --> c = 4.400000e-001 + 8.000000e-002 * I 334trick is to make the terms $d/c$ or $c/d$ appear in the formulas.
448Smith --> c = 4.400000e-001 + 8.000000e-002 * I 335When $c$ is larger than $d$, the formula involving $d/c$ is used.
449C --> c = 4.400000e-001 + 8.000000e-002 * I 336Instead, when $d$ is larger than $c$, the formula involving $c/d$ is
450**************** 337used. That way, the intermediate terms in the computations rarely
451Naif --> c = 0.000000e+000 + -0.000000e+000 * I 338exceeds the overflow limits.
452Smith --> c = 1.000000e-307 + -1.000000e-307 * I
453C --> c = 1.000000e-307 + -1.000000e-307 * I
454****************
455Naif --> c = 1.#INF00e+000 + -1.#IND00e+000 * I
456Smith --> c = 1.000000e+308 + 0.000000e+000 * I
457C --> c = 1.000000e+308 + 0.000000e+000 * I
458****************
459Naif --> c = -1.#IND00e+000 + -0.000000e+000 * I
460Smith --> c = 1.000000e+102 + 0.000000e+000 * I
461C --> c = 1.000000e+102 + -1.000000e-308 * I
462\end{verbatim}
463 339
464Cela montre que l'implémentation des nombres complexes dans la librairie 340Indeed, the complex division formula can be written as
465fournie par Intel traite le problème de manière adéquate. 341\begin{eqnarray}
342\frac{a + ib}{c + id} = \frac{a + b(d/c)}{c + d(d/c)} + i \frac{b - a(d/c)}{c + d(d/c)} \\
343\frac{a + ib}{c + id} = \frac{a(c/d) + b}{c(d/c) + d} + i \frac{b(c/d) - a}{c(d/c) + d}
344\end{eqnarray}
466 345
467\subsection{Explanations} 346These formulas can be simplified as
468\subsubsection{The Smith's method}
469 347
470C'est pourquoi les auteurs de Scilab, qui ont lu \cite{WhatEveryComputerScientist}, 348\begin{eqnarray}
471ont implementé la formule de Smith \cite{368661} (mais ils citent 349\frac{a + ib}{c + id} = \frac{a + br}{c + dr} + i \frac{b - ar}{c + dr}, \qquad r = d/c \\
472Goldberg en référence, ce qui est une erreur) : 350\frac{a + ib}{c + id} = \frac{ar + b}{cr + d} + i \frac{br - a}{cr + d} , \qquad r = c/d
351\end{eqnarray}
352
353The Smith's method is based on the following algorithm.
473 354
474\begin{algorithmic} 355\begin{algorithmic}
475\IF {$( |d| <= |c| )$} 356\IF {$( |d| <= |c| )$}
@@ -485,107 +366,158 @@ Goldberg en référence, ce qui est une erreur) :
485\ENDIF 366\ENDIF
486\end{algorithmic} 367\end{algorithmic}
487 368
488Dans le cas $(1+i)/(1+1e308 i)$, la méthode de Smith donne 369As we are going to check immediately, the Smith's method
370performs very well in cases \#2 and \#3.
371
372In the case \#2 $\frac{1+i}{1+10^{-308} i}$, the Smith's method is
489 373
490\begin{verbatim} 374\begin{verbatim}
491si ( |1e308| <= |1| ) > test faux 375If ( |1e308| <= |1| ) > test false
492sinon 376Else
493 r = 1 / 1e308 = 0 377 r = 1 / 1e308 = 0
494 den = 1e308 + 0 * 1 = 1e308 378 den = 1e308 + 0 * 1 = 1e308
495 e = (1 * 0 + 1) / 1e308 = 1e-308 379 e = (1 * 0 + 1) / 1e308 =
496 f = (1 * 0 - 1) / 1e308 = -1e-308 380 f = (1 * 0 - 1) / 1e308 = -1e-308
497\end{verbatim} 381\end{verbatim}
498 382
499ce qui est le résultat correct. 383In the case \#3 $\frac{1+i}{10^{-308}+10^{-308} i}$, the Smith's method is
500
501Dans le cas $(1+i)/(1e-308+1e-308 i)$, la méthode de Smith donne
502 384
503\begin{verbatim} 385\begin{verbatim}
504si ( |1e-308| <= |1e-308| ) > test vrai 386If ( |1e-308| <= |1e-308| ) > test true
505 r = 1e-308 / 1e308 = 1 387 r = 1e-308 / 1e308 = 1
506 den = 1e-308 + 1 * 1e-308 = 2e308 388 den = 1e-308 + 1 * 1e-308 = 2e308
507 e = (1 + 1 * 1) / 2e308 = 1e308 389 e = (1 + 1 * 1) / 2e308 = 1e308
508 f = (1 - 1 * 1) / 2e308 = 0 390 f = (1 - 1 * 1) / 2e308 = 0
509\end{verbatim} 391\end{verbatim}
510 392
511ce qui est encore une fois le résultat correct. 393\subsection{One more step}
394
395In that section, we show the limitations of the Smith's method.
396
397Suppose that we want to perform the following division
398
399\begin{eqnarray}
400\frac{10^{307} + i 10^{-307}}{10^205 + i 10^{-205}} = 10^102 - i 10^-308
401\end{eqnarray}
402
403The following Scilab script allows to compare the naive implementation
404and Scilab's implementation based on Smith's method.
405
406\lstset{language=Scilab}
407\lstset{numbers=left}
408\lstset{basicstyle=\footnotesize}
409\lstset{keywordstyle=\bfseries}
410\begin{lstlisting}
411// Check that Smith method is not robust in complicated cases
412ar = 1e307;
413ai = 1e-307;
414br = 1e205;
415bi = 1e-205;
416rr = 1e102;
417ri = -1e-308;
418compare (ar, ai, br, bi, rr, ri);
419\end{lstlisting}
420
421When executed, the script produces the following output.
512 422
513Il s'avère que le calcul de Smith, écrit en 1962, fonctionne bien dans un
514certain nombre de situations. L'article [4] cite une analyse de
515Hough qui donne une borne sur l'erreur réalisée par
516le calcul.
517\begin{verbatim} 423\begin{verbatim}
518|zcomp - zref| <= eps |zref| 424****************
425 a = 1.00000e+307 + 1.00000e-307 * I
426 b = 1.00000e+205 + 1.00000e-205 * I
427Naive --> c = Nan + -0.00000e+000 * I
428 e1 = 0.00000e+000, e2 = Nan, e3 = 1.00000e+000
429Scilab --> c = 1.00000e+102 + 0.00000e+000 * I
430 e1 = 0.00000e+000, e2 = 0.00000e+000, e3 = 1.00000e+000
519\end{verbatim} 431\end{verbatim}
520 432
521\subsubsection{The limits of the Smith method} 433As expected, the naive method produces a Nan.
434More surprisingly, the Scilab output is also quite approximated.
435More specifically, the imaginary part is computed as zero, although
436we know that the exact result is $10^-308$, which is representable
437as a double precision number.
438The relative error based on the norm of the complex number is
439accurate ($e1=0.0$), but the relative error based on the imaginary
440part only is wrong ($e3=1.0$), without any correct digits.
522 441
523L'article \cite{214414} (1985) toutefois, fait la distinction 442The reference \cite{1667289} cites an analysis by Hough which
524entre la norme |zcomp - zref| et la valeur des 443gives a bound for the relative error produced by the Smith's method
525parties imaginaires et réelles. Il montre en particulier un
526exemple dans lequel la partie imaginaire est erronée.
527 444
528Supposons que m et n sont des entiers possédant les propriétés suivantes 445\begin{eqnarray}
529\begin{verbatim} 446|zcomp - zref| <= eps |zref|
530m >> 0 447\end{eqnarray}
531n >> 0 448
449The paper \cite{214414} (1985), though, makes a distinction between
450the norm $|zcomp - zref|$ and the relative error for the
451real and imaginary parts. It especially gives an example
452where the imaginary part is wrong.
453
454In the following paragraphs, we detail the derivation
455of an example inspired by \cite{214414}, but which
456shows the problem with double precision numbers (the example
457in \cite{214414} is based on an abstract machine with
458exponent range $\pm 99$).
459
460Suppose that $m,n$ are integers so that the following
461conditions are satisfied
462\begin{eqnarray}
463m >> 0\\
464n >> 0\\
532n >> m 465n >> m
533\end{verbatim} 466\end{eqnarray}
534 467
535On peut alors facilement démontrer que la division complexe 468One can easily proove that the complex division can be approximated
536suivante peut être approchée : 469as
470\begin{eqnarray}
471\frac{10^n + i 10^{-n}}{10^m + i 10^{-m}} &=&
472\frac{10^{n+m} + 10^{-(m+n)}}{10^{2m} + 10^{-2m}} +
473i \frac{10^{m-n} - 10^{n-m}}{10^{2m} + 10^{-2m}}
474\end{eqnarray}
537 475
538\begin{verbatim} 476Because of the above assumptions, that leads to the following
53910^n + i 10^-n 477approximation
540-------------- = 10^(n-m) - i 10^(n-3m) 478\begin{eqnarray}
54110^m + i 10^-m 479\frac{10^n + i 10^{-n}}{10^m + i 10^{-m}} \approx 10^{n-m} - i 10^{n-3m}
542\end{verbatim} 480\end{eqnarray}
481which is correct up to approximately several 100 digits.
543 482
544On décide alors de choisir les nombres n et m inférieurs à 308 mais de telle sorte 483One then consider $m,n<308$ but so that
545que 484\begin{eqnarray}
546\begin{verbatim}
547n - 3 m = -308 485n - 3 m = -308
548\end{verbatim} 486\end{eqnarray}
549 487
550Par exemple le couple m=205, n=307 satisfait les égalités précédentes 488For example, the couple $m=205$, $n=307$ satisfies all conditions.
551de telle sorte que 489That leads to the complex division
552 490
553\begin{verbatim} 491\begin{eqnarray}
55410^307 + i 10^-307 492\frac{10^{307} + i 10^{-307}}{10^{205} + i 10^{-205}} = 10^{102} - i 10^{-308}
555------------------ = 10^102 - i 10^-308 493\end{eqnarray}
55610^205 + i 10^-205
557\end{verbatim}
558 494
559Il est facile de voir que ce dernier cas met en défaut la 495It is easy to check that the naive implementation does not
560formulation naïve. 496proove to be accurate on that example.
561Il est plus surprenant de constater que ce cas met également 497We have already shown that the Smith's method is failing to
562en défaut la formule de Smith. En effet, les opérations suivantes 498produce a non zero imaginary part. Indeed, the steps of the
563sont alisées par la méthode de Smith 499Smith algorithm are the following
564 500
565\begin{verbatim} 501\begin{verbatim}
566si ( |1e-205| <= |1e205| ) > test vrai 502If ( |1e-205| <= |1e205| ) > test true
567 r = 1e-205 / 1e205 = 0 503 r = 1e-205 / 1e205 = 0
568 den = 1e205 + 0 * 1e-205 = 1e205 504 den = 1e205 + 0 * 1e-205 = 1e205
569 e = (10^307 + 10^-307 * 0) / 1e205 = 1e102 505 e = (10^307 + 10^-307 * 0) / 1e205 = 1e102
570 f = (10^-307 - 10^307 * 0) / 1e205 = 0 506 f = (10^-307 - 10^307 * 0) / 1e205 = 0
571\end{verbatim} 507\end{verbatim}
572 508
573On constate que la partie réelle est exacte tandis que la 509The real part is accurate, but the imaginary part has no
574partie imaginaire est fausse. On peut également vérifier 510correct digit. One can also check that the inequality $|zcomp - zref| <= eps |zref|$
575que le module du résultat est dominé par la partie réelle de 511is still true.
576telle sorte que l'inégalité |zcomp - zref| <= eps |zref| reste 512
577vérifiée. 513The limits of Smith's method have been reduced in Stewart's paper \cite{214414}.
578 514The new algorithm is based on the theorem which states that if $x_1 \ldots x_n$
579Les limites de la méthode de Smith ont étées levées dans \cite{214414}. 515are $n$ floating point representable numbers then $\min_{i=1,n}(x_i) \max_{i=1,n}(x_i)$
580L'algorithme proposé est fondé sur une proposition qui démontre 516is also representable. The algorithm uses that theorem to perform a
581que si n nombres x1...xn sont représentables alors min(xi) * max(xi) 517correct computation.
582est également représentable. L'implémentation de 518
583la division complexe tire parti de cette proposition pour réaliser un 519Stewart's algorithm is superseded by the one by Li et Al \cite{567808}, but
584calcul correct. 520also by Kahan's \cite{KAHAN1987}, which, from \cite{1039814}, is the one implemented
585 521in the C99 standard.
586Il s'avère que l'algorithme de Stewart est dépassé par l'algorithme
587de Li et Al \cite{567808}, mais également par celui de Kahan \cite{KAHAN1987},
588qui, d'après \cite{1039814}, est
589similaire à celui implémenté dans le standard C99.
590 522
591 523
diff --git a/scilab_doc/scilabisnotnaive/introduction.tex b/scilab_doc/scilabisnotnaive/introduction.tex
index bda5c1b..a66ce2e 100644
--- a/scilab_doc/scilabisnotnaive/introduction.tex
+++ b/scilab_doc/scilabisnotnaive/introduction.tex
@@ -1,22 +1,14 @@
1\section{Introduction} 1\section{Introduction}
2 2
3There is a huge space between the numerical formula, as 3Scilab take cares with your numbers.
4presented in basic mathematical books, and the implementation 4While most mathematic books deals with exact formulas,
5of the numerical algorithm, and especially in Scilab. 5Scilab uses algorithms which are specifically designed for
6While the mathematic theory deals with the correctness 6computers.
7of the formulas, \emph{Scilab take cares with your numbers}.
8 7
9This difficulty is generated by the fact that, while 8The difficulty is generated by the fact that, while
10the mathematics treat with \emph{real} numbers, the 9the mathematics treat with \emph{real} numbers, the
11computer deals with their \emph{floating point representations}. 10computer deals with their \emph{floating point representations}.
12 11
13A central reference on this subject is the article
14by Goldberg, "What Every Computer Scientist Should Know About Floating-Point Arithmetic",
15\cite{WhatEveryComputerScientist}.
16If one focuses on numerical algorithms, the "Numerical Recipes" \cite{NumericalRecipes},
17is another good sources of solutions for that problem.
18The work of Kahan is also central in this domain, for example \cite{Kahan2004}.
19
20In this article, we will show examples of these problems by 12In this article, we will show examples of these problems by
21using the following theoric and experimental approach. 13using the following theoric and experimental approach.
22\begin{enumerate} 14\begin{enumerate}
@@ -46,6 +38,13 @@ be computed, and we then use the absolute error
46e_a=|x_c-x_e|. 38e_a=|x_c-x_e|.
47\end{eqnarray} 39\end{eqnarray}
48 40
41A central reference on this subject is the article
42by Goldberg, "What Every Computer Scientist Should Know About Floating-Point Arithmetic",
43\cite{WhatEveryComputerScientist}.
44If one focuses on numerical algorithms, the "Numerical Recipes" \cite{NumericalRecipes},
45is another good sources of solutions for that problem.
46The work of Kahan is also central in this domain, for example \cite{Kahan2004}.
47
49Before getting into the details, it is important to 48Before getting into the details, it is important to
50know that real variables in the Scilab language are stored in 49know that real variables in the Scilab language are stored in
51\emph{double precision} variables. Since Scilab is 50\emph{double precision} variables. Since Scilab is
diff --git a/scilab_doc/scilabisnotnaive/numericalderivative.tex b/scilab_doc/scilabisnotnaive/numericalderivative.tex
index 54f4e52..c95a3b3 100644
--- a/scilab_doc/scilabisnotnaive/numericalderivative.tex
+++ b/scilab_doc/scilabisnotnaive/numericalderivative.tex
@@ -309,11 +309,21 @@ a function of one variable is presented in figure \ref{robust-numericalderivativ
309 309
310\subsection{One step further} 310\subsection{One step further}
311 311
312As we can see, the \emph{derivative} command is based on 312In this section, we analyse the behaviour of \emph{derivative}
313floating point comprehension. Is it completely \emph{bulletproof} ? 313when the point $x$ is either large $x \rightarrow \infty$,
314Not exactly. 314when $x$ is small $x \rightarrow 0$ and when $x = 0$.
315 315We compare these results with the \emph{numdiff} command,
316See for example the following Scilab session. 316which does not use the same step strategy. As we are going
317to see, both commands performs the same when $x$ is near 1, but
318performs very differently when x is large or small.
319
320We have allready explained the theory of the floating
321point implementation of the \emph{derivative} command.
322Is it completely \emph{bulletproof} ? Not exactly.
323
324See for example the following Scilab session, where one computes the
325numerical derivative of $f(x)=x^2$ for $x=10^{-100}$. The
326expected result is $f'(x) = 2. \times 10^{-100}$.
317 327
318\begin{verbatim} 328\begin{verbatim}
319-->fp = derivative(myfunction,1.e-100,order=1) 329-->fp = derivative(myfunction,1.e-100,order=1)
@@ -327,12 +337,9 @@ See for example the following Scilab session.
327 7.450580596923828243D+91 337 7.450580596923828243D+91
328\end{verbatim} 338\end{verbatim}
329 339
330The exact answer is $x_e=2. \times 10^{-100}$ so that the 340The result does not have any significant digits.
331result does not have any significant digits.
332
333The additionnal experiment
334 341
335The explanation is that the step is computed with $h = \sqrt(eps)\approx 10^{-8}$. 342The explanation is that the step is computed with $h = \sqrt{eps}\approx 10^{-8}$.
336Then $f(x+h)=f(10^{-100} + 10^{-8}) \approx f(10^{-8}) = 10^{-16}$, because the 343Then $f(x+h)=f(10^{-100} + 10^{-8}) \approx f(10^{-8}) = 10^{-16}$, because the
337term $10^{-100}$ is much smaller than $10^{-8}$. The result of the 344term $10^{-100}$ is much smaller than $10^{-8}$. The result of the
338computation is therefore $(f(x+h) - f(x))/h = (10^{-16} + 10^{-200})/10^{-8} \approx 10^{-8}$. 345computation is therefore $(f(x+h) - f(x))/h = (10^{-16} + 10^{-200})/10^{-8} \approx 10^{-8}$.
@@ -345,10 +352,12 @@ The additionnal experiment
345 0.0000000149011611938477 352 0.0000000149011611938477
346\end{verbatim} 353\end{verbatim}
347 354
348explains the final result. 355allows to check that the result of the computation simply is $\sqrt{eps}$.
356That experiment shows that the \emph{derivative} command uses a
357wrong defaut step $h$ when $x$ is very small.
349 358
350To improve the accuracy of the computation, one can take control of the 359To improve the accuracy of the computation, one can take control of the
351step $h$. A reasonable solution is to use $h=\sqrt{\epsilon}*x$ so that the 360step $h$. A reasonable solution is to use $h=\sqrt{\epsilon}|x|$ so that the
352step is scaled depending on $x$. 361step is scaled depending on $x$.
353The following script illustrates than method, which produces 362The following script illustrates than method, which produces
354results with 8 significant digits. 363results with 8 significant digits.
@@ -369,10 +378,16 @@ But when $x$ is exactly zero, the scaling method cannot work, because
369it would produce the step $h=0$, and therefore a division by zero 378it would produce the step $h=0$, and therefore a division by zero
370exception. In that case, the default step provides a good accuracy. 379exception. In that case, the default step provides a good accuracy.
371 380
381Another command is available in Scilab to compute the
382numerical derivatives of a given function, that is \emph{numdiff}.
372The \emph{numdiff} command uses the step 383The \emph{numdiff} command uses the step
373\begin{eqnarray} 384\begin{eqnarray}
374h=\sqrt{\epsilon}(1+10^{-3}|x|) 385h=\sqrt{\epsilon}(1+10^{-3}|x|).
375\end{eqnarray} 386\end{eqnarray}
387In the following paragraphs, we try to analyse why this formula
388has been chosen. As we are going to check experimentally, this step
389formula performs better than \emph{derivative} when $x$ is
390large.
376 391
377As we can see the following session, the behaviour is approximately 392As we can see the following session, the behaviour is approximately
378the same when the value of $x$ is 1. 393the same when the value of $x$ is 1.
@@ -408,5 +423,6 @@ results, while \emph{derivative} performs poorly.
408 423
409This step is a trade-off because it allows to keep a good accuracy with 424This step is a trade-off because it allows to keep a good accuracy with
410large values of $x$, but produces a slightly sub-optimal step size when 425large values of $x$, but produces a slightly sub-optimal step size when
411$x$ is near 1. The behaviour near zero is the same. 426$x$ is near 1. The behaviour near zero is the same, i.e. both commands
427produce wrong results when $x \rightarrow 0$ and $x\neq 0$.
412 428
diff --git a/scilab_doc/scilabisnotnaive/quadratic.tex b/scilab_doc/scilabisnotnaive/quadratic.tex
index 536f829..122e80d 100644
--- a/scilab_doc/scilabisnotnaive/quadratic.tex
+++ b/scilab_doc/scilabisnotnaive/quadratic.tex
@@ -28,7 +28,7 @@ x_+ &=& \frac{-b+ \sqrt{b^2-4ac}}{2a}, \label{real:x+}
28with the hypothesis that the discriminant $\Delta=b^2-4ac$ 28with the hypothesis that the discriminant $\Delta=b^2-4ac$
29is positive. 29is positive.
30 30
31The naive, simplified, algorithm to use to compute the roots of the 31The naive, simplified, algorithm which computes the roots of the
32quadratic is presented in figure \ref{naive-quadratic}. 32quadratic is presented in figure \ref{naive-quadratic}.
33 33
34\begin{figure}[htbp] 34\begin{figure}[htbp]
@@ -124,7 +124,7 @@ Naive method : 9.094947e-009 (error=9.050530e-002)
124Scilab method : 1.000000e-008 (error=1.654361e-016) 124Scilab method : 1.000000e-008 (error=1.654361e-016)
125\end{verbatim} 125\end{verbatim}
126 126
127The result is astonishing, since the naive root has 127The result is surprising, since the naive root has
128no correct digit and a relative error which is 14 orders 128no correct digit and a relative error which is 14 orders
129of magnitude greater than the relative error of the Scilab root. 129of magnitude greater than the relative error of the Scilab root.
130 130
@@ -187,7 +187,7 @@ x_+ &\approx& -\epsilon \rightarrow 0^-, \qquad \epsilon \rightarrow 0
187To create a difficult case, we search $\epsilon$ so that 187To create a difficult case, we search $\epsilon$ so that
188$1/\epsilon^2 = 10^{310}$, because we know that $10^{308}$ 188$1/\epsilon^2 = 10^{310}$, because we know that $10^{308}$
189is the maximum value available with double precision floating 189is the maximum value available with double precision floating
190point numbers. The solution is $\epsilon=10^{-155}$. 190point numbers. One possible solution is $\epsilon=10^{-155}$.
191 191
192The following Scilab script shows an example of the computation 192The following Scilab script shows an example of the computation
193of the roots of such a polynomial with the \emph{roots} 193of the roots of such a polynomial with the \emph{roots}
diff --git a/scilab_doc/scilabisnotnaive/scilabisnotnaive.bib b/scilab_doc/scilabisnotnaive/scilabisnotnaive.bib
index cdf9518..28d3776 100644
--- a/scilab_doc/scilabisnotnaive/scilabisnotnaive.bib
+++ b/scilab_doc/scilabisnotnaive/scilabisnotnaive.bib
@@ -191,3 +191,15 @@ publisher= {Gauthier-Villars}}
191 } 191 }
192 192
193 193
194@ARTICLE{1667289,
195title={Underflow and the Denormalized Numbers},
196author={Coonen, J.T.},
197journal={Computer},
198year={1981},
199month={March },
200volume={14},
201number={3},
202pages={75-87},
203doi={10.1109/C-M.1981.220382},
204ISSN={0018-9162}, }
205
diff --git a/scilab_doc/scilabisnotnaive/scilabisnotnaive.pdf b/scilab_doc/scilabisnotnaive/scilabisnotnaive.pdf
index 2b5f8d0..1e41a48 100644
--- a/scilab_doc/scilabisnotnaive/scilabisnotnaive.pdf
+++ b/scilab_doc/scilabisnotnaive/scilabisnotnaive.pdf
Binary files differ