summaryrefslogtreecommitdiffstats
path: root/scilab_doc/scilabisnotnaive/quadratic.tex
blob: 1a888f9fb1c3a67f2cb55e4e781c31146f1ab5a4 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
\section{Quadratic equation}

In this section, we detail the computation of the roots of a quadratic polynomial.
As we shall see, there is a whole world from the mathematics formulas to the 
implementation of such computations. In the first part, we briefly report the formulas which allow to 
compute the real roots of a quadratic equation with real coefficients.
We then present the na´ve algorithm based on these mathematical formulas. 
In the second part, we make some experiments in Scilab and compare our
na´ve algorithm with the \emph{roots} Scilab primitive.
In the third part, we analyse 
why and how floating point numbers must be taken into account when the 
implementation of such roots is required.

\subsection{Theory}

We consider the following quadratic equation, with real 
coefficients $a, b, c \in \RR$ \cite{wikipediaquadratic,wikipedialossofsign,mathworldquadratic} :

\begin{eqnarray}
a x^2 + b x + c = 0.
\end{eqnarray}

The real roots of the quadratic equations are
\begin{eqnarray}
x_- &=& \frac{-b- \sqrt{b^2-4ac}}{2a}, \label{real:x-}\\
x_+ &=& \frac{-b+ \sqrt{b^2-4ac}}{2a}, \label{real:x+}
\end{eqnarray}
with the hypothesis that the discriminant $\Delta=b^2-4ac$
is positive.

The naive, simplified, algorithm which computes the roots of the 
quadratic is presented in figure \ref{naive-quadratic}.

\begin{figure}[htbp]
\begin{algorithmic}
\STATE $\Delta\gets b^2-4ac$
\STATE $s\gets \sqrt{\Delta}$
\STATE $x_-\gets (-b-s)/(2a)$
\STATE $x_+\gets (-b+s)/(2a)$
\end{algorithmic}
\caption{Naive algorithm to compute the real roots of a quadratic equation}
\label{naive-quadratic}
\end{figure}

\subsection{Experiments}

The following Scilab function is a straitforward implementation
of the previous formulas.

\lstset{language=Scilab}
\lstset{numbers=left}
\lstset{basicstyle=\footnotesize}
\lstset{keywordstyle=\bfseries}
\begin{lstlisting}
function r=myroots(p)
  c=coeff(p,0);
  b=coeff(p,1);
  a=coeff(p,2);
  r=zeros(2,1);
  r(1)=(-b+sqrt(b^2-4*a*c))/(2*a);
  r(2)=(-b-sqrt(b^2-4*a*c))/(2*a);
endfunction
\end{lstlisting}

The goal of this section is to show that some additionnal
work is necessary to compute the roots of the quadratic equation
with sufficient accuracy.
We will especially pay attention to rounding errors and 
overflow problems.
In this section, we show that the \emph{roots} command 
of the Scilab language is not \emph{naive}, in the sense that it 
takes into account for the floating point implementation details 
that we will see in the next section.



\subsubsection{Rounding errors}

We analyse the rounding errors which are 
appearing when the discriminant of the quadratic equation 
is such that $b^2\approx 4ac$.
We consider the following quadratic equation 
\begin{eqnarray}
\epsilon x^2 + (1/\epsilon)x - \epsilon = 0
\end{eqnarray}
with $\epsilon=0.0001=10^{-4}$.

The two real solutions of the quadratic equation are
\begin{eqnarray}
x_- &=& \frac{-1/\epsilon- \sqrt{1/\epsilon^2+4\epsilon^2}}{2\epsilon} \approx  -1/\epsilon^2, \\
x_+ &=& \frac{-1/\epsilon+ \sqrt{1/\epsilon^2+4\epsilon^2}}{2\epsilon} \approx  \epsilon^2
\end{eqnarray}

The following Scilab script shows an example of the computation
of the roots of such a polynomial with the \emph{roots}
primitive and with a naive implementation.
Only the positive root $x_+ \approx \epsilon^2$ is considered in this 
test (the $x_-$ root is so that $x_- \rightarrow -\infty$ in both 
implementations).

\lstset{language=Scilab}
\lstset{numbers=left}
\lstset{basicstyle=\footnotesize}
\lstset{keywordstyle=\bfseries}
\begin{lstlisting}
p=poly([-0.0001 10000.0 0.0001],"x","coeff");
e1 = 1e-8;
roots1 = myroots(p);
r1 = roots1(1);
roots2 = roots(p);
r2 = roots2(1);
error1 = abs(r1-e1)/e1;
error2 = abs(r2-e1)/e1;
printf("Expected : %e\n", e1);
printf("Naive method : %e (error=%e)\n", r1,error1);
printf("Scilab method : %e (error=%e)\n", r2, error2);
\end{lstlisting}

The script then prints out :

\begin{verbatim}
Expected : 1.000000e-008
Naive method : 9.094947e-009 (error=9.050530e-002)
Scilab method : 1.000000e-008 (error=1.654361e-016)
\end{verbatim}

The result is surprising, since the naive root has 
no correct digit and a relative error which is 14 orders 
of magnitude greater than the relative error of the Scilab root.

The explanation for such a behaviour is that the expression of the 
positive root is the following 

\begin{eqnarray}
x_+ &=& \frac{-1/\epsilon+ \sqrt{1/\epsilon^2+4\epsilon^2}}{2\epsilon}
\end{eqnarray}

and is numerically evalutated as 

\begin{verbatim}
\sqrt{1/\epsilon^2+4\epsilon^2} = 10000.000000000001818989
\end{verbatim}

As we see, the first digits are correct, but the last digits 
are polluted with rounding errors. When the expression $-1/\epsilon+ \sqrt{1/\epsilon^2+4\epsilon^2}$
is evaluated, the following computations are performed~:

\begin{verbatim}
-1/\epsilon+ \sqrt{1/\epsilon^2+4\epsilon^2} 
  = -10000.0 + 10000.000000000001818989 
  = 0.0000000000018189894035
\end{verbatim}

The user may think that the result is extreme, but it 
is not. Reducing furter the value of $\epsilon$ down to 
$\epsilon=10^{-11}$, we get the following output :

\begin{verbatim}
Expected : 1.000000e-022
Naive method : 0.000000e+000 (error=1.000000e+000)
Scilab method : 1.000000e-022 (error=1.175494e-016)
\end{verbatim}

The relative error is this time 16 orders of magnitude 
greater than the relative error of the Scilab root.
In fact, the naive implementation computes a false root $x_+$ even for 
a value of epsilon equal to $\epsilon=10^-3$, where the relative 
error is 7 times greater than the relative error produced by the 
\emph{roots} primitive.

\subsubsection{Overflow}

In this section, we analyse the overflow exception which is  
appearing when the discriminant of the quadratic equation 
is such that $b^2>> 4ac$.
We consider the following quadratic equation 
\begin{eqnarray}
x^2 + (1/\epsilon)x + 1 = 0
\end{eqnarray}
with $\epsilon\rightarrow 0$.

The roots of this equation are 
\begin{eqnarray}
x_- &\approx& -1/\epsilon \rightarrow -\infty, \qquad \epsilon \rightarrow 0\\
x_+ &\approx& -\epsilon \rightarrow 0^-, \qquad \epsilon \rightarrow 0
\end{eqnarray}
To create a difficult case, we search $\epsilon$ so that 
$1/\epsilon^2 = 10^{310}$, because we know that $10^{308}$
is the maximum value available with double precision floating 
point numbers. One possible solution is $\epsilon=10^{-155}$.

The following Scilab script shows an example of the computation
of the roots of such a polynomial with the \emph{roots}
primitive and with a naive implementation.

\lstset{language=Scilab}
\lstset{numbers=left}
\lstset{basicstyle=\footnotesize}
\lstset{keywordstyle=\bfseries}
\begin{lstlisting}
// Test #3 : overflow because of b
e=1.e-155
a = 1;
b = 1/e;
c = 1;
p=poly([c b a],"x","coeff");
expected = [-e;-1/e];
roots1 = myroots(p);
roots2 = roots(p);
error1 = abs(roots1-expected)/norm(expected);
error2 = abs(roots2-expected)/norm(expected);
printf("Expected : %e %e\n", expected(1),expected(2));
printf("Naive method : %e %e (error=%e)\n", roots1(1),roots1(2),error1);
printf("Scilab method : %e %e (error=%e)\n", roots2(1),roots2(2), error2);
\end{lstlisting}

The script then prints out :

\begin{verbatim}
Expected : -1.000000e-155 -1.000000e+155
Naive method : Inf Inf (error=Nan)
Scilab method : -1.000000e-155 -1.000000e+155 (error=0.000000e+000)
\end{verbatim}

As we see, the $b^2-4ac$ term has been evaluated as $1/\epsilon^2-4$,
which is approximately equal to $10^{310}$. This number cannot 
be represented in a floating point number. It therefore produces the 
IEEE overflow exception and set the result as \emph{Inf}.

\subsection{Explanations}

The following tricks are extracted from the 
\emph{quad} routine of the \emph{RPOLY} algorithm by
Jenkins \cite{Jenkins1975}. This algorithm is used by Scilab in the 
roots primitive, where a special case is handled when the 
degree of the equation is equal to 2, i.e. a quadratic equation.

\subsubsection{Properties of the roots}

One can easily show that the sum and the product of the roots
allow to recover the coefficients of the equation which was solve.
One can show that 
\begin{eqnarray}
x_- + x_+ &=&\frac{-b}{a}\\
x_- x_+ &=&\frac{c}{a}
\end{eqnarray}
Put in another form, one can state that the computed roots are 
solution of the normalized equation 
\begin{eqnarray}
x^2 - \left(\frac{x_- + x_+}{a}\right) x  + x_- x_+ &=&0
\end{eqnarray}

Other transformation leads to an alternative form for the roots. 
The original quadratic equation can be written as a quadratic 
equation on $1/x$
\begin{eqnarray}
c(1/x)^2 + b (1/x)  + a &=&0
\end{eqnarray}
Using the previous expressions for the solution of $ax^2+bx+c=0$ leads to the 
following expression of the roots of the quadratic equation when the 
discriminant is positive 
\begin{eqnarray}
x_- &=& \frac{2c}{-b+ \sqrt{b^2-4ac}}, \label{real:x-inverse}\\
x_+ &=& \frac{2c}{-b- \sqrt{b^2-4ac}} \label{real:x+inverse}
\end{eqnarray}
These roots can also be computed from \ref{real:x-}, with the 
multiplication by $-b+ \sqrt{b^2-4ac}$.

\subsubsection{Conditionning of the problem}

The conditionning of the problem may be evaluated with the 
computation of the partial derivatives of the roots of the 
equations with respect to the coefficients.
These partial derivatives measure the sensitivity of the 
roots of the equation with respect to small errors which might 
pollute the coefficients of the quadratic equations.

In the following, we note $x_-=\frac{-b- \sqrt{\Delta}}{2a}$ 
and $x_+=\frac{-b+ \sqrt{\Delta}}{2a}$ when $a\neq 0$.
If the discriminant is stricly positive and $a\neq 0$, i.e. if the roots 
of the quadratic are real, the partial derivatives of the 
roots are the following :
\begin{eqnarray}
\frac{\partial x_-}{\partial a} &=& \frac{c}{a\sqrt{\Delta}} + \frac{b+\sqrt{\Delta}}{2a^2}, \qquad a\neq 0, \qquad \Delta\neq 0\\
\frac{\partial x_+}{\partial a} &=& -\frac{c}{a\sqrt{\Delta}} + \frac{b-\sqrt{\Delta}}{2a^2}\\
\frac{\partial x_-}{\partial b} &=& \frac{-1-b/\sqrt{\Delta}}{2a}\\
\frac{\partial x_+}{\partial b} &=& \frac{-1+b/\sqrt{\Delta}}{2a}\\
\frac{\partial x_-}{\partial c} &=& \frac{1}{\sqrt{\Delta}}\\
\frac{\partial x_+}{\partial c} &=& -\frac{1}{\sqrt{\Delta}}
\end{eqnarray}

If the discriminant is zero, the partial derivatives of the 
double real root are the following :
\begin{eqnarray}
\frac{\partial x_\pm}{\partial a} &=& \frac{b}{2a^2}, \qquad a\neq 0\\
\frac{\partial x_\pm}{\partial b} &=& \frac{-1}{2a}\\
\frac{\partial x_\pm}{\partial c} &=& 0
\end{eqnarray}

The partial derivates indicate that if $a\approx 0$ or $\Delta\approx 0$,
the problem is ill-conditionned. 



\subsubsection{Floating-Point implementation : fixing rounding error}

In this section, we show how to compute the roots of a 
quadratic equation with protection against rounding 
errors, protection against overflow and a minimum 
amount of multiplications and divisions.

Few but important references deals with floating point
implementations of the roots of a quadratic polynomial.
These references include the important paper \cite{WhatEveryComputerScientist} by Golberg, 
the Numerical Recipes \cite{NumericalRecipes}, chapter 5, section 5.6
and \cite{FORSYTHE1991}, \cite{Nievergelt2003}, \cite{Kahan2004}.

The starting point is the mathematical solution of the quadratic equation, 
depending on the sign of the discriminant $\Delta=b^2 - 4ac$ :
\begin{itemize}
\item If $\Delta> 0$, there are two real roots, 
\begin{eqnarray}
x_\pm &=& \frac{-b\pm \sqrt{\Delta}}{2a}, \qquad a\neq 0
\end{eqnarray}
\item If $\Delta=0$, there are one double root,
\begin{eqnarray}
x_\pm &=& -\frac{b}{2a}, \qquad a\neq 0
\end{eqnarray}
\item If $\Delta< 0$, 
\begin{eqnarray}
x_\pm &=&\frac{-b}{2a} \pm i \frac{\sqrt{-\Delta}}{2a}, \qquad a\neq 0
\end{eqnarray}
\end{itemize}


In the following, we make the hypothesis that $a\neq 0$.

The previous experiments suggest that the floating point implementation
must deal with two different problems :
\begin{itemize}
\item rounding errors when $b^2\approx 4ac$ because of the cancelation of the 
terms which have opposite signs,
\item overflow in the computation of the discriminant $\Delta$ when $b$ is 
large in magnitude with respect to $a$ and $c$.
\end{itemize}

When $\Delta>0$, the rounding error problem can be splitted in two cases
\begin{itemize}
\item if $b<0$, then $-b+\sqrt{b^2-4ac}$ may suffer of rounding errors,
\item if $b>0$, then $-b-\sqrt{b^2-4ac}$ may suffer of rounding errors.
\end{itemize}
 
Obviously, the rounding problem will not appear when $\Delta<0$,
since the complex roots do not use the sum $-b+\sqrt{b^2-4ac}$.
When $\Delta=0$, the double root does not cause further trouble.
The rounding error problem must be solved only when $\Delta>0$ and the 
equation has two real roots.

A possible solution may found in combining the following expressions for the 
roots 
\begin{eqnarray}
x_- &=& \frac{-b- \sqrt{b^2-4ac}}{2a}, \label{real:x-2}\\
x_- &=& \frac{2c}{-b+ \sqrt{b^2-4ac}}, \label{real:x-inverse2}\\
x_+ &=& \frac{-b+ \sqrt{b^2-4ac}}{2a}, \label{real:x+2}\\
x_+ &=& \frac{2c}{-b- \sqrt{b^2-4ac}} \label{real:x+inverse2}
\end{eqnarray}

The trick is to pick the formula so that the sign of $b$ is the 
same as the sign of the square root.

The following choice allow to solve the rounding error problem 
\begin{itemize}
\item compute $x_-$ : if $b<0$, then compute $x_-$ from \ref{real:x-inverse2}, else 
(if $b>0$), compute $x_-$ from \ref{real:x-2},
\item compute $x_+$ : if $b<0$, then compute $x_+$ from \ref{real:x+2}, else 
(if $b>0$), compute $x_+$ from \ref{real:x+inverse2}.
\end{itemize}

The solution of the rounding error problem  can be adressed, by considering the 
modified Fagnano formulas
\begin{eqnarray}
x_1 &=& -\frac{2c}{b+sgn(b)\sqrt{b^2-4ac}}, \\
x_2 &=& -\frac{b+sgn(b)\sqrt{b^2-4ac}}{2a}, 
\end{eqnarray}
where 
\begin{eqnarray}
sgn(b)=\left\{\begin{array}{l}
1, \textrm{ if } b\geq 0,\\
-1, \textrm{ if } b< 0,
\end{array}\right.
\end{eqnarray}
The roots $x_{1,2}$ correspond to $x_{+,-}$ so that if $b<0$, $x_1=x_-$ and
if $b>0$, $x_1=x_+$. On the other hand, if $b<0$, $x_2=x_+$ and
if $b>0$, $x_2=x_-$.

An additionnal remark is that the division by two (and the multiplication
by 2) is exact with floating point numbers so these operations
cannot be a source of problem. But it is 
interesting to use $b/2$, which involves only one division, instead
of the three multiplications $2*c$, $2*a$ and $4*a*c$.
This leads to the following expressions of the real roots 
\begin{eqnarray}
x_- &=& -\frac{c}{(b/2)+sgn(b)\sqrt{(b/2)^2-ac}}, \\
x_+ &=& -\frac{(b/2)+sgn(b)\sqrt{(b/2)^2-ac}}{a}, 
\end{eqnarray}
which can be simplified into
\begin{eqnarray}
b'&=&b/2\\
h&=& -\left(b'+sgn(b)\sqrt{b'^2-ac}\right)\\
x_1 &=& \frac{c}{h}, \\
x_2 &=& \frac{h}{a}, 
\end{eqnarray}
where the discriminant is positive, i.e. $b'^2-ac>0$.

One can use the same value $b'=b/2$ with the complex roots in the 
case where the discriminant is negative, i.e. $b'^2-ac<0$ :
\begin{eqnarray}
x_1 &=& -\frac{b'}{a} - i \frac{\sqrt{ac-b'^2}}{a}, \\
x_2 &=& -\frac{b'}{a} + i \frac{\sqrt{ac-b'^2}}{a}, 
\end{eqnarray}

A more robust algorithm, based on the previous analysis is presented in figure \ref{robust-quadratic}.
By comparing \ref{naive-quadratic} and \ref{robust-quadratic}, we can see that 
the algorithms are different in many points.

\begin{figure}[htbp]
\begin{algorithmic}
\IF {$a=0$}
        \IF {$b=0$}
            \STATE $x_-\gets 0$
            \STATE $x_+\gets 0$
        \ELSE
            \STATE $x_-\gets -c/b$
            \STATE $x_+\gets 0$
        \ENDIF
\ELSIF {$c=0$}
        \STATE $x_-\gets -b/a$
        \STATE $x_+\gets 0$        
\ELSE
        \STATE $b'\gets b/2$
        \STATE $\Delta\gets b'^2 - ac$
        \IF {$\Delta<0$}
                \STATE $s\gets \sqrt{-\Delta}$
                \STATE $x_1^R\gets -b'/a$
                \STATE $x_1^I\gets -s/a$
                \STATE $x_2^R\gets x_-^R$
                \STATE $x_2^I\gets -x_1^I$
        \ELSIF {$\Delta=0$}
                \STATE $x_1\gets -b'/a$
                \STATE $x_2\gets x_2$
        \ELSE
                \STATE $s\gets \sqrt{\Delta}$
                \IF {$b>0$}
                    \STATE $g=1$
                \ELSE
                    \STATE $g=-1$
                \ENDIF
                \STATE $h=-(b'+g*s)$
                \STATE $x_1\gets c/h$
                \STATE $x_2\gets h/a$
        \ENDIF
\ENDIF 
\end{algorithmic}
\caption{A more robust algorithm to compute the roots of a quadratic equation}
\label{robust-quadratic}
\end{figure}

\subsubsection{Floating-Point implementation : fixing overflow problems}

The remaining problem is to compute $b'^2-ac$ without creating 
unnecessary overflows. 

Notice that a small improvment
has allread been done : if $|b|$ is close to the upper bound $10^{154}$, 
then $|b'|$ may be less difficult to process since $|b'|=|b|/2 < |b|$.
One can then compute the square root by using normalization methods, 
so that the overflow problem can be drastically reduced.
The method is based on the fact that the term $b'^2-ac$ can be 
evaluted with two equivalent formulas
\begin{eqnarray}
b'^2-ac &=& b'^2\left[1-(a/b')(c/b')\right] \\
b'^2-ac &=& c\left[b'(b'/c) - a\right]
\end{eqnarray}

\begin{itemize}
\item If $|b'|>|c|>0$, then the expression involving $\left(1-(a/b')(c/b')\right)$
is so that no overflow is possible since $|c/b'| < 1$ and the problem occurs
only when $b$ is large in magnitude with respect to $a$ and $c$.
\item If $|c|>|b'|>0$, then the expression involving $\left(b'(b'/c) - a\right)$
should limit the possible overflows since $|b'/c| < 1$.
\end{itemize}
These normalization tricks are similar to the one used by Smith in the 
algorithm for the division of complex numbers \cite{Smith1962}.

\subsection{References}

The 1966 technical report by G. Forsythe \cite{Forsythe1966} 
presents the floating point system and the possible large error 
in using mathematical algorithms blindly. An accurate way of solving 
a quadratic is outlined. A few general remarks are made about 
computational mathematics. The 1991 paper by Goldberg 
\cite{WhatEveryComputerScientist} is a general presentation of the floating
point system and its consequences. It begins with background on floating point 
representation and rounding errors, continues with a discussion
of the IEEE floating point standard and concludes with examples of how
computer system builders can better support floating point. The section
1.4, "Cancellation" specificaly consider the computation of the roots
of a quadratic equation.
One can also consult the experiments performed by Nievergelt in \cite{Nievergelt2003}.