diff options
author | Michaël Baudin <michael.baudin@scilab.org> | 2009-09-30 11:33:43 +0200 |
---|---|---|
committer | Michaël Baudin <michael.baudin@scilab.org> | 2009-09-30 11:33:43 +0200 |
commit | 00bde02012a3d3754e2d009f4c7fef14fb36164f (patch) | |
tree | 22e134db1ae6c7c78e3428f3c2f9235ec593ed44 /scilab_doc | |
parent | c42ef16122f4d9aa75ea453d97f7fc2d9de28c03 (diff) | |
download | scilab-00bde02012a3d3754e2d009f4c7fef14fb36164f.zip scilab-00bde02012a3d3754e2d009f4c7fef14fb36164f.tar.gz |
Fixed problems in intro and simplex. Added figures for help and demos.
Diffstat (limited to 'scilab_doc')
-rw-r--r-- | scilab_doc/neldermead/introduction-demos.png | bin | 0 -> 37427 bytes | |||
-rw-r--r-- | scilab_doc/neldermead/introduction-help-fminsearch.png | bin | 0 -> 41211 bytes | |||
-rw-r--r-- | scilab_doc/neldermead/introduction-help.png | bin | 0 -> 62814 bytes | |||
-rw-r--r-- | scilab_doc/neldermead/introduction.tex | 315 | ||||
-rw-r--r-- | scilab_doc/neldermead/method-neldermead.tex | 36 | ||||
-rw-r--r-- | scilab_doc/neldermead/method-spendley.tex | 10 | ||||
-rw-r--r-- | scilab_doc/neldermead/neldermead-introduction-so.pdf | bin | 0 -> 286702 bytes | |||
-rw-r--r-- | scilab_doc/neldermead/neldermead-introduction-so.tex | 85 | ||||
-rw-r--r-- | scilab_doc/neldermead/neldermead-simplex-so.pdf | bin | 181032 -> 184133 bytes | |||
-rw-r--r-- | scilab_doc/neldermead/neldermead-spendley-so.pdf | bin | 317287 -> 323141 bytes | |||
-rw-r--r-- | scilab_doc/neldermead/neldermead.bib | 1 | ||||
-rw-r--r-- | scilab_doc/neldermead/neldermead.pdf | bin | 971928 -> 983277 bytes | |||
-rw-r--r-- | scilab_doc/neldermead/neldermead.tex | 51 | ||||
-rw-r--r-- | scilab_doc/neldermead/overview.tex | 74 | ||||
-rw-r--r-- | scilab_doc/neldermead/section-simplex.tex | 95 |
15 files changed, 500 insertions, 167 deletions
diff --git a/scilab_doc/neldermead/introduction-demos.png b/scilab_doc/neldermead/introduction-demos.png new file mode 100644 index 0000000..2cd8327 --- /dev/null +++ b/scilab_doc/neldermead/introduction-demos.png | |||
Binary files differ | |||
diff --git a/scilab_doc/neldermead/introduction-help-fminsearch.png b/scilab_doc/neldermead/introduction-help-fminsearch.png new file mode 100644 index 0000000..58fe455 --- /dev/null +++ b/scilab_doc/neldermead/introduction-help-fminsearch.png | |||
Binary files differ | |||
diff --git a/scilab_doc/neldermead/introduction-help.png b/scilab_doc/neldermead/introduction-help.png new file mode 100644 index 0000000..2740b7b --- /dev/null +++ b/scilab_doc/neldermead/introduction-help.png | |||
Binary files differ | |||
diff --git a/scilab_doc/neldermead/introduction.tex b/scilab_doc/neldermead/introduction.tex index 303ef25..11b20ef 100644 --- a/scilab_doc/neldermead/introduction.tex +++ b/scilab_doc/neldermead/introduction.tex | |||
@@ -1,6 +1,12 @@ | |||
1 | \chapter{Introduction} | 1 | \chapter{Introduction} |
2 | 2 | ||
3 | The Nelder-Mead simplex algorithm, published in 1965, is an enormously | 3 | In this introductory chapter, we make an overview of simplex-based algorithms. |
4 | We present the main features of the \scifunction{neldermead} component, and | ||
5 | show how to use the component with a simple example. | ||
6 | |||
7 | \section{Overview} | ||
8 | |||
9 | The Nelder-Mead simplex algorithm \cite{citeulike:3009487}, published in 1965, is an enormously | ||
4 | popular search method for multidimensional unconstrained optimization. | 10 | popular search method for multidimensional unconstrained optimization. |
5 | The Nelder-Mead algorithm should not be confused with the (probably) | 11 | The Nelder-Mead algorithm should not be confused with the (probably) |
6 | more famous simplex algorithm of Dantzig for linear programming. The | 12 | more famous simplex algorithm of Dantzig for linear programming. The |
@@ -8,7 +14,7 @@ Nelder-Mead algorithm is especially popular in the fields of chemistry, | |||
8 | chemical engineering, and medicine. Two measures of the ubiquity of the | 14 | chemical engineering, and medicine. Two measures of the ubiquity of the |
9 | Nelder-Mead algorithm are that it appears in the best-selling handbook | 15 | Nelder-Mead algorithm are that it appears in the best-selling handbook |
10 | Numerical Recipes and in Matlab. In \cite{Torczon89multi-directionalsearch}, | 16 | Numerical Recipes and in Matlab. In \cite{Torczon89multi-directionalsearch}, |
11 | Virginia Torczon writes : "Margaret Wright has stated that over | 17 | Virginia Torczon writes: "Margaret Wright has stated that over |
12 | fifty percent of the calls received by the support group for the NAG | 18 | fifty percent of the calls received by the support group for the NAG |
13 | software library concerned the version of the Nelder-Mead | 19 | software library concerned the version of the Nelder-Mead |
14 | simplex algorithm to be found in that library". No derivative of the cost function is | 20 | simplex algorithm to be found in that library". No derivative of the cost function is |
@@ -23,13 +29,18 @@ function values. The algorithm attemps to replace the worst vertex with | |||
23 | a new point, which depends on the worst point and the centre of the best | 29 | a new point, which depends on the worst point and the centre of the best |
24 | vertices. | 30 | vertices. |
25 | 31 | ||
26 | The goal of this toolbox is to provide a Nelder-Mead direct search optimization method to solve the | 32 | The goal of this toolbox is to provide a Nelder-Mead (1965) direct search optimization method to solve the |
27 | following unconstrained optimization problem | 33 | following unconstrained optimization problem |
28 | \begin{eqnarray} | 34 | \begin{eqnarray} |
29 | \min f(x) | 35 | \min f(x) |
30 | \end{eqnarray} | 36 | \end{eqnarray} |
31 | where $x\in \RR^n$ where $n$ is the number of optimization parameters. | 37 | where $x\in \RR^n$, $n$ is the number of optimization parameters and $f$ is the objective |
32 | The Box complex algorithm, which is an extension of Nelder-Mead's algorithm, solves the | 38 | function $f:\RR^n\rightarrow \RR$. |
39 | In order to solve the unconstrained optimization problem, the Nelder-Mead | ||
40 | algorithm uses a variable shape simplex. The toolbox also provide Spendley's et al. | ||
41 | algorithm \cite{Spendley1962} (1962), which uses a fixed shape simplex. Historically, the algorithm created | ||
42 | by Nelder and Mead was designed as an improvement on Spendley's et al. algorithm. | ||
43 | The Box complex algorithm \cite{Box1965} (1965), which is an extension of Spendley's et al. algorithm, solves the | ||
33 | following constrained problem | 44 | following constrained problem |
34 | \begin{eqnarray} | 45 | \begin{eqnarray} |
35 | &&\min f(x)\\ | 46 | &&\min f(x)\\ |
@@ -46,52 +57,308 @@ The Nelder-Mead algorithm may be used in the following optimization context : | |||
46 | \item there are bounds and/or non linear constraints. | 57 | \item there are bounds and/or non linear constraints. |
47 | \end{itemize} | 58 | \end{itemize} |
48 | 59 | ||
49 | The internal design of the system is based on the following components : | 60 | The internal design of the system is based on the following components. |
50 | \begin{itemize} | 61 | \begin{itemize} |
51 | \item "neldermead" provides various Nelder-Mead variants and manages for Nelder-Mead specific settings, such as the method to compute the initial simplex, the specific termination criteria, | 62 | \item The "neldermead" component provides various simplex-based |
52 | \item "fminsearch" provides a Scilab commands which aims at behaving as Matlab's fminsearch. Specific terminations criteria, initial simplex and auxiliary settings are automatically configured so that the behaviour of Matlab's fminsearch is exactly reproduced. | 63 | algorithms and manages for Nelder-Mead specific settings, such as the |
53 | \item "optimset", "optimget" provides Scilab commands to emulate their Matlab counterparts. | 64 | method to compute the initial simplex and the specific termination |
54 | \item "nmplot" provides a high-level component which provides directly output pictures for Nelder-Mead algorithm. | 65 | criteria. |
66 | \item The "fminsearch" component provides a Scilab commands which aims | ||
67 | at behaving as Matlab's fminsearch. Specific terminations criteria, | ||
68 | initial simplex and auxiliary settings are automatically configured so | ||
69 | that the behaviour of Matlab's fminsearch is exactly reproduced. | ||
70 | \item The "optimset" and "optimget" components provide Scilab commands | ||
71 | to emulate their Matlab counterparts. | ||
72 | \item The "nmplot" component provides features to | ||
73 | produce directly output pictures for Nelder-Mead algorithm. | ||
55 | \end{itemize} | 74 | \end{itemize} |
56 | The current toolbox is based on (and therefore requires) the following toolboxes | 75 | The current toolbox is based on (and therefore requires) the following components. |
57 | \begin{itemize} | 76 | \begin{itemize} |
58 | \item "optimbase" provides an abstract class for a general optimization component, including the number of variables, the minimum and maximum bounds, the number of non linear inequality constraints, the loggin system, various termination criteria, the cost function, etc... | 77 | \item The "optimbase" component provides an abstract class for a general optimization |
59 | \item "optimsimplex" provides a class to manage a simplex made of an arbitrary number of vertices, including the computation of a simplex by various methods (axes, regular, Pfeffer's, randomized bounds), the computation of the size by various methods (diameter, sigma +, sigma-, etc...), | 78 | component, including the number of variables, the minimum and maximum |
79 | bounds, the number of non linear inequality constraints, the logging | ||
80 | system, various termination criteria, the cost function, etc... | ||
81 | \item The "optimsimplex" component provides a class to manage a simplex made of an | ||
82 | arbitrary number of vertices, including the computation of a simplex by | ||
83 | various methods (axes, regular, Pfeffer's, randomized bounds), the | ||
84 | computation of the size by various methods (diameter, sigma +, sigma-, | ||
85 | etc...) and many algorithms to perform reflections and shrinkages. | ||
60 | \end{itemize} | 86 | \end{itemize} |
61 | 87 | ||
62 | The following is a list of features the Nelder-Mead prototype algorithm currently provides : | 88 | The following is a list of features the Nelder-Mead algorithm currently provides : |
63 | \begin{itemize} | 89 | \begin{itemize} |
64 | \item Manage various simplex initializations | 90 | \item manage various simplex initializations |
65 | \begin{itemize} | 91 | \begin{itemize} |
66 | \item initial simplex given by user, | 92 | \item initial simplex given by user, |
67 | \item initial simplex computed with a length and along the coordinate axes, | 93 | \item initial simplex computed with a length and along the coordinate axes, |
68 | \item initial regular simplex computed with Spendley et al. formula | 94 | \item initial regular simplex computed with Spendley et al. formula |
69 | \item initial simplex computed by a small perturbation around the initial guess point | 95 | \item initial simplex computed by a small perturbation around the initial guess point |
70 | \end{itemize} | 96 | \end{itemize} |
71 | \item Manage cost function | 97 | \item manage cost function |
72 | \begin{itemize} | 98 | \begin{itemize} |
73 | \item optionnal additionnal argument | 99 | \item optionnal additionnal argument |
74 | \item direct communication of the task to perform : cost function or inequality constraints | 100 | \item direct communication of the task to perform : cost function or inequality constraints |
75 | \end{itemize} | 101 | \end{itemize} |
76 | \item Manage various termination criteria, including maximum number of iterations, tolerance on function value (relative or absolute), | 102 | \item manage various termination criteria |
77 | \begin{itemize} | 103 | \begin{itemize} |
104 | \item maximum number of iterations, | ||
105 | \item tolerance on function value (relative or absolute), | ||
78 | \item tolerance on x (relative or absolute), | 106 | \item tolerance on x (relative or absolute), |
79 | \item tolerance on standard deviation of function value (original termination criteria in [3]), | 107 | \item tolerance on standard deviation of function value (original termination criteria in [3]), |
80 | \item maximum number of evaluations of cost function, | 108 | \item maximum number of evaluations of cost function, |
81 | \item absolute or relative simplex size, | 109 | \item absolute or relative simplex size, |
82 | \end{itemize} | 110 | \end{itemize} |
83 | \item Manage the history of the convergence, including | 111 | \item manage the history of the convergence, including : |
112 | \begin{itemize} | ||
113 | \item the history of function values, | ||
114 | \item the history of optimum point, | ||
115 | \item the history of simplices, | ||
116 | \item the history of termination criterias, | ||
117 | \end{itemize} | ||
118 | \item provide a plot command which allows to graphically see the history of the simplices toward the optimum, | ||
119 | \item provide query functions for | ||
84 | \begin{itemize} | 120 | \begin{itemize} |
85 | \item history of function values, | 121 | \item the status of the optimization process, |
86 | \item history of optimum point, | 122 | \item the number of iterations, |
87 | \item history of simplices, | 123 | \item the number of function evaluations, |
88 | \item history of termination criterias, | 124 | \item the status of execution, |
125 | \item the function value at initial point, | ||
126 | \item the function value at optimal point, | ||
127 | \item etc... | ||
89 | \end{itemize} | 128 | \end{itemize} |
90 | \item Provide a plot command which allows to graphically see the history of the simplices toward the optimum, | ||
91 | \item Provide query features for the status of the optimization process number of iterations, number of function evaluations, status of execution, function value at initial point, function value at optimal point, etc... | ||
92 | \item Spendley et al. fixed shaped algorithm, | 129 | \item Spendley et al. fixed shaped algorithm, |
93 | \item Kelley restart based on simplex gradient, | 130 | \item Kelley restart based on simplex gradient, |
94 | \item O'Neill restart based on factorial search around optimum, | 131 | \item O'Neill restart based on factorial search around optimum, |
95 | \item Box-like method managing bounds and nonlinear inequality constraints based on arbitrary number of vertices in the simplex. | 132 | \item Box-like method managing bounds and nonlinear inequality constraints based on arbitrary number of vertices in the simplex. |
96 | \end{itemize} | 133 | \end{itemize} |
97 | 134 | ||
135 | \section{How to use the Toolbox} | ||
136 | |||
137 | The design of the toolbox is based on the creation of | ||
138 | a new token by the \scifunction{neldermead\_new} function. | ||
139 | The Nelder-Mead object associated with this token can then | ||
140 | be configured with \scifunction{neldermead\_configure} and queried | ||
141 | with \scifunction{neldermead\_cget}. For example, the | ||
142 | \scifunction{neldermead\_configure} command allows to configure the | ||
143 | number of variables, the objective function and the initial guess. | ||
144 | |||
145 | The main command of the toolbox is the \scifunction{neldermead\_search} command, which | ||
146 | solves the optimization problem. After an optimization has been performed, | ||
147 | the \scifunction{neldermead\_get} command allows to retrieve the optimum $x^\star$, | ||
148 | as well as other parameters, such as the number of iterations performed, the number | ||
149 | of evaluations of the function, etc... | ||
150 | |||
151 | Once the optimization is finished, the \scifunction{neldermead\_destroy} function | ||
152 | deletes the object. | ||
153 | |||
154 | \section{An example} | ||
155 | |||
156 | In the following example, we search the minimum of the 2D Rosenbrock function \cite{citeulike:1903787}, | ||
157 | defined by | ||
158 | \begin{eqnarray} | ||
159 | f(x_1,x_2) = 100(x_2 - x_1)^2 + (1-x_1)^2 | ||
160 | \end{eqnarray} | ||
161 | |||
162 | The following Scilab script allows to find the solution of the problem. | ||
163 | We begin by defining the function \scifunction{rosenbrock} which computes the Rosenbrock function. | ||
164 | The traditionnal initial guess $(-1.2 , 1.0)$ is used, which corresponds | ||
165 | to the "-x0" key. The initial simplex is computed along | ||
166 | the axes with a length equal to 0.1. We want to use the Nelder-Mead algorithm with variable simplex size | ||
167 | is used, which corresponds to the "variable" value of the "-method" option. | ||
168 | The verbose mode is enabled so that messages are generated during the algorithm. | ||
169 | After the optimization is performed, the optimum is retrieved with quiery features. | ||
170 | |||
171 | \lstset{language=Scilab} | ||
172 | \lstset{numbers=left} | ||
173 | \lstset{basicstyle=\footnotesize} | ||
174 | \lstset{keywordstyle=\bfseries} | ||
175 | \begin{lstlisting} | ||
176 | function y = rosenbrock (x) | ||
177 | y = 100*(x(2)-x(1)^2)^2 + (1-x(1))^2; | ||
178 | endfunction | ||
179 | nm = neldermead_new (); | ||
180 | nm = neldermead_configure(nm,"-numberofvariables",2); | ||
181 | nm = neldermead_configure(nm,"-x0",[-1.2 1.0]'); | ||
182 | nm = neldermead_configure(nm,"-simplex0method","axes"); | ||
183 | nm = neldermead_configure(nm,"-simplex0length",0.1); | ||
184 | nm = neldermead_configure(nm,"-method","variable"); | ||
185 | nm = neldermead_configure(nm,"-verbose",1); | ||
186 | nm = neldermead_configure(nm,"-function",rosenbrock); | ||
187 | nm = neldermead_search(nm); | ||
188 | xopt = neldermead_get(nm,"-xopt") | ||
189 | fopt = neldermead_get(nm,"-fopt") | ||
190 | status = neldermead_get(nm,"-status") | ||
191 | nm = neldermead_destroy(nm); | ||
192 | \end{lstlisting} | ||
193 | |||
194 | This produces the following output. | ||
195 | |||
196 | \lstset{language=Scilab} | ||
197 | \lstset{numbers=left} | ||
198 | \lstset{basicstyle=\footnotesize} | ||
199 | \lstset{keywordstyle=\bfseries} | ||
200 | \begin{lstlisting} | ||
201 | -->nm = neldermead_search(nm); | ||
202 | Function Evaluation #1 is [24.2] at [-1.2 1] | ||
203 | Function Evaluation #1 is [24.2] at [-1.2 1] | ||
204 | Function Evaluation #2 is [8.82] at [-1.1 1] | ||
205 | Function Evaluation #3 is [16.4] at [-1.2 1.1] | ||
206 | Step #1 : order | ||
207 | ================================================================= | ||
208 | Iteration #1 (total = 1) | ||
209 | Function Eval #3 | ||
210 | Xopt : -1.1 1 | ||
211 | Fopt : 8.820000e+000 | ||
212 | DeltaFv : 1.538000e+001 | ||
213 | Center : -1.1666667 1.0333333 | ||
214 | Size : 1.414214e-001 | ||
215 | Vertex #1/3 : fv=8.820000e+000, x=-1.100000e+000 1.000000e+000 | ||
216 | Vertex #2/3 : fv=1.640000e+001, x=-1.200000e+000 1.100000e+000 | ||
217 | Vertex #3/3 : fv=2.420000e+001, x=-1.200000e+000 1.000000e+000 | ||
218 | Reflect | ||
219 | xbar=-1.15 1.05 | ||
220 | Function Evaluation #4 is [5.62] at [-1.1 1.1] | ||
221 | xr=[-1.1 1.1], f(xr)=5.620000 | ||
222 | Expand | ||
223 | Function Evaluation #5 is [4.428125] at [-1.05 1.15] | ||
224 | xe=-1.05 1.15, f(xe)=4.428125 | ||
225 | > Perform Expansion | ||
226 | Sort | ||
227 | [...] | ||
228 | ================================================================= | ||
229 | Iteration #56 (total = 56) | ||
230 | Function Eval #98 | ||
231 | Xopt : 0.6537880 0.4402918 | ||
232 | Fopt : 1.363828e-001 | ||
233 | DeltaFv : 1.309875e-002 | ||
234 | Center : 0.6788120 0.4503999 | ||
235 | Size : 6.945988e-002 | ||
236 | Vertex #1/3 : fv=1.363828e-001, x=6.537880e-001 4.402918e-001 | ||
237 | Vertex #2/3 : fv=1.474625e-001, x=7.107987e-001 4.799712e-001 | ||
238 | Vertex #3/3 : fv=1.494816e-001, x=6.718493e-001 4.309367e-001 | ||
239 | Reflect | ||
240 | xbar=0.6822933 0.4601315 | ||
241 | Function Evaluation #99 is [0.1033237] at [0.6927374 0.4893262] | ||
242 | xr=[0.6927374 0.4893262], f(xr)=0.103324 | ||
243 | Expand | ||
244 | Function Evaluation #100 is [0.1459740] at [0.7031815 0.5185210] | ||
245 | xe=0.7031815 0.5185210, f(xe)=0.145974 | ||
246 | > Perform reflection | ||
247 | Sort | ||
248 | ================================================================= | ||
249 | Iteration #57 (total = 57) | ||
250 | Function Eval #100 | ||
251 | Xopt : 0.6927374 0.4893262 | ||
252 | Fopt : 1.033237e-001 | ||
253 | DeltaFv : 4.413878e-002 | ||
254 | Center : 0.6857747 0.4698631 | ||
255 | Size : 6.262139e-002 | ||
256 | Vertex #1/3 : fv=1.033237e-001, x=6.927374e-001 4.893262e-001 | ||
257 | Vertex #2/3 : fv=1.363828e-001, x=6.537880e-001 4.402918e-001 | ||
258 | Vertex #3/3 : fv=1.474625e-001, x=7.107987e-001 4.799712e-001 | ||
259 | Terminate with status : maxfuneval | ||
260 | -->xopt = neldermead_get(nm,"-xopt") | ||
261 | xopt = | ||
262 | |||
263 | 0.6927374 | ||
264 | 0.4893262 | ||
265 | |||
266 | -->fopt = neldermead_get(nm,"-fopt") | ||
267 | fopt = | ||
268 | |||
269 | 0.1033237 | ||
270 | |||
271 | -->status = neldermead_get(nm,"-status") | ||
272 | status = | ||
273 | |||
274 | maxfuneval | ||
275 | \end{lstlisting} | ||
276 | |||
277 | \section{Help, demonstrations and unit tests} | ||
278 | |||
279 | For a complete presentation of the functions and options, the reader | ||
280 | should consult the help which is provided with the component. | ||
281 | The main menu of the help associated with the optimization | ||
282 | module is presented in figures \ref{fig-intro-help} and \ref{fig-intro-helpfminsearch}. | ||
283 | The corresponding pages provide a complete documentation for the | ||
284 | corresponding functions, as well as many sample uses. | ||
285 | |||
286 | \begin{figure} | ||
287 | \begin{center} | ||
288 | \includegraphics[width=15cm]{introduction-help.png} | ||
289 | \end{center} | ||
290 | \caption{Built-in help for the Nelder-Mead component} | ||
291 | \label{fig-intro-help} | ||
292 | \end{figure} | ||
293 | |||
294 | \begin{figure} | ||
295 | \begin{center} | ||
296 | \includegraphics[width=15cm]{introduction-help-fminsearch.png} | ||
297 | \end{center} | ||
298 | \caption{Built-in help for the \scifunction{fminsearch} function} | ||
299 | \label{fig-intro-helpfminsearch} | ||
300 | \end{figure} | ||
301 | |||
302 | Several demonstrations are provided with the component. These | ||
303 | are available from the "Demonstration" menu of the Scilab console | ||
304 | and are presented in figure \ref{fig-intro-demos}. | ||
305 | |||
306 | \begin{figure} | ||
307 | \begin{center} | ||
308 | \includegraphics[width=10cm]{introduction-demos.png} | ||
309 | \end{center} | ||
310 | \caption{Built-in demonstration scripts for the Nelder-Mead component} | ||
311 | \label{fig-intro-demos} | ||
312 | \end{figure} | ||
313 | |||
314 | The following script shows where the demonstration scripts are | ||
315 | available from the Scilab installation directory. | ||
316 | |||
317 | \lstset{language=Scilab} | ||
318 | \lstset{numbers=left} | ||
319 | \lstset{basicstyle=\footnotesize} | ||
320 | \lstset{keywordstyle=\bfseries} | ||
321 | \begin{lstlisting} | ||
322 | -->cd SCI/modules/optimization/demos/neldermead | ||
323 | ans = | ||
324 | |||
325 | D:\Programs\SCFD8E~1\modules\optimization\demos\neldermead | ||
326 | |||
327 | -->ls *.sce | ||
328 | ans = | ||
329 | |||
330 | !nmplot_rosenbrock.sce ! | ||
331 | ! ! | ||
332 | !nmplot_rosenbrock.fixed.sce ! | ||
333 | ! ! | ||
334 | !nmplot_quadratic.fixed.sce ! | ||
335 | ! ! | ||
336 | !nmplot_mckinnon2.sce ! | ||
337 | ! ! | ||
338 | !nmplot_mckinnon.sce ! | ||
339 | ! ! | ||
340 | !nmplot_han2.sce ! | ||
341 | ! ! | ||
342 | !nmplot_han1.sce ! | ||
343 | ! ! | ||
344 | !nmplot_boxproblemA.sce ! | ||
345 | ! ! | ||
346 | !neldermead_rosenbrock.sce ! | ||
347 | ! ! | ||
348 | !neldermead.dem.sce ! | ||
349 | ! ! | ||
350 | !fminsearch.sce ! | ||
351 | \end{lstlisting} | ||
352 | |||
353 | These components were developped based on unit tests, which are | ||
354 | provided with Scilab. | ||
355 | These unit tests are located in the "SCI/modules/optimization/tests/unit\_tests" | ||
356 | directory, under the "neldermead", "optimsimplex" and "optimbase" directories. | ||
357 | Each unit test correspond to a .tst file. These tests are covering most | ||
358 | (if not all) the features provided by the components. This is why there are | ||
359 | a good source of information on how to use the functions. | ||
360 | |||
361 | |||
362 | |||
363 | |||
364 | |||
diff --git a/scilab_doc/neldermead/method-neldermead.tex b/scilab_doc/neldermead/method-neldermead.tex index 22529dc..8e9fd1f 100644 --- a/scilab_doc/neldermead/method-neldermead.tex +++ b/scilab_doc/neldermead/method-neldermead.tex | |||
@@ -1,10 +1,30 @@ | |||
1 | \chapter{Nelder-Mead method} | 1 | \chapter{Nelder-Mead method} |
2 | 2 | ||
3 | \section{Analysis} | 3 | In this chapter, we present Nelder and Mead's \cite{citeulike:3009487} algorithm. |
4 | We begin by the analysis of the algorithm, which is based on a variable shape simplex. | ||
5 | Then, we present geometric situations where the various steps of the algorithm | ||
6 | might be used. In the third part, we present the rate of convergence toward the optimum of | ||
7 | the Nelder-Mead algorithm. This part is mainly based on Han and Neumann's paper \cite{HanNeumann2006}, | ||
8 | which is based on a class of quadratic functions with a special initial | ||
9 | simplex. The core of this chapter is the analysis of several numerical | ||
10 | experiments which have been performed with the neldermead component. | ||
11 | We analyse the behaviour of the algorithm on quadratic functions and | ||
12 | present several counter examples where the Nelder-Mead algorithm is | ||
13 | known to fail. | ||
14 | |||
15 | \section{Introduction} | ||
16 | |||
17 | The goal of Nelder and Mead algorithm is to solve the | ||
18 | following unconstrained optimization problem | ||
19 | \begin{eqnarray} | ||
20 | \min f(x) | ||
21 | \end{eqnarray} | ||
22 | where $x\in \RR^n$, $n$ is the number of optimization parameters and $f$ is the objective | ||
23 | function $f:\RR^n\rightarrow \RR$. | ||
4 | 24 | ||
5 | The Nelder-Mead method is an improvment over the Spendley's et al. | 25 | The Nelder-Mead method is an improvment over the Spendley's et al. |
6 | method with the goal of allowing the simplex to vary in shape. | 26 | method with the goal of allowing the simplex to vary in shape. |
7 | The Nelder-Mead algorithm makes use of four parameters : the | 27 | The Nelder-Mead algorithm makes use of four parameters: the |
8 | coefficient of reflection $\rho$, expansion $\chi$, | 28 | coefficient of reflection $\rho$, expansion $\chi$, |
9 | contraction $\gamma$ and shrinkage $\sigma$. | 29 | contraction $\gamma$ and shrinkage $\sigma$. |
10 | When the expansion or contraction steps are performed, the shape | 30 | When the expansion or contraction steps are performed, the shape |
@@ -178,17 +198,17 @@ shrink steps are rare. | |||
178 | \label{fig-nm-moves-shrinkafterco} | 198 | \label{fig-nm-moves-shrinkafterco} |
179 | \end{figure} | 199 | \end{figure} |
180 | 200 | ||
181 | \subsection{Termination criteria} | 201 | %\subsection{Termination criteria} |
182 | 202 | ||
183 | TODO... | 203 | %TODO... |
184 | 204 | ||
185 | \section{Convergence on a quadratic} | 205 | \section{Convergence properties on a quadratic} |
186 | 206 | ||
187 | In this section, we reproduce one result | 207 | In this section, we reproduce one result |
188 | presented by Han and Neumann \cite{HanNeumann2006}, which states | 208 | presented by Han and Neumann \cite{HanNeumann2006}, which states |
189 | the rate of convergence toward the optimum on a class of quadratic functions with a special initial | 209 | the rate of convergence toward the optimum on a class of quadratic functions with a special initial |
190 | simplex. | 210 | simplex. |
191 | See also the Phd by Lixing Han in 2000 \cite{Han2000}. | 211 | Some additionnal results are also presented in the Phd thesis by Lixing Han in 2000 \cite{Han2000}. |
192 | We study a generalized quadratic and use a particular | 212 | We study a generalized quadratic and use a particular |
193 | initial simplex. We show that the vertices follow | 213 | initial simplex. We show that the vertices follow |
194 | a recurrence equation, which is associated with a caracteristic | 214 | a recurrence equation, which is associated with a caracteristic |
@@ -735,6 +755,8 @@ caracteristic equation} | |||
735 | \label{fig-nm-roots-variable} | 755 | \label{fig-nm-roots-variable} |
736 | \end{figure} | 756 | \end{figure} |
737 | 757 | ||
758 | |||
759 | |||
738 | \section{Numerical experiments} | 760 | \section{Numerical experiments} |
739 | 761 | ||
740 | In this section, we present some numerical experiments | 762 | In this section, we present some numerical experiments |
@@ -819,7 +841,7 @@ figure \ref{fig-nm-numexp1-logfopt}. | |||
819 | \label{fig-nm-numexp1-logfopt} | 841 | \label{fig-nm-numexp1-logfopt} |
820 | \end{figure} | 842 | \end{figure} |
821 | 843 | ||
822 | \subsection{Badly scaled quadratic function} | 844 | \subsubsection{Badly scaled quadratic function} |
823 | 845 | ||
824 | The function we try to minimize is the following quadratic | 846 | The function we try to minimize is the following quadratic |
825 | in 2 dimensions | 847 | in 2 dimensions |
diff --git a/scilab_doc/neldermead/method-spendley.tex b/scilab_doc/neldermead/method-spendley.tex index a3725c0..c8d58a0 100644 --- a/scilab_doc/neldermead/method-spendley.tex +++ b/scilab_doc/neldermead/method-spendley.tex | |||
@@ -12,7 +12,7 @@ are involving only 2 variables and are based on a quadratic function. | |||
12 | The last numerical experiment explores the behaviour of the algorith | 12 | The last numerical experiment explores the behaviour of the algorith |
13 | when the number of variables increases. | 13 | when the number of variables increases. |
14 | 14 | ||
15 | \section{Analysis} | 15 | \section{Introduction} |
16 | 16 | ||
17 | In this section, we present Spendley's et al algorithm for unconstrained optimization. | 17 | In this section, we present Spendley's et al algorithm for unconstrained optimization. |
18 | This algorithm is based on the iterative update of a simplex. | 18 | This algorithm is based on the iterative update of a simplex. |
@@ -24,6 +24,14 @@ or a shrink is performed in practice. | |||
24 | 24 | ||
25 | \subsection{Algorithm} | 25 | \subsection{Algorithm} |
26 | 26 | ||
27 | The goal of Spendley's et al. algorithm is to solve the | ||
28 | following unconstrained optimization problem | ||
29 | \begin{eqnarray} | ||
30 | \min f(x) | ||
31 | \end{eqnarray} | ||
32 | where $x\in \RR^n$, $n$ is the number of optimization parameters and $f$ is the objective | ||
33 | function $f:\RR^n\rightarrow \RR$. | ||
34 | |||
27 | The simplex algorithms are based on the iterative update of | 35 | The simplex algorithms are based on the iterative update of |
28 | a \emph{simplex} made of $n+1$ points $S=\{x_i\}_{i=1,n+1}$. Each point | 36 | a \emph{simplex} made of $n+1$ points $S=\{x_i\}_{i=1,n+1}$. Each point |
29 | in the simplex is called a \emph{vertex} and is associated with | 37 | in the simplex is called a \emph{vertex} and is associated with |
diff --git a/scilab_doc/neldermead/neldermead-introduction-so.pdf b/scilab_doc/neldermead/neldermead-introduction-so.pdf new file mode 100644 index 0000000..c2bf429 --- /dev/null +++ b/scilab_doc/neldermead/neldermead-introduction-so.pdf | |||
Binary files differ | |||
diff --git a/scilab_doc/neldermead/neldermead-introduction-so.tex b/scilab_doc/neldermead/neldermead-introduction-so.tex new file mode 100644 index 0000000..6a05c35 --- /dev/null +++ b/scilab_doc/neldermead/neldermead-introduction-so.tex | |||
@@ -0,0 +1,85 @@ | |||
1 | % | ||
2 | % neldermead.tex -- | ||
3 | % Some notes about Nelder-Mead algorithms. | ||
4 | % | ||
5 | % Copyright 2008-2009 Michael Baudin | ||
6 | % | ||
7 | \documentclass[12pt]{report} | ||
8 | |||
9 | \include{macros} | ||
10 | |||
11 | \begin{document} | ||
12 | %% User defined page headers | ||
13 | \pagestyle{fancyplain} | ||
14 | \renewcommand{\chaptermark}[1]{\markboth{\chaptername\ \thechapter. #1}{}} | ||
15 | \renewcommand{\sectionmark}[1]{\markright{\thesection. #1}} | ||
16 | \lhead[]{\fancyplain{}{\bfseries\leftmark}} | ||
17 | \rhead[]{\fancyplain{}{\bfseries\thepage}} | ||
18 | \cfoot{} | ||
19 | |||
20 | %% User defined figure legends | ||
21 | \makeatletter | ||
22 | \def\figurename{{\protect\sc \protect\small\bfseries Fig.}} | ||
23 | \def\f@ffrench{\protect\figurename\space{\protect\small\bf \thefigure}\space} | ||
24 | \let\fnum@figure\f@ffrench% | ||
25 | \let\captionORI\caption | ||
26 | \def\caption#1{\captionORI{\rm\small #1}} | ||
27 | \makeatother | ||
28 | |||
29 | %% First page | ||
30 | \thispagestyle{empty} | ||
31 | { | ||
32 | \begin{center} | ||
33 | %% Comment for DVI | ||
34 | \includegraphics[height=40mm]{scilab_logo} | ||
35 | \vskip4cm | ||
36 | |||
37 | %% Empty space between the box and the text | ||
38 | \fboxsep6mm | ||
39 | %% Box thickness | ||
40 | \fboxrule1.3pt | ||
41 | \Huge | ||
42 | $$\fbox{$ | ||
43 | \begin{array}{c} | ||
44 | \textbf{Nelder-Mead}\\ | ||
45 | \textbf{Toolbox Manual}\\ | ||
46 | \textbf{-- Introduction --}\\ | ||
47 | \end{array} | ||
48 | $} | ||
49 | $$ | ||
50 | \end{center} | ||
51 | \vskip4cm | ||
52 | |||
53 | \normalsize | ||
54 | |||
55 | \begin{flushright} | ||
56 | Version 0.2 \\ | ||
57 | September 2009 | ||
58 | \end{flushright} | ||
59 | |||
60 | \begin{flushright} | ||
61 | Micha\"el BAUDIN | ||
62 | \end{flushright} | ||
63 | |||
64 | \clearpage | ||
65 | |||
66 | %% Table of contents | ||
67 | \renewcommand{\baselinestretch}{1.30}\small \normalsize | ||
68 | |||
69 | \tableofcontents | ||
70 | |||
71 | \renewcommand{\baselinestretch}{1.18}\small \normalsize | ||
72 | |||
73 | \include{introduction} | ||
74 | |||
75 | \clearpage | ||
76 | |||
77 | |||
78 | %% Bibliography | ||
79 | |||
80 | \addcontentsline{toc}{chapter}{Bibliography} | ||
81 | \bibliographystyle{plain} | ||
82 | \bibliography{neldermead} | ||
83 | |||
84 | \end{document} | ||
85 | |||
diff --git a/scilab_doc/neldermead/neldermead-simplex-so.pdf b/scilab_doc/neldermead/neldermead-simplex-so.pdf index 8c94d37..6c69bb3 100644 --- a/scilab_doc/neldermead/neldermead-simplex-so.pdf +++ b/scilab_doc/neldermead/neldermead-simplex-so.pdf | |||
Binary files differ | |||
diff --git a/scilab_doc/neldermead/neldermead-spendley-so.pdf b/scilab_doc/neldermead/neldermead-spendley-so.pdf index 454e180..ecb3671 100644 --- a/scilab_doc/neldermead/neldermead-spendley-so.pdf +++ b/scilab_doc/neldermead/neldermead-spendley-so.pdf | |||
Binary files differ | |||
diff --git a/scilab_doc/neldermead/neldermead.bib b/scilab_doc/neldermead/neldermead.bib index 0ee9803..a92538f 100644 --- a/scilab_doc/neldermead/neldermead.bib +++ b/scilab_doc/neldermead/neldermead.bib | |||
@@ -27,7 +27,6 @@ doi = {10.1137/S1052623496303470} | |||
27 | posted-at = {2008-09-15 16:23:09}, | 27 | posted-at = {2008-09-15 16:23:09}, |
28 | priority = {2}, | 28 | priority = {2}, |
29 | title = {A Simplex Method for Function Minimization}, | 29 | title = {A Simplex Method for Function Minimization}, |
30 | note = {\url{http://dx.doi.org/10.1093/comjnl/7.4.308}}, | ||
31 | volume = {7}, | 30 | volume = {7}, |
32 | year = {1965} | 31 | year = {1965} |
33 | } | 32 | } |
diff --git a/scilab_doc/neldermead/neldermead.pdf b/scilab_doc/neldermead/neldermead.pdf index c36eb3f..86443c0 100644 --- a/scilab_doc/neldermead/neldermead.pdf +++ b/scilab_doc/neldermead/neldermead.pdf | |||
Binary files differ | |||
diff --git a/scilab_doc/neldermead/neldermead.tex b/scilab_doc/neldermead/neldermead.tex index c44550a..a199c9b 100644 --- a/scilab_doc/neldermead/neldermead.tex +++ b/scilab_doc/neldermead/neldermead.tex | |||
@@ -32,7 +32,7 @@ | |||
32 | \begin{center} | 32 | \begin{center} |
33 | %% Comment for DVI | 33 | %% Comment for DVI |
34 | \includegraphics[height=40mm]{scilab_logo} | 34 | \includegraphics[height=40mm]{scilab_logo} |
35 | \vskip4cm | 35 | \vskip2cm |
36 | 36 | ||
37 | %% Empty space between the box and the text | 37 | %% Empty space between the box and the text |
38 | \fboxsep6mm | 38 | \fboxsep6mm |
@@ -42,23 +42,49 @@ | |||
42 | $$\fbox{$ | 42 | $$\fbox{$ |
43 | \begin{array}{c} | 43 | \begin{array}{c} |
44 | \textbf{Nelder-Mead}\\ | 44 | \textbf{Nelder-Mead}\\ |
45 | \textbf{Toolbox Manual} | 45 | \textbf{User's Manual} |
46 | \end{array} | 46 | \end{array} |
47 | $} | 47 | $} |
48 | $$ | 48 | $$ |
49 | \end{center} | 49 | \end{center} |
50 | \vskip4cm | ||
51 | 50 | ||
52 | \normalsize | 51 | \vskip1cm |
52 | |||
53 | \begin{center} | ||
54 | \begin{large} | ||
55 | Micha\"el BAUDIN | ||
56 | \end{large} | ||
57 | \end{center} | ||
58 | |||
59 | \vskip2cm | ||
60 | |||
61 | \textbf{Abstract} | ||
62 | |||
63 | In this document, we present the Nelder-Mead component provided in Scilab. | ||
64 | The introduction gives a brief overview of the optimization features of the | ||
65 | component and present an introductory example. Then we present some theory | ||
66 | associated with the simplex, a geometric concept which is central in | ||
67 | the Nelder-Mead algorithm. We present several method to compute an | ||
68 | initial simplex. Then we present Spendley's et al. fixed shape unconstrained | ||
69 | optimization algorithm. Several numerical experiments are provided, which | ||
70 | shows how this algorithm performs on well-scaled and badly scaled quadratics. | ||
71 | In the final section, we present the Nelder-Mead variable shape | ||
72 | unconstrained optimization algorithm. Several numerical experiments are presented, | ||
73 | where some of these are counter examples, that is cases where the algorithms | ||
74 | fails to converge on a stationnary point. In the appendix of this document, | ||
75 | the interested reader will find a bibliography of simplex-based algorithms, | ||
76 | along with an analysis of the various implementations which are available | ||
77 | in several programming languages. | ||
78 | |||
79 | \vskip1cm | ||
80 | |||
53 | 81 | ||
54 | \begin{flushright} | 82 | \begin{flushright} |
55 | Version 0.2 \\ | 83 | Version 0.3 \\ |
56 | September 2009 | 84 | September 2009 |
57 | \end{flushright} | 85 | \end{flushright} |
58 | 86 | ||
59 | \begin{flushright} | 87 | |
60 | Micha\"el BAUDIN | ||
61 | \end{flushright} | ||
62 | 88 | ||
63 | \clearpage | 89 | \clearpage |
64 | 90 | ||
@@ -70,25 +96,18 @@ Micha\"el BAUDIN | |||
70 | \renewcommand{\baselinestretch}{1.18}\small \normalsize | 96 | \renewcommand{\baselinestretch}{1.18}\small \normalsize |
71 | 97 | ||
72 | \include{introduction} | 98 | \include{introduction} |
73 | |||
74 | \include{overview} | ||
75 | |||
76 | \include{section-simplex} | 99 | \include{section-simplex} |
77 | |||
78 | \include{method-spendley} | 100 | \include{method-spendley} |
79 | |||
80 | \include{method-neldermead} | 101 | \include{method-neldermead} |
81 | |||
82 | \include{conclusion} | 102 | \include{conclusion} |
83 | |||
84 | \include{acknowledgments} | 103 | \include{acknowledgments} |
85 | 104 | ||
86 | \clearpage | 105 | \clearpage |
87 | 106 | ||
88 | %% Appendix | 107 | %% Appendix |
89 | \appendix | 108 | \appendix |
90 | \include{nmbibliography} | ||
91 | 109 | ||
110 | \include{nmbibliography} | ||
92 | \include{implementations} | 111 | \include{implementations} |
93 | 112 | ||
94 | %% Bibliography | 113 | %% Bibliography |
diff --git a/scilab_doc/neldermead/overview.tex b/scilab_doc/neldermead/overview.tex deleted file mode 100644 index 9e044d1..0000000 --- a/scilab_doc/neldermead/overview.tex +++ /dev/null | |||
@@ -1,74 +0,0 @@ | |||
1 | \chapter{Overview} | ||
2 | |||
3 | In this section, we present the main commands of the Nelder-Mead | ||
4 | toolbox as well as an example of use. | ||
5 | |||
6 | |||
7 | \section{How to use the Toolbox} | ||
8 | |||
9 | The design of the toolbox is based on the creation of | ||
10 | a new token by the \scifunction{neldermead\_new} command. | ||
11 | The Nelder-Mead object associated with this token can then | ||
12 | be configured with \scifunction{neldermead\_configure} and queried | ||
13 | with \scifunction{neldermead\_cget}. To be more specific, the | ||
14 | \scifunction{neldermead\_configure} command allows to configure the | ||
15 | number of variables, the objective function and the initial guess. | ||
16 | |||
17 | The main command of the toolbox is the \scifunction{neldermead\_search} command, which | ||
18 | solves the optimization problem. After an optimization has been performed, | ||
19 | the \scifunction{neldermead\_get} command allows to retrieve the optimum $x^\star$, | ||
20 | as well as other parameters, such as the number of iterations performed, the number | ||
21 | of evaluations of the function, etc... | ||
22 | |||
23 | \section{An example} | ||
24 | |||
25 | In the following example, one searches the minimum of the 2D Rosenbrock function \cite{citeulike:1903787}, | ||
26 | defined by | ||
27 | \begin{eqnarray} | ||
28 | f(x_1,x_2) = 100(x_2 - x_1)^2 + (1-x_1)^2 | ||
29 | \end{eqnarray} | ||
30 | |||
31 | One begins by defining the function "rosenbrock" which computes the Rosenbrock function. | ||
32 | The traditionnal initial guess $(-1.2 , 1.0)$ is used. The initial simplex is computed along | ||
33 | the axes with a length equal to 0.1. The Nelder-Mead algorithm with variable simplex size | ||
34 | is used. The verbose mode is enabled so that messages are generated during the algorithm. | ||
35 | After the optimization is performed, the optimum is retrieved with quiery features. | ||
36 | |||
37 | \lstset{language=Scilab} | ||
38 | \lstset{numbers=left} | ||
39 | \lstset{basicstyle=\footnotesize} | ||
40 | \lstset{keywordstyle=\bfseries} | ||
41 | \begin{lstlisting} | ||
42 | |||
43 | function y = rosenbrock (x) | ||
44 | y = 100*(x(2)-x(1)^2)^2 + (1-x(1))^2; | ||
45 | endfunction | ||
46 | |||
47 | nm = neldermead_new (); | ||
48 | nm = neldermead_configure(nm,"-x0",[-1.2 1.0]'); | ||
49 | nm = neldermead_configure(nm,"-simplex0method","axes"); | ||
50 | nm = neldermead_configure(nm,"-simplex0length",0.1); | ||
51 | nm = neldermead_configure(nm,"-method","variable"); | ||
52 | nm = neldermead_configure(nm,"-verbose",1); | ||
53 | nm = neldermead_configure(nm,"-function",rosenbrock); | ||
54 | nm = neldermead_search(nm); | ||
55 | xopt = neldermead_get(nm,"-xopt"); | ||
56 | fopt = neldermead_get(nm,"-fopt"); | ||
57 | historyfopt = neldermead_get(nm,"-historyfopt"); | ||
58 | iterations = neldermead_get(nm,"-iterations"); | ||
59 | historyxopt = neldermead_get(nm,"-historyxopt"); | ||
60 | historysimplex = neldermead_get(nm,"-historysimplex"); | ||
61 | fx0 = neldermead_get(nm,"-fx0"); | ||
62 | status = neldermead_get(nm,"-status"); | ||
63 | nm = neldermead_destroy(nm); | ||
64 | \end{lstlisting} | ||
65 | |||
66 | The script makes the hypothesis that an environment variable | ||
67 | named TOOLBOX\_HOME contains the path to directory | ||
68 | which contains the toolbox, which is stored in the "neldermead" directory. | ||
69 | |||
70 | For a deeper presentation of the commands and options, the reader | ||
71 | should consult the help which is provided with the package. | ||
72 | |||
73 | |||
74 | |||
diff --git a/scilab_doc/neldermead/section-simplex.tex b/scilab_doc/neldermead/section-simplex.tex index b9fb8d3..c56bc53 100644 --- a/scilab_doc/neldermead/section-simplex.tex +++ b/scilab_doc/neldermead/section-simplex.tex | |||
@@ -4,14 +4,17 @@ In this section, we present the various definitions connected | |||
4 | to simplex algorithms. We introduce several methods to measure | 4 | to simplex algorithms. We introduce several methods to measure |
5 | the size of a simplex, including the oriented length. | 5 | the size of a simplex, including the oriented length. |
6 | We present several methods to compute an | 6 | We present several methods to compute an |
7 | initial simplex, for example the regular simplex used by Spendley et al.. | 7 | initial simplex, that is, the regular simplex used by Spendley et al., |
8 | We also present the simplex gradient, which is a forward or a centered | 8 | the axis-by-axis simplex, Pfeffer's simplex and the randomized |
9 | difference formula for the gradient of the cost function. | 9 | bounds simplex. |
10 | The core of this section is from \cite{Kelley1999}. | 10 | %We also present the simplex gradient, which is a forward or a centered |
11 | %difference formula for the gradient of the cost function. | ||
12 | %The core of this section is from \cite{Kelley1999}. | ||
11 | 13 | ||
12 | \section{The simplex} | 14 | \section{The simplex} |
13 | 15 | ||
14 | A \emph{simplex} $S$ in $\RR^n$ is the convex hull of $n+1$ points $S=\{\bx_i\}_{i=1,n+1}$. | 16 | A \emph{simplex} $S$ in $\RR^n$ is the convex hull of $n+1$ points $S=\{\bx_i\}_{i=1,n+1}$, |
17 | where $\bx_i \in \RR^n$ for $i=1,n+1$. | ||
15 | 18 | ||
16 | Box extended the Nelder-Mead algorithm to handle bound and non linear constraints \cite{Box1965}. | 19 | Box extended the Nelder-Mead algorithm to handle bound and non linear constraints \cite{Box1965}. |
17 | To be able to manage difficult cases, he uses a \emph{complex} made of $k\geq n+1$ vertices. | 20 | To be able to manage difficult cases, he uses a \emph{complex} made of $k\geq n+1$ vertices. |
@@ -19,13 +22,14 @@ In this section, we will state clearly when the definition and results can be ap | |||
19 | Indeed, some definitions such as the simplex gradient cannot be extended to a \emph{complex} | 22 | Indeed, some definitions such as the simplex gradient cannot be extended to a \emph{complex} |
20 | and are only applicable to a \emph{simplex}. | 23 | and are only applicable to a \emph{simplex}. |
21 | 24 | ||
22 | The point $\bx_i\in\RR^n$ is the $i$-th vertex of $S$. Given a function $f(\bx)\in\RR$, | 25 | The point $\bx_i\in\RR^n$ is the $i$-th vertex of $S$, |
26 | where $x_i^j\in\RR$ is the $j$-th coordinate of the $i$-th vertex of the simplex $S$. | ||
27 | Given a function $f(\bx)\in\RR$, | ||
23 | each vertex is associated with a function value $f_i = f(\bx_i)$ for $i=1,n+1$. | 28 | each vertex is associated with a function value $f_i = f(\bx_i)$ for $i=1,n+1$. |
24 | In simplex algorithms, the vertex are sorted by increasing function values | 29 | In simplex algorithms, the vertex are sorted by increasing function values |
25 | |||
26 | \begin{eqnarray} | 30 | \begin{eqnarray} |
27 | \label{simplex-sortedfv} | 31 | \label{simplex-sortedfv} |
28 | f_1 \leq f_2 \leq \ldots \leq f_n \leq f_{n+1} | 32 | f_1 \leq f_2 \leq \ldots \leq f_n \leq f_{n+1}. |
29 | \end{eqnarray} | 33 | \end{eqnarray} |
30 | 34 | ||
31 | The sorting order is not precisely defined neither in Spendley's et al paper \cite{Spendley1962} | 35 | The sorting order is not precisely defined neither in Spendley's et al paper \cite{Spendley1962} |
@@ -37,25 +41,27 @@ ordering rules have no measurable influence. | |||
37 | Let $V$ denote the $n\times n$ matrix of simplex directions | 41 | Let $V$ denote the $n\times n$ matrix of simplex directions |
38 | \begin{eqnarray} | 42 | \begin{eqnarray} |
39 | \label{simplex-directions} | 43 | \label{simplex-directions} |
40 | V(S) = (\bx_2 - \bx_1, \bx_3 - \bx_1 , \ldots , \bx_{n+1} - \bx_1) = (\bv_1, \ldots , \bv_n) | 44 | V(S) = (\bx_2 - \bx_1, \bx_3 - \bx_1 , \ldots , \bx_{n+1} - \bx_1) = (\bv_1, \ldots , \bv_n). |
41 | \end{eqnarray} | 45 | \end{eqnarray} |
42 | 46 | ||
43 | We say that $S$ is nonsingular if the matrix of simplex directions $V(S)$ is nonsingular. | 47 | We say that the simplex $S$ is nonsingular if the matrix of simplex directions $V(S)$ is nonsingular. |
44 | |||
45 | 48 | ||
46 | \section{The size of the simplex} | 49 | \section{The size of the simplex} |
50 | \label{section-simplexsize} | ||
47 | 51 | ||
48 | Several methods are available to compute the size of a simplex. | 52 | Several methods are available to compute the size of a simplex. |
49 | In Kelley's book \cite{Kelley1999}, the author presents the diameter and the two oriented lengths. | ||
50 | 53 | ||
51 | The simplex diameter $diam(S)$ is defined by | 54 | The simplex diameter $diam(S)$ is defined by |
52 | \begin{eqnarray} | 55 | \begin{eqnarray} |
53 | \label{simplex-diameter} | 56 | \label{simplex-diameter} |
54 | diam(S) = \max_{i,j=1,n+1} \|\bx_i - \bx_j\|_2, | 57 | diam(S) = \max_{i,j=1,n+1} \|\bx_i - \bx_j\|_2, |
55 | \end{eqnarray} | 58 | \end{eqnarray} |
56 | where $\|.\|_2$ is the euclidian norm $\|x\|_2 = \sum_{i=1,n}\bx_i^2$. | 59 | where $\|.\|_2$ is the euclidian norm defined by |
60 | \begin{eqnarray} | ||
61 | \|\bx\|_2 = \sum_{i=1,n}(x^j)^2. | ||
62 | \end{eqnarray} | ||
57 | In practical implementations, computing the diameter requires two nested loops over the | 63 | In practical implementations, computing the diameter requires two nested loops over the |
58 | vertices of the simplex, i.e. $(n+1)^2$ operations. This is why authors generally | 64 | vertices of the simplex, i.e. requires $(n+1)^2$ operations. This is why authors generally |
59 | prefer to use lengths which are less expensive to compute. | 65 | prefer to use lengths which are less expensive to compute. |
60 | 66 | ||
61 | The two oriented lengths $\sigma_-(S)$ and $\sigma_+(S)$ are using the | 67 | The two oriented lengths $\sigma_-(S)$ and $\sigma_+(S)$ are using the |
@@ -74,33 +80,33 @@ The following inequalities are satisfied between the diameter and the maximum or | |||
74 | \end{eqnarray} | 80 | \end{eqnarray} |
75 | 81 | ||
76 | In Nash's book \cite{nla.cat-vn1060620}, the size of the simplex $s_N(S)$ is measured | 82 | In Nash's book \cite{nla.cat-vn1060620}, the size of the simplex $s_N(S)$ is measured |
77 | based on the $l1$ norm and is defined by | 83 | based on the 1-norm and is defined by |
78 | \begin{eqnarray} | 84 | \begin{eqnarray} |
79 | \label{simplex-sizenash} | 85 | \label{simplex-sizenash} |
80 | s_N(S) = \sum_{i=2,n+1} \|\bx_i - \bx_1\|_1 | 86 | s_N(S) = \sum_{i=2,n+1} \|\bx_i - \bx_1\|_1 |
81 | \end{eqnarray} | 87 | \end{eqnarray} |
82 | where | 88 | where the 1-norm is defined by |
83 | \begin{eqnarray} | 89 | \begin{eqnarray} |
84 | \label{simplex-sizenash2} | 90 | \label{simplex-sizenash2} |
85 | \|\bx_i - \bx_1\|_1 = \sum_{j=1,n} |x_i^j - x_1^j| | 91 | \|\bx \|_1 = \sum_{j=1,n} |x^j| |
86 | \end{eqnarray} | 92 | \end{eqnarray} |
87 | where $x_i^j\in\RR$ is the $j$-th coordinate of the $i$-th vertex of the simplex $S$. | 93 | where $x_i^j\in\RR$ is the $j$-th coordinate of the $i$-th vertex of the simplex $S$. |
88 | 94 | ||
89 | \section{The initial simplex} | 95 | \section{The initial simplex} |
90 | 96 | ||
91 | While most of the theory can be developed without being very specific | 97 | While most of the theory can be developed without being very specific |
92 | about the initial simplex, the initial simplex plays a very important role in practice. | 98 | about the initial simplex, it plays a very important role in practice. |
93 | All approaches are based on the initial guess $\overline{\bx}_0\in\RR^n$ and create a | 99 | All approaches are based on the initial guess $\overline{\bx}_0\in\RR^n$ and create a |
94 | geometric shape based on this point. | 100 | geometric shape based on this point. |
95 | (We denoted the initial guess by $\overline{\bx}_0$ instead of the usual $\bx_0$ | 101 | (We denote the initial guess by $\overline{\bx}_0$ instead of the usual $\bx_0$ |
96 | in order to distinguish the initial guess from the vertices $\{\bx_i\}_{i=1,n+1}$.) | 102 | in order to distinguish the initial guess from the vertices $\{\bx_i\}_{i=1,n+1}$.) |
97 | 103 | ||
98 | 104 | ||
99 | In this section, we present the various approach to design the initial | 105 | In this section, we present the various approach to design the initial |
100 | simplex. In the first part, we emphasize the importance of the initial | 106 | simplex. In the first part, we emphasize the importance of the initial |
101 | simplex in optimization algorithms. Then we present the regular simplex | 107 | simplex in optimization algorithms. Then we present the regular simplex |
102 | approach by Spendley et al., the randomized bounds approach by Box and | 108 | by Spendley et al., the axis-by-axis simplex, the randomized bounds approach by Box and |
103 | Pfeffer's method. | 109 | Pfeffer's simplex. |
104 | 110 | ||
105 | \subsection{Importance of the initial simplex} | 111 | \subsection{Importance of the initial simplex} |
106 | 112 | ||
@@ -112,7 +118,7 @@ is typical a fixed-shape simplex algorithm (see \cite{Torczon89multi-directional | |||
112 | for other patterns of a direct search method). | 118 | for other patterns of a direct search method). |
113 | If, by chance, the pattern is so that the optimum is close to one point | 119 | If, by chance, the pattern is so that the optimum is close to one point |
114 | defined by the pattern, the number of iteration may be small. On the contrary, the | 120 | defined by the pattern, the number of iteration may be small. On the contrary, the |
115 | number of iterations may be high if the pattern does not come close to the | 121 | number of iterations may be large if the pattern does not come close to the |
116 | optimum. | 122 | optimum. |
117 | 123 | ||
118 | \begin{figure} | 124 | \begin{figure} |
@@ -127,11 +133,11 @@ The variable-shape simplex algorithm designed by Nelder and Mead is also very | |||
127 | sensitive to the initial simplex. | 133 | sensitive to the initial simplex. |
128 | One of the problems is that the initial simplex should be consistently scaled | 134 | One of the problems is that the initial simplex should be consistently scaled |
129 | with respect to the unknown $x$. | 135 | with respect to the unknown $x$. |
130 | In \cite{parkinson1972}, "An investigation into the efficiency of variants on the simplex method", | 136 | In "An investigation into the efficiency of variants on the simplex method" \cite{parkinson1972}, |
131 | Parkinson and Hutchinson explored | 137 | Parkinson and Hutchinson explored |
132 | several ways of improvement. First, they investigate the sensitivity | 138 | several improvements of Nelder and Mead's algorithm. First, they investigate the sensitivity |
133 | of the algorithm to the initial simplex. Two parameters were investigated, | 139 | of the algorithm to the initial simplex. Two parameters were investigated, |
134 | i.e. the initial length and the orientation of the simplex. | 140 | that is, the initial length and the orientation of the simplex. |
135 | The conclusion of their study with respect to the initial simplex is | 141 | The conclusion of their study with respect to the initial simplex is |
136 | the following. "The orientation of the initial simplex has a significant effect | 142 | the following. "The orientation of the initial simplex has a significant effect |
137 | on efficiency, but the relationship can be too sensitive for an automatic | 143 | on efficiency, but the relationship can be too sensitive for an automatic |
@@ -140,7 +146,7 @@ predictor to provide sufficient accuracy at this time." | |||
140 | Since no initial simplex clearly improves on the others, in practice, | 146 | Since no initial simplex clearly improves on the others, in practice, |
141 | it may be convenient to try different approaches. | 147 | it may be convenient to try different approaches. |
142 | 148 | ||
143 | \subsection{Spendley's et al simplex} | 149 | \subsection{Spendley's et al regular simplex} |
144 | 150 | ||
145 | In their paper \cite{Spendley1962}, Spendley et al. use a regular | 151 | In their paper \cite{Spendley1962}, Spendley et al. use a regular |
146 | simplex with given size $\ell>0$. We define the parameters $p,q>0$ as | 152 | simplex with given size $\ell>0$. We define the parameters $p,q>0$ as |
@@ -165,10 +171,10 @@ x_i^j &=& | |||
165 | \right. | 171 | \right. |
166 | \end{eqnarray} | 172 | \end{eqnarray} |
167 | for vertices $i=2,n+1$ and components $j=1,n$, | 173 | for vertices $i=2,n+1$ and components $j=1,n$, |
168 | where $\ell \in\RR$ is the length of the simplex ($\ell>0$). Notice that this | 174 | where $\ell \in\RR$ is the length of the simplex and satisfies $\ell>0$. Notice that this |
169 | length is the same for all the vertices which keeps the simplex regular. | 175 | length is the same for all the edges which keeps the simplex regular. |
170 | 176 | ||
171 | The regular initial simplex is presented in figure \ref{fig-nm-simplex-regular}. | 177 | The regular simplex is presented in figure \ref{fig-nm-simplex-regular}. |
172 | 178 | ||
173 | \begin{figure} | 179 | \begin{figure} |
174 | \begin{center} | 180 | \begin{center} |
@@ -178,7 +184,7 @@ The regular initial simplex is presented in figure \ref{fig-nm-simplex-regular}. | |||
178 | \label{fig-nm-simplex-regular} | 184 | \label{fig-nm-simplex-regular} |
179 | \end{figure} | 185 | \end{figure} |
180 | 186 | ||
181 | \subsection{Simplex along the axes} | 187 | \subsection{Axis-by-axis simplex} |
182 | 188 | ||
183 | A very efficient and simple approach leads to an axis-by-axis simplex. | 189 | A very efficient and simple approach leads to an axis-by-axis simplex. |
184 | This simplex depends on a vector of positive lengths $\bl\in\RR^n$. | 190 | This simplex depends on a vector of positive lengths $\bl\in\RR^n$. |
@@ -191,15 +197,16 @@ The other vertices are defined by | |||
191 | x_i^j &=& | 197 | x_i^j &=& |
192 | \left\{ | 198 | \left\{ |
193 | \begin{array}{l} | 199 | \begin{array}{l} |
194 | \overline{x}_0^j + \bl_j, \textrm{ if } j=i-1,\\ | 200 | \overline{x}_0^j + \ell_j, \textrm{ if } j=i-1,\\ |
195 | \overline{x}_0^j, \textrm{ if } j\neq i-1,\\ | 201 | \overline{x}_0^j, \textrm{ if } j\neq i-1,\\ |
196 | \end{array} | 202 | \end{array} |
197 | \right. | 203 | \right. |
198 | \end{eqnarray} | 204 | \end{eqnarray} |
199 | for vertices $i=2,n+1$ and components $j=1,n$. | 205 | for vertices $i=2,n+1$ and components $j=1,n$. |
200 | 206 | ||
201 | This kind of simplex is presented in figure \ref{fig-nm-simplex-axes}. | 207 | This type of simplex is presented in figure \ref{fig-nm-simplex-axes}, |
202 | The axis-by-axis approach is used in the very popular Nelder-Mead | 208 | where $\ell_1=1$ and $\ell_2=2$. |
209 | The axis-by-axis simplex is used in the Nelder-Mead | ||
203 | algorithm provided in Numerical Recipes in C \cite{NumericalRecipes}. | 210 | algorithm provided in Numerical Recipes in C \cite{NumericalRecipes}. |
204 | As stated in \cite{NumericalRecipes}, the length vector $\bl$ can | 211 | As stated in \cite{NumericalRecipes}, the length vector $\bl$ can |
205 | be used as a guess for the characteristic length scale of the problem. | 212 | be used as a guess for the characteristic length scale of the problem. |
@@ -208,7 +215,8 @@ be used as a guess for the characteristic length scale of the problem. | |||
208 | \begin{center} | 215 | \begin{center} |
209 | \includegraphics[width=10cm]{simplex_axes.png} | 216 | \includegraphics[width=10cm]{simplex_axes.png} |
210 | \end{center} | 217 | \end{center} |
211 | \caption{Axis-based simplex in 2 dimensions} | 218 | \caption{Axis-based simplex in 2 dimensions -- Notice that the length along the $x$ axis is 1 while the length |
219 | along the $y$ axis is 2. } | ||
212 | \label{fig-nm-simplex-axes} | 220 | \label{fig-nm-simplex-axes} |
213 | \end{figure} | 221 | \end{figure} |
214 | 222 | ||
@@ -235,20 +243,19 @@ for vertices $i=2,n+1$ and components $j=1,n$. | |||
235 | \subsection{Pfeffer's method} | 243 | \subsection{Pfeffer's method} |
236 | 244 | ||
237 | This initial simplex is used in the function \scifunction{fminsearch} | 245 | This initial simplex is used in the function \scifunction{fminsearch} |
238 | and presented in \cite{Fan2002}. It is due to L. Pfeffer at Stanford. | 246 | and presented in \cite{Fan2002}. According to \cite{Fan2002}, this simplex is due to L. Pfeffer at Stanford. |
239 | The goal of this method is to scale the initial simplex with respect | 247 | The goal of this method is to scale the initial simplex with respect |
240 | to the characteristic lengths of the problem. This allows, for example, | 248 | to the characteristic lengths of the problem. This allows, for example, |
241 | to manage cases where $x_1\approx 1$ and $x_2\approx 10^5$. | 249 | to manage cases where $x_1\approx 1$ and $x_2\approx 10^5$. |
242 | As we are going to see, the scaling is defined with respect to the | 250 | As we are going to see, the scaling is defined with respect to the |
243 | initial guess $\bx_0$. Indeed, the initial simplex is created by | 251 | initial guess $\overline{\bx}_0$, with an axis-by-axis method. |
244 | small perturbations around the initial guess $\overline{\bx}_0$. | ||
245 | 252 | ||
246 | The method proceeds by defining $\delta_u,\delta_z>0$, where | 253 | The method proceeds by defining $\delta_u,\delta_z>0$, where |
247 | $\delta_u$ is used for usual components of $\overline{\bx}_0$ and $\delta_z$ is | 254 | $\delta_u$ is used for usual components of $\overline{\bx}_0$ and $\delta_z$ is |
248 | used for the case where one component of $\overline{\bx}_0$ is zero. | 255 | used for the case where one component of $\overline{\bx}_0$ is zero. |
249 | The default values for $\delta_u$ and $\delta_z$ are | 256 | The default values for $\delta_u$ and $\delta_z$ are |
250 | \begin{eqnarray} | 257 | \begin{eqnarray} |
251 | \delta_u = 0.05 \qquad \delta_z = 0.0075. | 258 | \delta_u = 0.05 \qquad \textrm{and} \qquad \delta_z = 0.0075. |
252 | \end{eqnarray} | 259 | \end{eqnarray} |
253 | The first vertex of the simplex is the initial guess | 260 | The first vertex of the simplex is the initial guess |
254 | \begin{eqnarray} | 261 | \begin{eqnarray} |
@@ -271,11 +278,11 @@ for vertices $i=2,n+1$ and components $j=1,n$. | |||
271 | 278 | ||
272 | %TODO... | 279 | %TODO... |
273 | 280 | ||
274 | %\section{References and notes} | 281 | \section{References and notes} |
275 | 282 | ||
276 | %Most of the section \ref{section-simplexgradient} is taken from | 283 | Some elements of the section \ref{section-simplexsize} is taken from |
277 | %Kelley's book \cite{Kelley1999}, "Iterative Methods for Optimization". | 284 | Kelley's book \cite{Kelley1999}, "Iterative Methods for Optimization". |
278 | %While this document focus on Nelder-Mead algorithm, Kelley gives a broad | 285 | While this document focus on Nelder-Mead algorithm, Kelley gives a broad |
279 | %view on optimization and present other algorithms for noisy functions, | 286 | view on optimization and present other algorithms for noisy functions, |
280 | %like implicit filtering, multidirectional search and the Hooke-Jeeves algorithm. | 287 | like implicit filtering, multidirectional search and the Hooke-Jeeves algorithm. |
281 | 288 | ||