[regression] note on evolution
parent 17bf940101
commit c2e4d4e40c
@ -23,20 +23,23 @@
\item Fig 8.2 right: this should be a chi-squared distribution with one degree of freedom!
\end{itemize}


\subsection{New chapter: non-linear fits}
\begin{itemize}
\item Move 8.7 to this new chapter.
\item Example that illustrates the problem of local minima (with
  error surface). Maybe data generated from $1/x$ and fitted with
  $\exp(\lambda x)$ induces local minima; see the sketch after this
  list.
\item You need initial values for the parameters!
\item Example that fitting gets harder the more parameters you have.
\item Try to fix as many parameters as possible before doing the fit.
\item How to test the quality of a fit? Residuals. $\chi^2$ test. Run-test.
\item Important box: summary of fit how-tos.
\end{itemize}
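A possible sketch for the local-minima example (the $1/x$ data and
the $\exp(\lambda x)$ model are only the guesses from the item above;
all values are placeholders):
\begin{verbatim}
% sketch: least-squares error of exp(lambda*x) fitted to data from 1/x
x = 0.5:0.1:3.0;                     % assumed x values
y = 1.0 ./ x + 0.05*randn(size(x));  % noisy data generated from 1/x
lambdas = -5.0:0.01:1.0;             % candidate parameter values
mse = zeros(size(lambdas));
for i = 1:length(lambdas)
    mse(i) = mean((y - exp(lambdas(i)*x)).^2);  % mean squared error
end
plot(lambdas, mse);                  % inspect the error curve for local minima
xlabel('lambda');
ylabel('mean squared error');
\end{verbatim}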

\subsection{New chapter: linear fits --- generalized linear models}
\begin{itemize}
\item Polyfit is easy: unique solution! $c x^3$ is also a linear fit.
\item Example for \emph{overfitting} with polyfit of a high order
  (= number of data points); see the sketch after this list.
\end{itemize}
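A possible sketch for the overfitting example (data and polynomial
orders are assumptions):
\begin{verbatim}
% sketch: overfitting with a polynomial of order = number of data points - 1
n = 10;
x = linspace(0.0, 1.0, n);
y = 2.0*x + 0.3*randn(size(x));  % noisy data generated from a straight line
plow = polyfit(x, y, 1);         % low-order fit
phigh = polyfit(x, y, n-1);      % order = number of data points minus one
                                 % (matlab may warn: badly conditioned)
xx = linspace(0.0, 1.0, 200);
plot(x, y, 'o', xx, polyval(plow, xx), xx, polyval(phigh, xx));
legend('data', 'order 1', 'order n-1');
\end{verbatim}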
@ -576,34 +576,36 @@ our tiger data-set (\figref{powergradientdescentfig}):
\section{Fitting non-linear functions to data}

Gradient descent is a basic numerical method for solving
optimization problems. It is used to find the global minimum of an
objective function.

Curve fitting is a specific optimization problem and a common
application of the gradient descent method. For the case of fitting
straight lines to data pairs, the error surface (using the mean
squared error) has exactly one clearly defined global minimum. In
fact, the position of the minimum can be calculated analytically, as
shown in the next chapter. For linear fitting problems, numerical
methods like gradient descent are not needed.
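As a brief illustration (the details follow in the next chapter), the
least-squares slope and intercept of a straight line can be written
down directly; the data in this sketch are made up:
\begin{verbatim}
% sketch: closed-form least-squares fit of a straight line y = m*x + b
x = 0.0:0.5:10.0;
y = 1.5*x + 2.0 + randn(size(x));  % assumed noisy data
m = (mean(x.*y) - mean(x)*mean(y)) / (mean(x.^2) - mean(x)^2);  % slope
b = mean(y) - m*mean(x);           % intercept
p = polyfit(x, y, 1);              % the same result via polyfit
\end{verbatim}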

Fitting problems that involve nonlinear functions of the parameters,
e.g. the power law \eqref{powerfunc} or the exponential function
$f(t;\tau) = e^{-t/\tau}$, in general do not have an analytical
solution for the least squares. To find the least squares for such
functions, numerical methods such as gradient descent have to be
applied.
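The following is a minimal sketch of such a numerical fit: a plain
gradient descent on the mean squared error of $f(t;\tau) = e^{-t/\tau}$
with the gradient estimated by central differences. The data, the
step size, and the termination threshold are assumptions, not the
algorithm developed in the exercises:
\begin{verbatim}
% sketch: gradient descent for fitting f(t;tau) = exp(-t/tau) to data
t = 0.0:0.2:5.0;
y = exp(-t/1.5) + 0.02*randn(size(t));   % data with an assumed tau of 1.5
mse = @(tau) mean((y - exp(-t/tau)).^2); % objective function
tau = 0.5;                               % initial parameter value
epsilon = 1.0;                           % step size (hand tuned)
h = 1e-5;                                % step for the numerical gradient
for k = 1:10000
    grad = (mse(tau + h) - mse(tau - h)) / (2.0*h);  % central difference
    if abs(epsilon*grad) < 1e-6          % termination threshold
        break
    end
    tau = tau - epsilon*grad;            % step down the gradient
end
\end{verbatim}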

The suggested gradient descent algorithm requires manually tuned
values for $\epsilon$ and for the threshold terminating the
iteration. The algorithm can be improved in multiple ways to
converge more robustly and faster. Most importantly, $\epsilon$ is
made dependent on the changes of the gradient from one iteration to
the next. These and other numerical tricks have already been
implemented in pre-defined functions. Generic optimization functions
such as \mcode{fminsearch()} work with arbitrary objective
functions, while the more specialized function \mcode{lsqcurvefit()}
is specifically designed for optimization in the least-squares
sense.
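For illustration, a sketch of how the two functions could be called
for the exponential example from above (the data are assumptions, and
\mcode{lsqcurvefit()} additionally requires the Optimization Toolbox):
\begin{verbatim}
% sketch: pre-defined optimization functions applied to the same fit
t = 0.0:0.2:5.0;
y = exp(-t/1.5) + 0.02*randn(size(t));  % assumed data
f = @(tau, t) exp(-t/tau);              % model function
tau0 = 0.5;                             % initial parameter value
% generic minimization of an arbitrary objective function:
taufms = fminsearch(@(tau) mean((y - f(tau, t)).^2), tau0);
% specialized least-squares curve fitting:
taulsq = lsqcurvefit(f, tau0, t, y);
\end{verbatim}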
\begin{exercise}{plotlsqcurvefitpower.m}{}
Use the \matlab-function \varcode{lsqcurvefit()} instead of
@ -626,5 +628,62 @@ for optimizations in the least square error sense.
\end{important}
\section{Evolution as an optimization problem}

Evolution is a biological implementation of an optimization
algorithm. The objective function is an organism's fitness, which
needs to be maximized (this is the same as minimizing the negative
fitness). The parameters of this optimization problem are the many
genes on the DNA, making this a very high-dimensional optimization
problem. By cross-over and mutations, a population of a species
moves through this high-dimensional parameter space. Selection
processes ensure that only organisms with higher fitness pass on
their genes to the next generations. Consequently, the algorithm is
not directed towards higher fitness, as the gradient descent method
would be. Rather, some neighborhood of the parameter space is
randomly probed. That way it is even possible to escape a local
maximum and find a potentially better one. For this reason,
\enterm{genetic algorithms} try to mimic evolution in the context of
high-dimensional optimization problems, in particular ones with
discrete parameter values. In biological evolution, however, the
objective function is not fixed. It may change in time with changing
abiotic and biotic environmental conditions, making this a very
complex but also interesting optimization problem.
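To make the analogy concrete, here is a minimal sketch of a genetic
algorithm on an arbitrary, made-up fitness function; the population
size, the mutation strength, and the fitness function itself are all
assumptions:
\begin{verbatim}
% sketch: minimal genetic algorithm maximizing a made-up fitness function
fitness = @(p) -sum((p - 0.7).^2, 2);  % assumed fitness, maximal at genes = 0.7
npop = 50;                             % population size
ngenes = 10;                           % number of "genes" per individual
pop = rand(npop, ngenes);              % random initial population
for generation = 1:200
    f = fitness(pop);                             % evaluate all individuals
    [~, order] = sort(f, 'descend');
    parents = pop(order(1:npop/2), :);            % selection: keep the fitter half
    mates = parents(randperm(npop/2), :);         % random mating partners
    cut = randi(ngenes - 1);                      % cross-over point
    offspring = [parents(:, 1:cut), mates(:, cut+1:end)];
    offspring = offspring + 0.02*randn(size(offspring));  % mutations
    pop = [parents; offspring];                   % next generation
end
[bestfitness, besti] = max(fitness(pop));  % best individual found so far
\end{verbatim}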

How should a neuron or a neural network be designed? As a particular
aspect of the general evolution of a species, this is a fundamental
question in the neurosciences. Maintaining a neural system is
costly. By their mere presence, neurons incur costs: they need to be
built and maintained, they occupy space, and they consume resources.
Equipping a neuron with more ion channels also incurs costs. And
neural activity makes it more costly to maintain the concentration
gradients of ions. All of this boils down to the consumption of more
ATP, the currency of metabolism. On the other hand, each neuron
provides some useful function. In the end, neurons make the organism
behave in a sensible way that increases the overall fitness of the
organism. On the level of neurons this means that they should
faithfully represent and process behaviorally relevant sensory
stimuli, make sensible decisions, store important memories, or
initiate and control movements in a directed way. Unfortunately,
there is a tradeoff: better neural function usually comes with higher
costs. More ion channels reduce intrinsic noise, which usually
improves the precision of neural responses. Higher neuronal activity
improves the quality of the encoding of sensory stimuli. More
neurons are required for more complex computations. And so on.

Understanding why a neuronal system is designed in some specific way
requires understanding these tradeoffs as well. The number of
neurons, the number and types of ion channels, the length of axons,
the number of synapses, the way neurons are connected, etc. are all
parameters that could be optimized. For the objective function, the
performance of the neurons and its dependence on these parameters
needs to be quantified, for example by measures from information or
detection theory. From these benefits the costs need to be
subtracted, and one is then interested in finding the maximum of the
resulting objective function. Maximization (or minimization)
problems are thus not only a tool for data analysis; they are at the
core of many problems, not only biological or neuroscientific ones.
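As an entirely made-up illustration of such a benefit-minus-cost
objective (both functions are hypothetical placeholders, not measured
quantities):
\begin{verbatim}
% sketch: hypothetical benefit-minus-cost objective over the number of neurons
n = 1:200;                 % number of neurons as the only free parameter
benefit = log(1.0 + n);    % assumed benefit, e.g. some information measure
cost = 0.02*n;             % assumed metabolic cost per neuron
objective = benefit - cost;              % net objective to be maximized
[maxobjective, nbest] = max(objective);  % optimal number of neurons
\end{verbatim}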
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\printsolutions