[regression] note on evolution

Jan Benda 2020-12-20 21:24:46 +01:00
parent 17bf940101
commit c2e4d4e40c
2 changed files with 94 additions and 32 deletions

@ -23,20 +23,23 @@
\item Fig 8.2 right: this should be a chi-squared distribution with one degree of freedom!
\end{itemize}
\subsection{New chapter: non-linear fits}
\begin{itemize}
\item Example that illustrates the problem of local minima (with error
  surface). Maybe data generated from $1/x$ and fitted with
  $\exp(\lambda x)$ induces local minima.
\item Move 8.7 to this new chapter.
\item You need initial values for the parameters!
\item Example that fitting gets harder the more parameters you have.
\item Try to fix as many parameters as possible before doing the fit.
\item How to test the quality of a fit? Residuals. $\chi^2$ test. Run-test.
\item Important box: summary of fit howtos.
\end{itemize}
\subsection{New chapter: linear fits --- generalized linear models}
\begin{itemize}
\item Polyfit is easy: unique solution! $c x^3$ is also a linear fit.
\item Example of \emph{overfitting} with a polyfit of high order (= number of data points)
\end{itemize}

@ -576,34 +576,36 @@ our tiger data-set (\figref{powergradientdescentfig}):
\section{Fitting non-linear functions to data}
Gradient descent is a basic numerical method for solving optimization
problems. It is used to find the minimum of an objective function.

Curve fitting is a specific optimization problem and a common
application of the gradient descent method. For the case of fitting
straight lines to data pairs, the error surface (using the mean
squared error) has exactly one clearly defined global minimum. In
fact, the position of this minimum can be calculated analytically, as
shown in the next chapter. For such linear fitting problems, numerical
methods like gradient descent are not needed.
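As a minimal sketch with made-up data, such a linear fit can be
computed directly, for example with the \matlab-function
\mcode{polyfit()}:
\begin{lstlisting}
% minimal sketch: a straight-line fit needs no iterative search
x = 0:0.2:10;                       % hypothetical x-values
y = 2.5*x + 1.0 + randn(size(x));   % noisy line with slope 2.5 and intercept 1.0
p = polyfit(x, y, 1);               % closed-form least-squares solution
fprintf('slope = %.2f, intercept = %.2f\n', p(1), p(2))
\end{lstlisting}
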
Fitting problems that involve nonlinear functions of the parameters,
e.g. the power law \eqref{powerfunc} or the exponential function
$f(t;\tau) = e^{-t/\tau}$, in general do not have an analytical
solution of the least squares problem. To find the least squares fit
for such functions, numerical methods such as gradient descent have
to be applied.

The suggested gradient descent algorithm requires manually tuned
values for $\epsilon$ and for the threshold that terminates the
iteration. The algorithm can be improved in multiple ways to converge
more robustly and faster. Most importantly, $\epsilon$ is made
dependent on the changes of the gradient from one iteration to the
next. These and other numerical tricks have already been implemented
in pre-defined functions. Generic optimization functions such as
\mcode{fminsearch()} work with arbitrary objective functions, whereas
the more specialized function \mcode{lsqcurvefit()} is specifically
designed for optimization in the least squares sense.
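As a minimal sketch (the data and the initial value are made up for
illustration), fitting an exponential decay with
\mcode{lsqcurvefit()} could look like this:
\begin{lstlisting}
% minimal sketch: least-squares fit of an exponential decay
expdecay = @(tau, t) exp(-t/tau);            % model function: parameter first, then t
t = linspace(0.0, 5.0, 100);                 % hypothetical time points
y = exp(-t/1.5) + 0.05*randn(size(t));       % noisy data generated with tau = 1.5
tau0 = 1.0;                                  % initial value for the parameter
tauest = lsqcurvefit(expdecay, tau0, t, y);  % returns the estimated tau
fprintf('estimated tau = %.2f\n', tauest)
\end{lstlisting}
For \mcode{fminsearch()} one would instead pass a function that
returns the mean squared error itself, since it minimizes an
arbitrary scalar objective.
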
\begin{exercise}{plotlsqcurvefitpower.m}{}
Use the \matlab-function \varcode{lsqcurvefit()} instead of
@ -626,5 +628,62 @@ for optimizations in the least square error sense.
\end{important}
\section{Evolution as an optimization problem}
Evolution is a biological implementation of an optimization
algorithm. The objective function is an organism's fitness, which
needs to be maximized (this is the same as minimizing the negative
fitness). The parameters of this optimization problem are the many
genes on the DNA, making this a very high-dimensional optimization
problem. By cross-over and mutations a population of a species moves
through this high-dimensional parameter space. Selection processes
ensure that only organisms with higher fitness pass on their genes to
the next generations. Unlike the gradient descent method, this
algorithm is not directed towards higher fitness. Rather, some
neighborhood of the current position in parameter space is probed
randomly. In this way it is even possible to escape a local maximum
and find a potentially better one. For this reason, \enterm{genetic
algorithms} mimic evolution to solve high-dimensional optimization
problems, in particular ones with discrete parameter values. In
biological evolution, however, the objective function is not
fixed. It may change over time as abiotic and biotic environmental
conditions change, making this a very complex but also interesting
optimization problem.

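A toy sketch of this idea, with arbitrary numbers and with mutation
and selection only (no cross-over): a population of parameter vectors
is randomly perturbed, and only the fittest individuals survive into
the next generation.
\begin{lstlisting}
% toy sketch of a genetic algorithm maximizing a fitness function
fitness = @(p) -(p(:,1)-2).^2 - (p(:,2)+1).^2;  % hypothetical fitness, maximum at (2, -1)
npop = 50;                                      % population size
pop = randn(npop, 2);                           % random initial population, two "genes" each
for generation = 1:200
    offspring = pop + 0.1*randn(npop, 2);       % mutations
    pool = [pop; offspring];                    % parents and offspring compete
    [~, idx] = sort(fitness(pool), 'descend');  % selection: rank by fitness
    pop = pool(idx(1:npop), :);                 % only the fittest survive
end
disp(mean(pop))                                 % population ends up close to (2, -1)
\end{lstlisting}
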
How should a neuron or neural network be designed? As a particular
aspect of the general evolution of a species, this is a fundamental
question in the neurosciences. Maintaining a neural system is
costly. By their mere presence, neurons incur costs: they need to be
built and maintained, they occupy space, and they consume
resources. Equipping a neuron with more ion channels costs as
well. And neural activity makes it more costly to maintain the
concentration gradients of ions. All this boils down to the
consumption of more ATP, the currency of metabolism. On the other
hand, each neuron provides some useful function. In the end, neurons
make the organism behave in a sensible way that increases its overall
fitness. On the level of neurons this means that they should
faithfully represent and process behaviorally relevant sensory
stimuli, make sensible decisions, store important memories, or
initiate and control movements in a directed way. Unfortunately,
there is a tradeoff: better neural function usually comes at higher
cost. More ion channels reduce intrinsic noise, which usually
improves the precision of neural responses. Higher neuronal activity
improves the quality of the encoding of sensory stimuli. More neurons
are required for more complex computations. And so on.

Understanding why a neuronal system is designed in some specific way
therefore also requires understanding these tradeoffs. The number of
neurons, the number and types of ion channels, the length of axons,
the number of synapses, the way neurons are connected, etc. are all
parameters that could be optimized. For the objective function, the
function of the neurons and its dependence on these parameters needs
to be quantified, for example by measures from information or
detection theory. From these benefits the costs need to be
subtracted, and one is then interested in finding the maximum of the
resulting objective function. Maximization (or minimization) problems
are thus not only a tool for data analysis; rather, they are at the
core of many --- not only biological or neuroscientific --- problems.
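As a purely hypothetical sketch (the benefit and cost functions are
invented for illustration and have no biological calibration), such a
cost-benefit objective could be maximized by minimizing its negative,
for example with \mcode{fminsearch()}:
\begin{lstlisting}
% hypothetical sketch: benefit minus cost as an objective function
% n: number of ion channels, treated as a continuous parameter for simplicity
benefit = @(n) 10.0*n./(n + 100.0);           % coding precision saturates with more channels
cost = @(n) 0.02*n;                           % metabolic cost grows with the number of channels
objective = @(n) benefit(n) - cost(n);        % what evolution would maximize
nopt = fminsearch(@(n) -objective(n), 50.0);  % maximize by minimizing the negative
fprintf('optimal number of channels: %.0f\n', nopt)
\end{lstlisting}
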
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\printsolutions