[regression] note on evolution
parent 17bf940101
commit c2e4d4e40c

@@ -23,20 +23,23 @@
 \item Fig 8.2 right: this should be a chi-squared distribution with one degree of freedom!
 \end{itemize}

-\subsection{Linear fits}
+\subsection{New chapter: non-linear fits}
 \begin{itemize}
-\item Polyfit is easy: unique solution! $c x^2$ is also a linear fit.
-\item Example for overfitting with polyfit of a high order (=number of data points)
-\end{itemize}
-
-\subsection{Non-linear fits}
-\begin{itemize}
-\item Example that illustrates the Nebenminima problem (with error surface)
+\item Move 8.7 to this new chapter.
+\item Example that illustrates the Nebenminima problem (with error
+  surface). Maybe data generated from $1/x$ and fitted with
+  $\exp(\lambda x)$ induce local minima.
 \item You need initial values for the parameters!
 \item Example that fitting gets harder the more parameters you have.
-\item Try to fix as many parameter before doing the fit.
+\item Try to fix as many parameters as possible before doing the fit.
 \item How to test the quality of a fit? Residuals. $\chi^2$ test. Run-test.
+\item Important box: summary of fit how-tos.
 \end{itemize}

+\subsection{New chapter: linear fits --- generalized linear models}
+\begin{itemize}
+\item Polyfit is easy: unique solution! $c x^3$ is also a linear fit.
+\item Example for \emph{overfitting} with polyfit of a high order (=number of data points)
+\end{itemize}

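One way to check the local-minima claim in the Nebenminima item above is to scan the mean squared error of $\exp(\lambda x)$ predictions against data generated from $1/x$. A minimal MATLAB sketch; the x values and the parameter range are assumptions for illustration, not taken from the notes:

% error surface for fitting f(x;lambda) = exp(lambda*x)
% to data generated from 1/x (assumed example data)
x = 0.5:0.25:4.0;                 % assumed x values
y = 1.0 ./ x;                     % data generated from 1/x
lambdas = -4.0:0.01:1.0;          % assumed range of the single parameter
mse = zeros(size(lambdas));
for k = 1:length(lambdas)
    % mean squared error between data and exponential prediction:
    mse(k) = mean((y - exp(lambdas(k)*x)).^2);
end
plot(lambdas, mse);               % inspect the error surface for local minima
xlabel('\lambda');
ylabel('mean squared error');

Whether secondary minima actually show up depends on the chosen x range and on noise, which is presumably why the item says "maybe".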
@@ -576,34 +576,36 @@ our tiger data-set (\figref{powergradientdescentfig}):

 \section{Fitting non-linear functions to data}

-The gradient descent is an important numerical method for solving
+The gradient descent is a basic numerical method for solving
 optimization problems. It is used to find the global minimum of an
 objective function.

-Curve fitting is a common application for the gradient descent method.
-For the case of fitting straight lines to data pairs, the error
-surface (using the mean squared error) has exactly one clearly defined
-global minimum. In fact, the position of the minimum can be
-analytically calculated as shown in the next chapter. For linear
-fitting problems numerical methods like the gradient descent are not
-needed.
+Curve fitting is a specific optimization problem and a common
+application for the gradient descent method. For the case of fitting
+straight lines to data pairs, the error surface (using the mean
+squared error) has exactly one clearly defined global minimum. In
+fact, the position of the minimum can be calculated analytically as
+shown in the next chapter. For linear fitting problems numerical
+methods like the gradient descent are not needed.

 Fitting problems that involve nonlinear functions of the parameters,
 e.g. the power law \eqref{powerfunc} or the exponential function
-$f(x;\lambda) = e^{\lambda x}$, do not have an analytical solution for
-the least squares. To find the least squares for such functions
-numerical methods such as the gradient descent have to be applied.
+$f(t;\tau) = e^{-t/\tau}$, in general do not have an analytical
+solution for the least squares. To find the least squares for such
+functions numerical methods such as the gradient descent have to be
+applied.

-The suggested gradient descent algorithm is quite fragile and requires
-manually tuned values for $\epsilon$ and the threshold for terminating
-the iteration. The algorithm can be improved in multiple ways to
-converge more robustly and faster. For example one could adapt the
-step size to the length of the gradient. These numerical tricks have
-already been implemented in pre-defined functions. Generic
-optimization functions such as \mcode{fminsearch()} have been
-implemented for arbitrary objective functions, while the more
-specialized function \mcode{lsqcurvefit()} is specifically designed
-for optimizations in the least square error sense.
+The suggested gradient descent algorithm requires manually tuned
+values for $\epsilon$ and the threshold for terminating the iteration.
+The algorithm can be improved in multiple ways to converge more
+robustly and faster. Most importantly, $\epsilon$ is made dependent
+on the changes of the gradient from one iteration to the next. These
+and other numerical tricks have already been implemented in
+pre-defined functions. Generic optimization functions such as
+\mcode{fminsearch()} have been implemented for arbitrary objective
+functions, while the more specialized function \mcode{lsqcurvefit()}
+is specifically designed for optimizations in the least square error
+sense.

 \begin{exercise}{plotlsqcurvefitpower.m}{}
   Use the \matlab-function \varcode{lsqcurvefit()} instead of
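The step-size adaptation mentioned in the rewritten paragraph ($\epsilon$ made dependent on the changes of the gradient from one iteration to the next) can be sketched like this, here for the single parameter $\tau$ of $f(t;\tau) = e^{-t/\tau}$. The data, initial values, and the concrete adaptation rule are illustrative assumptions, not the code of the exercise script:

% gradient descent with a step size that adapts to changes of the
% gradient from one iteration to the next (sketch with assumed data)
t = 0.1:0.1:5.0;
y = exp(-t/2.0) + 0.05*randn(size(t));       % simulated data, tau = 2
mse = @(tau) mean((y - exp(-t/tau)).^2);     % objective function
tau = 1.0;                                   % initial parameter value
stepsize = 0.1;                              % the epsilon of the text
dtau = 1e-4;                                 % step for numerical gradient
gprev = 0.0;
for i = 1:1000
    g = (mse(tau + dtau) - mse(tau - dtau)) / (2.0*dtau);
    if g*gprev > 0.0
        stepsize = stepsize*1.2;   % same direction as before: speed up
    else
        stepsize = stepsize*0.5;   % gradient changed sign: slow down
    end
    tau = tau - stepsize*g;        % descend along the gradient
    if abs(stepsize*g) < 1e-7      % terminate on tiny parameter changes
        break;
    end
    gprev = g;
end

With the Optimization Toolbox the same fit is a single call to lsqcurvefit(), which takes the model function with the parameter as its first argument:

taufit = lsqcurvefit(@(p, t) exp(-t/p), 1.0, t, y);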
@@ -626,5 +628,62 @@ for optimizations in the least square error sense.
 \end{important}


+\section{Evolution as an optimization problem}
+
+Evolution is a biological implementation of an optimization
+algorithm. The objective function is an organism's fitness, which
+needs to be maximized (this is the same as minimizing the negative
+fitness). The parameters of this optimization problem are the many
+genes on the DNA, making this a very high-dimensional optimization
+problem. By cross-over and mutations a population of a species moves
+through this high-dimensional parameter space. Selection processes
+make sure that only organisms with higher fitness pass on their genes
+to the next generations. In contrast to the gradient descent method,
+this algorithm is not directed towards higher fitness. Rather, some
+neighborhood of the parameter space is randomly probed. That way it
+is even possible to escape a local maximum and find a potentially
+better one. For this reason, \enterm{genetic algorithms} mimic
+evolution for solving high-dimensional optimization problems, in
+particular ones with discrete parameter values. In biological
+evolution, however, the objective function is not fixed. It may
+change in time with changing abiotic and biotic environmental
+conditions, making this a very complex but also interesting
+optimization problem.
+
+How should a neuron or a neural network be designed? As a particular
+aspect of the evolution of a species, this is a fundamental question
+in the neurosciences. Maintaining a neural system is costly. By their
+mere presence neurons incur costs: they need to be built and
+maintained, they occupy space, and they consume resources. Equipping
+a neuron with more ion channels also costs, and neural activity makes
+it more costly to maintain the concentration gradients of ions. This
+all boils down to the consumption of more ATP, the currency of
+metabolism. On the other hand, each neuron provides some useful
+function. In the end, neurons make the organism behave in sensible
+ways that increase its overall fitness. On the level of neurons this
+means that they should faithfully represent and process behaviorally
+relevant sensory stimuli, make sensible decisions, store important
+memories, or initiate and control movements in a directed way.
+Unfortunately, there is a tradeoff: better neural function usually
+involves higher costs. More ion channels reduce intrinsic noise,
+which usually improves the precision of neural responses. Higher
+neuronal activity improves the quality of the encoding of sensory
+stimuli. More neurons are required for more complex computations. And
+so on.
+
+Understanding why a neuronal system is designed in some specific way
+requires understanding these tradeoffs as well. The number of
+neurons, the number and types of ion channels, the length of axons,
+the number of synapses, the way neurons are connected, etc. are all
+parameters that could be optimized. For the objective function, the
+function of the neurons needs to be quantified, for example by
+measures from information or detection theory, together with its
+dependence on these parameters. From these benefits the costs need to
+be subtracted, and then one is interested in finding the maximum of
+the resulting objective function. Maximization (or minimization)
+problems are thus not only a tool for data analysis; they are at the
+core of many --- and not only biological or neuroscientific ---
+problems.
+
+
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 \printsolutions
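The new section describes in words what a genetic algorithm does: mutate, recombine, select. A minimal MATLAB sketch over binary genomes; the population size, rates, and toy fitness function are all assumptions for illustration:

% minimal genetic algorithm maximizing a toy fitness function
ngenes = 20;                          % dimension of the parameter space
npop = 50;                            % population size
fitness = @(pop) sum(pop, 2);         % toy fitness: number of ones per genome
pop = rand(npop, ngenes) > 0.5;       % random initial population
for generation = 1:100
    % selection: only the fitter half passes on its genes
    [~, order] = sort(fitness(pop), 'descend');
    parents = pop(order(1:npop/2), :);
    % cross-over: recombine the genes of random parent pairs
    i = randi(npop/2, npop, 1);
    j = randi(npop/2, npop, 1);
    cut = randi(ngenes - 1);
    pop = [parents(i, 1:cut), parents(j, cut+1:end)];
    % mutation: flip a small, random fraction of the genes
    flips = rand(npop, ngenes) < 0.01;
    pop(flips) = ~pop(flips);
end
max(fitness(pop))                     % best fitness in the final population

In contrast to the gradient descent, no step here follows a gradient; random variation plus selection probes the neighborhood of the current population, which is what allows escapes from local maxima.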