[regression] gradient descent code
\end{exercise}
\section{Gradient descent}
\begin{figure}[t]
\includegraphics{cubicmse}
\titlecaption{Gradient descent.}{The algorithm starts at an
arbitrary position. At each point the gradient is estimated and
the position is updated as long as the length of the gradient is
sufficiently large. The dots show the positions after each
iteration of the algorithm.} \label{gradientdescentcubicfig}
\end{figure}
Finally, we are able to implement the optimization itself. By now it
should be obvious why it is called the gradient descent method. From a
starting position we iteratively walk down the slope of the cost
function against its gradient. All ingredients necessary for this
algorithm are already there. We need: (i) the cost function
(\varcode{meanSquaredErrorCubic()}), and (ii) the gradient
(\varcode{meanSquaredGradientCubic()}). The gradient descent
algorithm works as follows:
\begin{enumerate}
\item Start at an arbitrary position $p_0$.
\item \label{computegradient} Compute the gradient of the cost
  function at the current position $p_i$.
\item If the length of the gradient falls below a threshold, we assume
  to have reached the minimum and stop the search.
\item \label{gradientstep} If the length of the gradient exceeds the
threshold, we take a small step in the opposite direction:
\begin{equation}
\label{gradientdescent}
p_{i+1} = p_i - \epsilon \cdot \nabla f_{cost}(p_i)
\end{equation}
where $\epsilon$ is a factor linking the gradient to
appropriate steps in the parameter space.
\item Repeat steps \ref{computegradient} -- \ref{gradientstep}.
\end{enumerate}
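For the single parameter $c$ of the cubic function, this loop can be
sketched in a few lines of code. This is only a minimal sketch: it
assumes that \varcode{meanSquaredGradientCubic()} takes the data
\varcode{x} and \varcode{y} and the current parameter value as
arguments, and the starting position, $\epsilon$, and the threshold
are set to plausible example values.
\begin{lstlisting}
c = 2.0;         % arbitrary starting position p_0
epsilon = 0.01;  % factor linking the gradient to parameter steps
gradient = meanSquaredGradientCubic(x, y, c);  % assumed signature
while abs(gradient) > 0.1          % threshold on the gradient length
    c = c - epsilon * gradient;    % step against the gradient
    gradient = meanSquaredGradientCubic(x, y, c);
end
\end{lstlisting}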
\Figref{gradientdescentcubicfig} illustrates the gradient descent --- the
path the imaginary ball has chosen to reach the minimum. We walk along
the parameter axis against the gradient as long as the gradient
differs sufficiently from zero. At steep slopes we take large steps
(the distance between the red dots in
\figref{gradientdescentcubicfig} is large).
\begin{exercise}{gradientDescentCubic.m}{}
Implement the gradient descent algorithm for the problem of fitting
a cubic function \eqref{cubicfunc} to some measured data pairs $x$
and $y$ as a function \varcode{gradientDescentCubic()} that returns
the estimated best fitting parameter value $c$ as well as two
vectors with all the parameter values and the corresponding values
of the cost function that the algorithm iterated through. As
additional arguments, the function takes the initial value for the
parameter $c$, the factor $\epsilon$ connecting the gradient with
iteration steps in \eqnref{gradientdescent}, and the threshold value
for the absolute value of the gradient terminating the algorithm.
\end{exercise}
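A possible skeleton for such a function is sketched below. The
argument order and the assumption that \varcode{meanSquaredErrorCubic()}
and \varcode{meanSquaredGradientCubic()} take the data \varcode{x},
\varcode{y} and a parameter value as arguments are suggestions, not
requirements of the exercise.
\begin{lstlisting}
function [c, cs, mses] = gradientDescentCubic(x, y, c0, epsilon, threshold)
    % Gradient descent on the mean squared error of the cubic fit.
    c = c0;      % start with the initial parameter value
    cs = [];     % history of parameter values
    mses = [];   % history of cost values
    gradient = meanSquaredGradientCubic(x, y, c);
    while abs(gradient) > threshold
        cs(end+1) = c;
        mses(end+1) = meanSquaredErrorCubic(x, y, c);
        c = c - epsilon * gradient;   % step against the gradient
        gradient = meanSquaredGradientCubic(x, y, c);
    end
end
\end{lstlisting}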
\begin{exercise}{plotgradientdescentcubic.m}{}
Use the function \varcode{gradientDescentCubic()} to fit the
simulated data from exercise~\ref{mseexercise}. Plot the returned
values of the parameter $c$ as well as the corresponding mean
squared errors as a function of iteration step (two plots). Compare
the result of the gradient descent method with the true value of $c$
used to simulate the data. Inspect the plots and adapt $\epsilon$
and the threshold to make the algorithm behave as intended. Also
plot the data together with the best fitting cubic relation
\eqref{cubicfunc}.
\end{exercise}
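The two plots of the iterated parameter values and cost values could,
for example, be generated along the following lines. The initial
value, $\epsilon$, and the threshold passed to
\varcode{gradientDescentCubic()} are placeholders that need to be
adapted as described in the exercise.
\begin{lstlisting}
[cest, cs, mses] = gradientDescentCubic(x, y, 2.0, 0.0001, 0.1);
subplot(1, 2, 1);       % parameter values over iterations
plot(cs, '-o');
xlabel('iteration');
ylabel('parameter c');
subplot(1, 2, 2);       % mean squared error over iterations
plot(mses, '-o');
xlabel('iteration');
ylabel('mean squared error');
\end{lstlisting}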
\begin{ibox}[tp]{\label{partialderivativebox}Partial derivative and gradient}
Some functions depend on more than a single variable: