[regression] gradient descent code

Jan Benda 2020-12-19 12:20:18 +01:00
parent 4b624fe981
commit 7518f9dd47
3 changed files with 99 additions and 29 deletions


@ -0,0 +1,25 @@
function [c, cs, mses] = gradientDescentCubic(x, y, c0, epsilon, threshold)
% Gradient descent for fitting a cubic relation.
%
% Arguments: x, vector of the x-data values.
%            y, vector of the corresponding y-data values.
%            c0, initial value for the parameter c.
%            epsilon, factor multiplying the gradient in each iteration step.
%            threshold, minimum absolute value of the gradient;
%                       the iteration stops once the gradient falls below it.
%
% Returns: c, the final value of the parameter c.
%          cs, vector with all the c-values traversed.
%          mses, vector with the corresponding mean squared errors.
  c = c0;
  gradient = 1000.0;  % some large value to make sure the loop is entered
  cs = [];
  mses = [];
  count = 1;
  while abs(gradient) > threshold
    cs(count) = c;                                 % store the current parameter value
    mses(count) = meanSquaredErrorCubic(x, y, c);  % ... and the corresponding cost
    gradient = meanSquaredGradientCubic(x, y, c);  % gradient at the current position
    c = c - epsilon * gradient;                    % step against the gradient
    count = count + 1;
  end
end
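The two helper functions called in the loop, meanSquaredErrorCubic() and meanSquaredGradientCubic(), are defined elsewhere in the repository and are not part of this commit. A minimal sketch of how they could look, assuming the cubic relation y = c*x.^3 and a finite-difference approximation of the gradient (the actual implementations may differ; each function would go in its own m-file):

function mse = meanSquaredErrorCubic(x, y, c)
% Mean squared error between the data y and the cubic prediction c*x.^3.
  mse = mean((y - c * x.^3).^2);
end

function dmsedc = meanSquaredGradientCubic(x, y, c)
% Derivative of the mean squared error with respect to the parameter c,
% approximated by a forward finite difference.
  h = 1e-7;  % small step in c for the finite difference
  dmsedc = (meanSquaredErrorCubic(x, y, c + h) - meanSquaredErrorCubic(x, y, c)) / h;
end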


@ -0,0 +1,29 @@
meansquarederrorline % generate data (defines x, y, and the true parameter value c)
c0 = 2.0;
eps = 0.0001;
thresh = 0.1;
[cest, cs, mses] = gradientDescentCubic(x, y, c0, eps, thresh);
subplot(2, 2, 1); % top left panel
hold on;
plot(cs, '-o');
plot([1, length(cs)], [c, c], 'k'); % line indicating true c value
hold off;
xlabel('Iteration');
ylabel('C');
subplot(2, 2, 3); % bottom left panel
plot(mses, '-o');
xlabel('Iteration steps');
ylabel('MSE');
subplot(1, 2, 2); % right panel
hold on;
% generate x-values for plotting the fit:
xx = min(x):0.01:max(x);
yy = cest * xx.^3;
plot(xx, yy, 'displayname', 'fit');
plot(x, y, 'o', 'displayname', 'data'); % plot original data
xlabel('Size [m]');
ylabel('Weight [kg]');
legend('location', 'northwest');
pause
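The script relies on meansquarederrorline to define the vectors x and y and the true parameter value c. A hypothetical stand-in for that script, useful for trying out the plots when it is not available; the true c, the x-range, and the noise level are assumptions, not values taken from the original script:

% hypothetical stand-in for the meansquarederrorline script:
n = 100;                            % number of simulated data points
c = 6.0;                            % assumed true parameter value
x = 2.0 + 2.0 * rand(n, 1);         % simulated x-values, assumed range
y = c * x.^3 + 20.0 * randn(n, 1);  % cubic relation plus additive noise, assumed level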


@ -269,9 +269,21 @@ the hill, we choose the opposite direction.
\end{exercise}
\section{Gradient descent}
\begin{figure}[t]
\includegraphics{cubicmse}
\titlecaption{Gradient descent.}{The algorithm starts at an
arbitrary position. At each point the gradient is estimated and
the position is updated as long as the length of the gradient is
sufficiently large. The dots show the positions after each
iteration of the algorithm.} \label{gradientdescentcubicfig}
\end{figure}
Finally, we are able to implement the optimization itself. By now it
should be obvious why it is called the gradient descent method. From a
starting position on we iteratively walk down the slope of the cost
function against its gradient. All ingredients necessary for this
algorithm are already there. We need: (i) the cost function
(\varcode{meanSquaredErrorCubic()}), and (ii) the gradient
(\varcode{meanSquaredGradientCubic()}). The algorithm of the gradient
descent works as follows:
@ -292,41 +304,45 @@ descent works as follows:
\item \label{gradientstep} If the length of the gradient exceeds the
threshold we take a small step in the opposite direction (a short
numerical example follows this list):
\begin{equation}
\label{gradientdescent}
p_{i+1} = p_i - \epsilon \cdot \nabla f_{cost}(p_i)
\end{equation}
where $\epsilon$ is a factor linking the gradient to
appropriate steps in the parameter space.
\item Repeat steps \ref{computegradient} -- \ref{gradientstep}.
\end{enumerate}
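For example, with $\epsilon = 0.0001$ a gradient of
$\nabla f_{cost}(p_i) = 20$ moves the parameter by
$-\epsilon \cdot 20 = -0.002$, whereas a gradient of $-20$ moves it by
$+0.002$; in both cases this is a small step downhill.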
\Figref{gradientdescentcubicfig} illustrates the gradient descent --- the
path the imaginary ball has chosen to reach the minimum. We walk along
the parameter axis against the gradient as long as the gradient
differs sufficiently from zero. At steep slopes we take large steps
(the distance between the red dots in \figref{gradientdescentcubicfig}
is large).
\begin{exercise}{gradientDescentCubic.m}{}
Implement the gradient descent algorithm for the problem of fitting
a cubic function \eqref{cubicfunc} to some measured data pairs $x$
and $y$ as a function \varcode{gradientDescentCubic()} that returns
the estimated best fitting parameter value $c$ as well as two
vectors with all the parameter values and the corresponding values
of the cost function that the algorithm iterated through. As
additional arguments, the function takes the initial value for the
parameter $c$, the factor $\epsilon$ connecting the gradient with
iteration steps in \eqnref{gradientdescent}, and the threshold value
for the absolute value of the gradient below which the algorithm terminates.
\end{exercise}
\begin{exercise}{plotgradientdescentcubic.m}{}
Use the function \varcode{gradientDescentCubic()} to fit the
simulated data from exercise~\ref{mseexercise}. Plot the returned
values of the parameter $c$ as well as the corresponding mean
squared errors as a function of iteration step (two plots). Compare
the result of the gradient descent method with the true value of $c$
used to simulate the data. Inspect the plots and adapt $\epsilon$
and the threshold to make the algorithm behave as intended. Also
plot the data together with the best fitting cubic relation
\eqref{cubicfunc}.
\end{exercise}
\begin{ibox}[tp]{\label{partialderivativebox}Partial derivative and gradient}
Some functions that depend on more than a single variable: