[regression] gradient descent code
This commit is contained in:
parent 4b624fe981
commit 7518f9dd47
25 regression/code/gradientDescentCubic.m Normal file
@@ -0,0 +1,25 @@
function [c, cs, mses] = gradientDescentCubic(x, y, c0, epsilon, threshold)
% Gradient descent for fitting a cubic relation y = c*x^3.
%
% Arguments: x: vector of the x-data values.
%            y: vector of the corresponding y-data values.
%            c0: initial value for the parameter c.
%            epsilon: factor multiplying the gradient.
%            threshold: minimum absolute value of the gradient;
%                       the iteration stops once the gradient falls below it.
%
% Returns: c: the final value of the parameter c.
%          cs: vector with all the c-values traversed.
%          mses: vector with the corresponding mean squared errors.
    c = c0;
    gradient = 1000.0;  % large initial value ensures the loop is entered
    cs = [];
    mses = [];
    count = 1;
    while abs(gradient) > threshold
        cs(count) = c;
        mses(count) = meanSquaredErrorCubic(x, y, c);
        gradient = meanSquaredGradientCubic(x, y, c);
        c = c - epsilon * gradient;
        count = count + 1;
    end
end
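
The two helper functions called above are not included in this commit. A minimal sketch of what they could look like, assuming the cubic relation y = c*x^3 and the mean squared error as the cost function (names and signatures are taken from the calls above):

function mse = meanSquaredErrorCubic(x, y, c)
% Mean squared error between the data y and the cubic prediction c*x.^3.
    mse = mean((y - c * x.^3).^2);
end

function dmsedc = meanSquaredGradientCubic(x, y, c)
% Analytic derivative of the mean squared error with respect to c:
% d/dc mean((y - c*x.^3).^2) = mean(-2 * x.^3 .* (y - c*x.^3))
    dmsedc = mean(-2.0 * x.^3 .* (y - c * x.^3));
end
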
29 regression/code/plotgradientdescentcubic.m Normal file
@@ -0,0 +1,29 @@
meansquarederrorline  % generate the data (x, y, and the true parameter c)

c0 = 2.0;
epsilon = 0.0001;  % renamed from eps, which shadows the MATLAB built-in
thresh = 0.1;
[cest, cs, mses] = gradientDescentCubic(x, y, c0, epsilon, thresh);

subplot(2, 2, 1);  % top left panel
hold on;
plot(cs, '-o');
plot([1, length(cs)], [c, c], 'k');  % line indicating the true c value
hold off;
xlabel('Iteration');
ylabel('C');

subplot(2, 2, 3);  % bottom left panel
plot(mses, '-o');
xlabel('Iteration steps');
ylabel('MSE');

subplot(1, 2, 2);  % right panel
hold on;
% generate x-values for plotting the fit:
xx = min(x):0.01:max(x);
yy = cest * xx.^3;
plot(xx, yy, 'displayname', 'fit');
plot(x, y, 'o', 'displayname', 'data');  % plot the original data
hold off;
xlabel('Size [m]');
ylabel('Weight [kg]');
legend('location', 'northwest');
pause
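
How quickly the algorithm converges depends strongly on epsilon. A quick probe, as a sketch that is not part of the original script; it reuses x, y, c0, and thresh from above:

% Try several epsilon values and count the iterations each one needs.
% Caution: if epsilon is too large, gradient descent overshoots and
% diverges; the while loop in gradientDescentCubic() then never ends.
for epsilon = [0.00001, 0.0001, 0.0005]
    [cest, cs, mses] = gradientDescentCubic(x, y, c0, epsilon, thresh);
    fprintf('epsilon = %g: %d iterations, c = %.4f\n', ...
            epsilon, length(cs), cest);
end
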
@@ -269,9 +269,21 @@ the hill, we choose the opposite direction.
\end{exercise}

\section{Gradient descent}

\begin{figure}[t]
  \includegraphics{cubicmse}
  \titlecaption{Gradient descent.}{The algorithm starts at an
    arbitrary position. At each point the gradient is estimated and
    the position is updated as long as the length of the gradient is
    sufficiently large. The dots show the positions after each
    iteration of the algorithm.} \label{gradientdescentcubicfig}
\end{figure}

Finally, we are able to implement the optimization itself. By now it
should be obvious why it is called the gradient descent method. From a
starting position we iteratively walk down the slope of the cost
function against its gradient. All ingredients necessary for this
algorithm are already there. We need: (i) the cost function
(\varcode{meanSquaredErrorCubic()}), and (ii) the gradient
(\varcode{meanSquaredGradientCubic()}). The algorithm of the gradient
descent works as follows:

@@ -292,41 +304,45 @@ descent works as follows:
\item \label{gradientstep} If the length of the gradient exceeds the
  threshold we take a small step in the opposite direction:
  \begin{equation}
    \label{gradientdescent}
    p_{i+1} = p_i - \epsilon \cdot \nabla f_{cost}(p_i)
  \end{equation}
  where $\epsilon$ is a factor linking the gradient to
  appropriate steps in the parameter space.
\item Repeat steps \ref{computegradient} -- \ref{gradientstep}.
\end{enumerate}
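
For example, with made-up numbers purely for illustration: starting at
$p_i = 2.0$ with a gradient of $\nabla f_{cost}(p_i) = 4.0$ and
$\epsilon = 0.1$, \eqnref{gradientdescent} yields
\[ p_{i+1} = 2.0 - 0.1 \cdot 4.0 = 1.6 \; , \]
a step towards smaller cost. Since the absolute value of the gradient,
$4.0$, still exceeds a threshold of, say, $0.1$, the algorithm
continues with another iteration at $p_{i+1}$.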
\Figref{gradientdescentcubicfig} illustrates the gradient descent --- the
path the imaginary ball has chosen to reach the minimum. We walk along
the parameter axis against the gradient as long as the gradient
differs sufficiently from zero. At steep slopes we take large steps:
the distance between the red dots in \figref{gradientdescentcubicfig}
is large.

\begin{exercise}{gradientDescentCubic.m}{}
  Implement the gradient descent algorithm for the problem of fitting
  a cubic function \eqref{cubicfunc} to some measured data pairs $x$
  and $y$ as a function \varcode{gradientDescentCubic()} that returns
  the estimated best fitting parameter value $c$ as well as two
  vectors with all the parameter values and the corresponding values
  of the cost function that the algorithm iterated through. As
  additional arguments the function takes the initial value for the
  parameter $c$, the factor $\epsilon$ connecting the gradient with
  iteration steps in \eqnref{gradientdescent}, and the threshold value
  for the absolute value of the gradient terminating the algorithm.
\end{exercise}

\begin{exercise}{plotgradientdescentcubic.m}{}
  Use the function \varcode{gradientDescentCubic()} to fit the
  simulated data from exercise~\ref{mseexercise}. Plot the returned
  values of the parameter $c$ as well as the corresponding mean
  squared errors as a function of the iteration step (two plots).
  Compare the result of the gradient descent method with the true
  value of $c$ used to simulate the data. Inspect the plots and adapt
  $\epsilon$ and the threshold to make the algorithm behave as
  intended. Also plot the data together with the best fitting cubic
  relation \eqref{cubicfunc}.
\end{exercise}

\begin{ibox}[tp]{\label{partialderivativebox}Partial derivative and gradient}
  Some functions depend on more than a single variable: