[regression] text and example for cost function figure

This commit is contained in:
Jan Benda 2020-12-18 13:10:53 +01:00
parent 99a8a9d91e
commit fefe1c3726
2 changed files with 34 additions and 24 deletions

View File

@ -0,0 +1,9 @@
cs = 2.0:0.1:8.0;              % sampled values of the parameter c
mses = zeros(1, length(cs));   % row vector; zeros(length(cs)) would create a square matrix
for i = 1:length(cs)
    mses(i) = meanSquaredErrorCubic(x, y, cs(i));
end
plot(cs, mses)
xlabel('c')
ylabel('mean squared error')

View File

@ -154,12 +154,11 @@ function $f_{cost}(c)$ that maps the parameter value $c$ to a scalar
error value. For a given data set we thus can simply plot the cost
function as a function of $c$ (\figref{cubiccostfig}).
\begin{exercise}{plotcubiccosts.m}{}
Calculate the mean squared error between the data and the cubic
function for a range of values of the factor $c$ using the
\varcode{meanSquaredErrorCubic()} function from the previous
exercise. Plot the graph of the cost function.
\end{exercise}
\begin{figure}[t]
@ -177,20 +176,22 @@ function as a function of $c$ (\figref{cubiccostfig}).
By looking at the plot of the cost function we can visually identify
the position of the minimum and thus estimate the optimal value for
the parameter $c$. How can we use the cost function to guide an
automatic optimization process?
The obvious approach would be to calculate the mean squared error for
a range of parameter values and then find the position of the minimum
using the \code{min} function. This approach, however, has several
disadvantages: (i) the accuracy with which the best parameter can be
estimated is limited by the resolution used to sample the parameter
space. The coarser the parameter values are sampled, the less
precisely the position of the minimum can be determined
(\figref{cubiccostfig}, right). (ii) The sampled range of parameter
values might not contain the absolute minimum at all. (iii) In
particular for functions with more than a single free parameter it is
computationally expensive to calculate the cost function for each
parameter combination, because the number of combinations increases
exponentially with the number of free parameters. This is known as
the \enterm{curse of dimensionality}.
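A minimal sketch of this brute-force approach, reusing the loop from
the \varcode{plotcubiccosts.m} script shown above (the data vectors
\varcode{x} and \varcode{y} and the \varcode{meanSquaredErrorCubic()}
function are assumed to be available as in the previous exercise):
\begin{verbatim}
cs = 2.0:0.1:8.0;                 % sampled values of the parameter c
mses = zeros(1, length(cs));
for i = 1:length(cs)
    mses(i) = meanSquaredErrorCubic(x, y, cs(i));
end
[minmse, mini] = min(mses);       % smallest cost value and its index
cbest = cs(mini);                 % estimate of the best parameter,
                                  % accurate only to the step size of cs
\end{verbatim}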
So we need a different approach. We want a procedure that finds the
minimum of the cost function with a minimal number of computations and
@ -206,22 +207,22 @@ to arbitrary precision.
m = \frac{f(x + \Delta x) - f(x)}{\Delta x}
\end{equation}
of a function $y = f(x)$ is the slope of the secant (red) defined
by the points $(x,f(x))$ and $(x+\Delta x,f(x+\Delta x))$ at
distance $\Delta x$.
The slope of the function $y=f(x)$ at the position $x$ (orange) is
given by the derivative $f'(x)$ of the function at that position.
It is defined by the difference quotient in the limit of
infinitesimally (red and yellow) small distances $\Delta x$:
\begin{equation}
\label{derivative}
f'(x) = \frac{{\rm d} f(x)}{{\rm d}x} = \lim\limits_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x} \end{equation}
\end{minipage}\vspace{2ex}
It is not possible to calculate the exact value of the derivative,
\eqnref{derivative}, numerically. The derivative can only be
estimated by computing the difference quotient, \eqnref{difffrac},
with sufficiently small $\Delta x$ (a small numerical example is
sketched below this box).
\end{ibox}
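As a minimal sketch of such a numerical estimate (the example
function and the step sizes are arbitrary choices for illustration),
the difference quotient \eqnref{difffrac} approaches the true
derivative as $\Delta x$ is made smaller:
\begin{verbatim}
f = @(x) x.^3;                  % example function with f'(x) = 3*x^2
x0 = 1.0;                       % position at which to estimate the slope
for dx = [0.1, 0.01, 0.001]     % decreasing step sizes
    m = (f(x0 + dx) - f(x0)) / dx;      % difference quotient
    fprintf('dx = %g: slope = %.4f (exact: 3)\n', dx, m);
end
\end{verbatim}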
\begin{ibox}[tp]{\label{partialderivativebox}Partial derivative and gradient}