[regression] text and example for cost function figure
parent 99a8a9d91e
commit fefe1c3726
regression/code/plotcubiccosts.m (new file, 9 lines added)
@@ -0,0 +1,9 @@
+cs = 2.0:0.1:8.0;
+mses = zeros(size(cs));
+for i = 1:length(cs)
+    mses(i) = meanSquaredErrorCubic(x, y, cs(i));
+end
+
+plot(cs, mses)
+xlabel('c')
+ylabel('mean squared error')
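The script assumes that the data vectors x and y are already in the workspace and that meanSquaredErrorCubic() from the previous exercise is on the path. A minimal sketch of that helper, assuming the chapter's cubic y = c*x^3 (the actual implementation belongs to the previous exercise and is not part of this commit):

function mse = meanSquaredErrorCubic(x, y, c)
    % mean squared deviation between the measured y values
    % and the cubic c*x^3 evaluated at the same x values
    mse = mean((y - c * x.^3).^2);
end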
@@ -154,12 +154,11 @@ function $f_{cost}(c)$ that maps the parameter value $c$ to a scalar
 error value. For a given data set we thus can simply plot the cost
 function as a function of $c$ (\figref{cubiccostfig}).
 
-\begin{exercise}{errorSurface.m}{}\label{errorsurfaceexercise}
-Then calculate the mean squared error between the data and straight
-lines for a range of slopes and intercepts using the
+\begin{exercise}{plotcubiccosts.m}{}
+Calculate the mean squared error between the data and the cubic
+function for a range of values of the factor $c$ using the
 \varcode{meanSquaredErrorCubic()} function from the previous
-exercise. Illustrate the error surface using the \code{surface()}
-function.
+exercise. Plot the graph of the cost function.
 \end{exercise}
 
 \begin{figure}[t]
@@ -177,20 +176,22 @@ function as a function of $c$ (\figref{cubiccostfig}).
 
 By looking at the plot of the cost function we can visually identify
 the position of the minimum and thus estimate the optimal value for
-the parameter $c$. How can we use the error surface to guide an
+the parameter $c$. How can we use the cost function to guide an
 automatic optimization process?
 
-The obvious approach would be to calculate the error surface for any
-combination of slope and intercept values and then find the position
-of the minimum using the \code{min} function. This approach, however
-has several disadvantages: (i) it is computationally very expensive to
-calculate the error for each parameter combination. The number of
-combinations increases exponentially with the number of free
-parameters (also known as the ``curse of dimensionality''). (ii) the
-accuracy with which the best parameters can be estimated is limited by
-the resolution used to sample the parameter space. The coarser the
-parameters are sampled the less precise is the obtained position of
-the minimum.
+The obvious approach would be to calculate the mean squared error for
+a range of parameter values and then find the position of the minimum
+using the \code{min} function. This approach, however, has several
+disadvantages: (i) the accuracy of the estimation of the best
+parameter is limited by the resolution used to sample the parameter
+space. The coarser the parameters are sampled, the less precise is the
+obtained position of the minimum (\figref{cubiccostfig}, right). (ii)
+the range of parameter values might not include the absolute minimum.
+(iii) in particular for functions with more than a single free
+parameter it is computationally expensive to calculate the cost
+function for each parameter combination. The number of combinations
+increases exponentially with the number of free parameters. This is
+known as the \enterm{curse of dimensionality}.
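For this one-dimensional problem the brute-force approach described above is only a few lines of code, assuming cs and mses have been computed as in plotcubiccosts.m:

% brute-force estimate: take the sampled value of c with the smallest error
[minmse, minidx] = min(mses);
cest = cs(minidx);
fprintf('estimated c = %.2f (mse = %.3f)\n', cest, minmse)

The precision of cest is limited by the 0.1 step size of cs, which is disadvantage (i), and the estimate can never leave the sampled range from 2.0 to 8.0, which is disadvantage (ii).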
 
 So we need a different approach. We want a procedure that finds the
 minimum of the cost function with a minimal number of computations and
@@ -206,22 +207,22 @@ to arbitrary precision.
 m = \frac{f(x + \Delta x) - f(x)}{\Delta x}
 \end{equation}
 of a function $y = f(x)$ is the slope of the secant (red) defined
-by the points $(x,f(x))$ and $(x+\Delta x,f(x+\Delta x))$ with the
+by the points $(x,f(x))$ and $(x+\Delta x,f(x+\Delta x))$ at
 distance $\Delta x$.
 
-The slope of the function $y=f(x)$ at the position $x$ (yellow) is
+The slope of the function $y=f(x)$ at the position $x$ (orange) is
 given by the derivative $f'(x)$ of the function at that position.
 It is defined by the difference quotient in the limit of
-infinitesimally (orange) small distances $\Delta x$:
+infinitesimally (red and yellow) small distances $\Delta x$:
 \begin{equation}
 \label{derivative}
 f'(x) = \frac{{\rm d} f(x)}{{\rm d}x} = \lim\limits_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x} \end{equation}
 \end{minipage}\vspace{2ex}
 
-It is not possible to calculate the derivative, \eqnref{derivative},
-numerically. The derivative can only be estimated using the
-difference quotient, \eqnref{difffrac}, by using sufficiently small
-$\Delta x$.
+It is not possible to calculate the exact value of the derivative,
+\eqnref{derivative}, numerically. The derivative can only be
+estimated by computing the difference quotient, \eqnref{difffrac},
+using sufficiently small $\Delta x$.
 \end{ibox}
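A minimal sketch of such a numerical estimate, using f(x) = x^3 at x = 2 as an arbitrary example (not taken from the text):

f = @(x) x.^3;                     % example function
x0 = 2.0;
dx = 1e-6;                         % sufficiently small step
dfdx = (f(x0 + dx) - f(x0)) / dx;  % difference quotient
% dfdx is close to the exact derivative f'(x0) = 3*x0^2 = 12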
 
 \begin{ibox}[tp]{\label{partialderivativebox}Partial derivative and gradient}