[regression] text and example for cost function figure

This commit is contained in:
Jan Benda 2020-12-18 13:10:53 +01:00
parent 99a8a9d91e
commit fefe1c3726
2 changed files with 34 additions and 24 deletions

View File

@ -0,0 +1,9 @@
cs = 2.0:0.1:8.0;              % sampled values of the parameter c
mses = zeros(1, length(cs));   % row vector; zeros(length(cs)) would create a square matrix
for i = 1:length(cs)
    mses(i) = meanSquaredErrorCubic(x, y, cs(i));
end
plot(cs, mses)
xlabel('c')
ylabel('mean squared error')

View File

@ -154,12 +154,11 @@ function $f_{cost}(c)$ that maps the parameter value $c$ to a scalar
error value. For a given data set we thus can simply plot the cost
function as a function of $c$ (\figref{cubiccostfig}).
\begin{exercise}{plotcubiccosts.m}{}
Calculate the mean squared error between the data and the cubic
function for a range of values of the factor $c$ using the
\varcode{meanSquaredErrorCubic()} function from the previous
exercise. Plot the graph of the cost function.
\end{exercise}
\begin{figure}[t]
@ -177,20 +176,22 @@ function as a function of $c$ (\figref{cubiccostfig}).
By looking at the plot of the cost function we can visually identify
the position of the minimum and thus estimate the optimal value for
the parameter $c$. How can we use the cost function to guide an
automatic optimization process?
The obvious approach would be to calculate the mean squared error for
a range of parameter values and then find the position of the minimum
using the \code{min} function. This approach, however, has several
disadvantages: (i) the accuracy with which the best parameter can be
estimated is limited by the resolution used to sample the parameter
space. The coarser the parameter values are sampled, the less
precisely the position of the minimum can be determined
(\figref{cubiccostfig}, right). (ii) The sampled range of parameter
values might not contain the absolute minimum at all. (iii) In
particular for functions with more than a single free parameter it is
computationally expensive to calculate the cost function for each
parameter combination, because the number of combinations increases
exponentially with the number of free parameters. This is known as
the \enterm{curse of dimensionality}.
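A minimal sketch of this brute-force approach, reusing the loop from
the \varcode{plotcubiccosts.m} script shown above (the data vectors
\varcode{x} and \varcode{y} and the \varcode{meanSquaredErrorCubic()}
function are assumed to be available as in the previous exercise):
\begin{verbatim}
cs = 2.0:0.1:8.0;                 % sampled values of the parameter c
mses = zeros(1, length(cs));
for i = 1:length(cs)
    mses(i) = meanSquaredErrorCubic(x, y, cs(i));
end
[minmse, mini] = min(mses);       % smallest cost value and its index
cbest = cs(mini);                 % estimate of the best parameter,
                                  % accurate only to the step size of cs
\end{verbatim}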
So we need a different approach. We want a procedure that finds the
minimum of the cost function with a minimal number of computations and
@ -206,22 +207,22 @@ to arbitrary precision.
m = \frac{f(x + \Delta x) - f(x)}{\Delta x}
\end{equation}
of a function $y = f(x)$ is the slope of the secant (red) defined
by the points $(x,f(x))$ and $(x+\Delta x,f(x+\Delta x))$ at
distance $\Delta x$.
The slope of the function $y=f(x)$ at the position $x$ (orange) is
given by the derivative $f'(x)$ of the function at that position.
It is defined by the difference quotient in the limit of
infinitesimally (red and yellow) small distances $\Delta x$:
\begin{equation}
\label{derivative}
f'(x) = \frac{{\rm d} f(x)}{{\rm d}x} = \lim\limits_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x} \end{equation}
\end{minipage}\vspace{2ex}
It is not possible to calculate the exact value of the derivative,
\eqnref{derivative}, numerically. The derivative can only be
estimated by computing the difference quotient, \eqnref{difffrac},
with sufficiently small $\Delta x$ (a small numerical example is
sketched below this box).
\end{ibox}
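As a minimal sketch of such a numerical estimate (the example
function and the step sizes are arbitrary choices for illustration),
the difference quotient \eqnref{difffrac} approaches the true
derivative as $\Delta x$ is made smaller:
\begin{verbatim}
f = @(x) x.^3;                  % example function with f'(x) = 3*x^2
x0 = 1.0;                       % position at which to estimate the slope
for dx = [0.1, 0.01, 0.001]     % decreasing step sizes
    m = (f(x0 + dx) - f(x0)) / dx;      % difference quotient
    fprintf('dx = %g: slope = %.4f (exact: 3)\n', dx, m);
end
\end{verbatim}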
\begin{ibox}[tp]{\label{partialderivativebox}Partial derivative and gradient}