diff --git a/regression/code/plotcubiccosts.m b/regression/code/plotcubiccosts.m
new file mode 100644
index 0000000..6c90f03
--- /dev/null
+++ b/regression/code/plotcubiccosts.m
@@ -0,0 +1,9 @@
+cs = 2.0:0.1:8.0;                % range of values for the factor c
+mses = zeros(1, length(cs));     % preallocate a row vector for the errors
+for i = 1:length(cs)
+    mses(i) = meanSquaredErrorCubic(x, y, cs(i));
+end
+
+plot(cs, mses)
+xlabel('c')
+ylabel('mean squared error')
diff --git a/regression/lecture/regression.tex b/regression/lecture/regression.tex
index 2990aef..d903a0b 100644
--- a/regression/lecture/regression.tex
+++ b/regression/lecture/regression.tex
@@ -154,12 +154,11 @@ function $f_{cost}(c)$ that maps the parameter value $c$ to a scalar
 error value. For a given data set we thus can simply plot the cost
 function as a function of $c$ (\figref{cubiccostfig}).
 
-\begin{exercise}{errorSurface.m}{}\label{errorsurfaceexercise}
-  Then calculate the mean squared error between the data and straight
-  lines for a range of slopes and intercepts using the
+\begin{exercise}{plotcubiccosts.m}{}
+  Calculate the mean squared error between the data and the cubic
+  function for a range of values of the factor $c$ using the
   \varcode{meanSquaredErrorCubic()} function from the previous
-  exercise. Illustrate the error surface using the \code{surface()}
-  function.
+  exercise. Plot the graph of the cost function.
 \end{exercise}
 
 \begin{figure}[t]
@@ -177,20 +176,22 @@ function as a function of $c$ (\figref{cubiccostfig}).
 
 By looking at the plot of the cost function we can visually identify
 the position of the minimum and thus estimate the optimal value for
-the parameter $c$. How can we use the error surface to guide an
+the parameter $c$. How can we use the cost function to guide an
 automatic optimization process?
 
-The obvious approach would be to calculate the error surface for any
-combination of slope and intercept values and then find the position
-of the minimum using the \code{min} function. This approach, however
-has several disadvantages: (i) it is computationally very expensive to
-calculate the error for each parameter combination. The number of
-combinations increases exponentially with the number of free
-parameters (also known as the ``curse of dimensionality''). (ii) the
-accuracy with which the best parameters can be estimated is limited by
-the resolution used to sample the parameter space. The coarser the
-parameters are sampled the less precise is the obtained position of
-the minimum.
+The obvious approach would be to calculate the mean squared error for
+a range of parameter values and then find the position of the minimum
+using the \code{min} function. This approach, however, has several
+disadvantages: (i) the accuracy of the estimate of the best parameter
+is limited by the resolution used to sample the parameter space. The
+coarser the parameters are sampled, the less precise is the obtained
+position of the minimum (\figref{cubiccostfig}, right). (ii) the range
+of parameter values might not include the absolute minimum. (iii) in
+particular for functions with more than a single free parameter, it is
+computationally expensive to calculate the cost function for each
+parameter combination. The number of combinations increases
+exponentially with the number of free parameters. This is known as the
+\enterm{curse of dimensionality}.
 
 So we need a different approach. We want a procedure that finds the
 minimum of the cost function with a minimal number of computations and
@@ -206,22 +207,22 @@ to arbitrary precision.
     m = \frac{f(x + \Delta x) - f(x)}{\Delta x}
   \end{equation}
   of a function $y = f(x)$ is the slope of the secant (red) defined
-  by the points $(x,f(x))$ and $(x+\Delta x,f(x+\Delta x))$ with the
+  by the points $(x,f(x))$ and $(x+\Delta x,f(x+\Delta x))$ at
   distance $\Delta x$.
 
-  The slope of the function $y=f(x)$ at the position $x$ (yellow) is
+  The slope of the function $y=f(x)$ at the position $x$ (orange) is
   given by the derivative $f'(x)$ of the function at that position.
   It is defined by the difference quotient in the limit of
-  infinitesimally (orange) small distances $\Delta x$:
+  infinitesimally (red and yellow) small distances $\Delta x$:
   \begin{equation}
     \label{derivative}
     f'(x) = \frac{{\rm d} f(x)}{{\rm d}x} = \lim\limits_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x}
   \end{equation}
   \end{minipage}\vspace{2ex}
-  It is not possible to calculate the derivative, \eqnref{derivative},
-  numerically. The derivative can only be estimated using the
-  difference quotient, \eqnref{difffrac}, by using sufficiently small
-  $\Delta x$.
+  It is not possible to calculate the exact value of the derivative,
+  \eqnref{derivative}, numerically. The derivative can only be
+  estimated by computing the difference quotient, \eqnref{difffrac},
+  using sufficiently small $\Delta x$.
 \end{ibox}
 
 \begin{ibox}[tp]{\label{partialderivativebox}Partial derivative and gradient}
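
As an illustration of the brute-force minimum search described in the revised paragraph, a minimal MATLAB sketch could look like the following. It assumes, as in plotcubiccosts.m, that the data vectors x and y are in the workspace and that meanSquaredErrorCubic() from the previous exercise is on the path; the variable names are illustrative only.

% Sketch: read off the best factor c from the sampled cost function.
cs = 2.0:0.1:8.0;                 % sampled parameter values
mses = zeros(1, length(cs));      % row vector of cost values
for i = 1:length(cs)
    mses(i) = meanSquaredErrorCubic(x, y, cs(i));
end
[minmse, minidx] = min(mses);     % smallest cost and its index
cbest = cs(minidx);               % corresponding parameter value
fprintf('best c = %.2f with mse = %.3f\n', cbest, minmse);

Note that cbest is only accurate to the sampling resolution of 0.1 and that the true minimum may lie outside the sampled range, which are exactly the limitations the revised text points out.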
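
The difference quotient of the derivative box can likewise be illustrated by numerically estimating the slope of the cost function at a given c with a small step. Again only a sketch, assuming the same workspace variables and meanSquaredErrorCubic(); the values of c and dc are made up for the example.

% Sketch: estimate the slope of the cost function at c via the
% difference quotient with a small step dc.
c = 3.0;                          % position at which to estimate the slope
dc = 1e-4;                        % small, but not too small, step
msec = meanSquaredErrorCubic(x, y, c);
msecdc = meanSquaredErrorCubic(x, y, c + dc);
dmsedc = (msecdc - msec) / dc;    % approximates the derivative of the cost function at c
fprintf('slope of the cost function at c = %.1f: %g\n', c, dmsedc);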