fixed many index entries
@@ -33,7 +33,7 @@ fitting approaches. We will apply this method to find the combination
of slope and intercept that best describes the system.
\section{The error function --- mean squared error}
Before the optimization can be done we need to specify what is
considered an optimal fit. In our example we search the parameter
@@ -57,25 +57,23 @@ $\sum_{i=1}^N |y_i - y^{est}_i|$. The total error can only be small if
all deviations are indeed small, no matter whether they are above or below
the predicted line. Instead of the sum we could also ask for the
\emph{average}
\begin{equation}
\label{meanabserror}
f_{dist}(\{(x_i, y_i)\}|\{y^{est}_i\}) = \frac{1}{N} \sum_{i=1}^N |y_i - y^{est}_i|
\end{equation}
to be small. Commonly, the \enterm{mean squared distance} or the
\enterm[square error!mean]{mean square error} (\determ[quadratischer Fehler!mittlerer]{mittlerer quadratischer Fehler})
\begin{equation}
\label{meansquarederror}
f_{mse}(\{(x_i, y_i)\}|\{y^{est}_i\}) = \frac{1}{N} \sum_{i=1}^N (y_i - y^{est}_i)^2
\end{equation}
is used (\figref{leastsquareerrorfig}). Similar to the absolute
distance, the square of the error ($(y_i - y_i^{est})^2$) is always
positive, so error values do not cancel out. The square further punishes
large deviations.
\begin{exercise}{meanSquareError.m}{}\label{mseexercise}%
Implement a function \varcode{meanSquareError()} that calculates the
\emph{mean square distance} between a vector of observations ($y$)
and respective predictions ($y^{est}$).
\end{exercise}
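One possible sketch of such a function (not the only solution; it
assumes that \varcode{y} and \varcode{yest} are vectors of the same
length) could look like this:

\begin{lstlisting}
function mse = meanSquareError(y, yest)
% Mean squared error between observations y and predictions yest.
mse = mean((y - yest).^2);  % squared deviations, averaged over all data points
end
\end{lstlisting}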
@@ -84,18 +82,19 @@ large deviations.

\section{\tr{Objective function}{Zielfunktion}}
$f_{cost}(\{(x_i, y_i)\}|\{y^{est}_i\})$ is a so-called
\enterm{objective function} or \enterm{cost function}
(\determ{Kostenfunktion}). We aim to adapt the model parameters to
minimize the error (mean square error) and thus the \emph{objective
function}. In Chapter~\ref{maximumlikelihoodchapter} we will show
that the minimization of the mean square error is equivalent to
maximizing the likelihood that the observations originate from the
model (assuming a normal distribution of the data around the model
prediction).
\begin{figure}[t]
\includegraphics[width=1\textwidth]{linear_least_squares}
\titlecaption{Estimating the \emph{mean square error}.} {The
deviation (\enterm{error}, orange) between the prediction (red
line) and the observations (blue dots) is calculated for each data
point (left). Then the deviations are squared and the average is
calculated (right).}
@@ -119,11 +118,13 @@ Replacing $y^{est}$ with the linear equation (the model) in
That is, the mean square error is given by the pairs $(x_i, y_i)$ and the
parameters $m$ and $b$ of the linear equation. The optimization
process tries to optimize $m$ and $b$ such that the error is
minimized, the method of the \enterm[square error!least]{least square
error} (\determ[quadratischer Fehler!kleinster]{Methode der
kleinsten Quadrate}).
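Substituting the linear model $y^{est}_i = m x_i + b$ into
\eqnref{meansquarederror} makes the dependence of the error on the two
parameters explicit:

\[ f_{mse}(m, b) = \frac{1}{N} \sum_{i=1}^N \left( y_i - (m x_i + b) \right)^2 \]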
\begin{exercise}{lsqError.m}{}
Implement the objective function \varcode{lsqError()} that applies the
linear equation as a model.
\begin{itemize}
\item The function takes three arguments. The first is a 2-element
@@ -131,7 +132,7 @@ error, the method of the \enterm{least square error}.
\varcode{b}. The second is a vector of $x$-values; the third contains
the measurements for each value of $x$, the respective $y$-values.
\item The function returns the mean square error \eqnref{mseline}.
\item The function should call the function \varcode{meanSquareError()}
defined in the previous exercise to calculate the error.
\end{itemize}
\end{exercise}
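Under the conventions of this exercise (parameter vector first, then the
$x$- and $y$-data), one possible sketch of such an objective function is:

\begin{lstlisting}
function mse = lsqError(p, x, y)
% Objective function: mean squared error of the straight line
% y = p(1)*x + p(2) with respect to the measured y-values.
yest = p(1) .* x + p(2);         % predictions of the linear model
mse = meanSquareError(y, yest);  % error function from the previous exercise
end
\end{lstlisting}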
@@ -165,7 +166,7 @@ third dimension is used to indicate the error value
\varcode{y}). Implement a script \file{errorSurface.m} that
calculates the mean square error between data and a linear model and
illustrates the error surface using the \code{surf()} function
(consult the help to find out how to use \code{surf()}).
\end{exercise}
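One possible version of such a script is sketched below. The grid
ranges and step sizes are arbitrary choices for illustration, and the
measured data are assumed to be available in the vectors \varcode{x}
and \varcode{y}:

\begin{lstlisting}
% errorSurface.m : mean squared error as a function of slope and intercept
% assumes vectors x and y with the measured data are already defined
slopes = -5:0.25:5;
intercepts = -30:1:30;
errors = zeros(length(slopes), length(intercepts));
for i = 1:length(slopes)
    for j = 1:length(intercepts)
        errors(i, j) = lsqError([slopes(i), intercepts(j)], x, y);
    end
end
surf(intercepts, slopes, errors)  % error surface over the parameter plane
xlabel('intercept b')
ylabel('slope m')
zlabel('mean squared error')
\end{lstlisting}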
By looking at the error surface we can directly see the position of
@@ -257,7 +258,7 @@ way to the minimum of the objective function. The ball will always
follow the steepest slope. Thus we need to figure out the direction of
the steepest slope at the position of the ball.

The \entermde{Gradient}{gradient} (Box~\ref{partialderivativebox}) of the
objective function is the vector
\[ \nabla f_{cost}(m,b) = \left( \frac{\partial f(m,b)}{\partial m},
\frac{\partial f(m,b)}{\partial b} \right) \]
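The partial derivatives can be approximated numerically by difference
quotients. A minimal sketch along these lines (anticipating the
\varcode{lsqGradient()} function of the exercise below and assuming the
\varcode{lsqError()} objective function from above; an analytical
gradient would work just as well) could be:

\begin{lstlisting}
function gradient = lsqGradient(p, x, y)
% Gradient of the mean squared error with respect to the parameters
% p = [m, b], approximated by finite differences.
h = 1e-6;  % small step for the difference quotients
dm = (lsqError([p(1)+h, p(2)], x, y) - lsqError(p, x, y)) / h;
db = (lsqError([p(1), p(2)+h], x, y) - lsqError(p, x, y)) / h;
gradient = [dm, db];
end
\end{lstlisting}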
@@ -296,7 +297,7 @@ choose the opposite direction.
\end{figure}

\begin{exercise}{lsqGradient.m}{}\label{gradientexercise}%
Implement a function \varcode{lsqGradient()} that takes the set of
parameters $(m, b)$ of the linear equation as a two-element vector
and the $x$- and $y$-data as input arguments. The function should
return the gradient at that position.
@@ -316,8 +317,8 @@ choose the opposite direction.

Finally, we are able to implement the optimization itself. By now it
should be obvious why it is called the gradient descent method. All
ingredients are already there. We need: 1. the error function
(\varcode{meanSquareError()}), 2. the objective function
(\varcode{lsqError()}), and 3. the gradient (\varcode{lsqGradient()}). The
algorithm of the gradient descent is:
\begin{enumerate}