diff --git a/Makefile b/Makefile index a5054d5..a743d0a 100644 --- a/Makefile +++ b/Makefile @@ -22,9 +22,10 @@ $(BASENAME).pdf : $(BASENAME).tex header.tex $(SUBTEXS) splitindex $(BASENAME).idx index : - pdflatex -interaction=scrollmode $(BASENAME).tex + pdflatex $(BASENAME).tex splitindex $(BASENAME).idx - pdflatex -interaction=scrollmode $(BASENAME).tex | tee /dev/stderr | fgrep -q "Rerun to get cross-references right" && pdflatex $(BASENAME).tex || true + pdflatex $(BASENAME).tex + pdflatex $(BASENAME).tex again : pdflatex $(BASENAME).tex diff --git a/bootstrap/lecture/bootstrap.tex b/bootstrap/lecture/bootstrap.tex index f23f28d..b01b227 100644 --- a/bootstrap/lecture/bootstrap.tex +++ b/bootstrap/lecture/bootstrap.tex @@ -171,7 +171,7 @@ A good example for the application of a assessment of \entermde[correlation]{Korrelation}{correlations}. Given are measured pairs of data points $(x_i, y_i)$. By calculating the \entermde[correlation!correlation -coefficient]{Korrelation!Korrelationskoeffizient}{correlation +coefficient]{Korrelation!-skoeffizient}{correlation coefficient} we can quantify how strongly $y$ depends on $x$. The correlation coefficient alone, however, does not tell whether the correlation is significantly different from a random correlation. The diff --git a/codestyle/lecture/codestyle.tex b/codestyle/lecture/codestyle.tex index 9811d2f..73d5817 100644 --- a/codestyle/lecture/codestyle.tex +++ b/codestyle/lecture/codestyle.tex @@ -1,4 +1,4 @@ -\chapter{\tr{Code style}{Programmierstil}} +\chapter{Code style} \shortquote{Any code of your own that you haven't looked at for six or more months might as well have been written by someone @@ -33,7 +33,7 @@ by calling functions that work on the data and managing the results. Applying this structure makes it easy to understand the flow of the program but two questions remain: (i) How to organize the files on the file system and (ii) how to name them that the controlling -script is easily identified among the other \codeterm{m-files}. +script is easily identified among the other \codeterm[m-file]{m-files}. Upon installation \matlab{} creates a folder called \file{MATLAB} in the user space (Windows: My files, Linux: Documents, MacOS: @@ -43,7 +43,7 @@ moment. Of course, any other location can specified as well. Generally it is of great advantage to store related scripts and functions within the same folder on the hard drive. An easy approach is to create a project-specific folder structure that contains sub-folders for each -task (analysis) and to store all related \codeterm{m-files} +task (analysis) and to store all related \codeterm[m-file]{m-files} (screenshot \ref{fileorganizationfig}). In these task-related folders one may consider to create a further sub-folder to store results (created figures, result data). On the project level a single script @@ -307,8 +307,8 @@ and to briefly explain what they do. Whenever one feels tempted to do this, one could also consider to delegate the respective task to a function. In most cases this is preferable. -Not delegating the tasks leads to very long \codeterm{m-files} which -can be confusing. Sometimes such a code is called ``spaghetti +Not delegating the tasks leads to very long \codeterm[m-file]{m-files} +which can be confusing. Sometimes such a code is called ``spaghetti code''. It is high time to think about delegation of tasks to functions. @@ -323,17 +323,17 @@ functions. 
\end{important} \subsection{Local and nested functions} -Generally, functions live in their own \codeterm{m-files} that have -the same name as the function itself. Delegating tasks to functions -thus leads to a large set of \codeterm{m-files} which increases -complexity and may lead to confusion. If the delegated functionality -is used in multiple instances, it is still advisable to do so. On the -other hand, when the delegated functionality is only used within the -context of another function \matlab{} allows to define -\codeterm[function!local]{local functions} and -\codeterm[function!nested]{nested functions} within the same -file. Listing \ref{localfunctions} shows an example of a local -function definition. +Generally, functions live in their own \codeterm[m-file]{m-files} that +have the same name as the function itself. Delegating tasks to +functions thus leads to a large set of \codeterm[m-file]{m-files} +which increases complexity and may lead to confusion. If the delegated +functionality is used in multiple instances, it is still advisable to +do so. On the other hand, when the delegated functionality is only +used within the context of another function \matlab{} allows to define +\entermde[function!local]{Funktion!lokale}{local functions} and +\entermde[function!nested]{Funktion!verschachtelte}{nested functions} +within the same file. Listing \ref{localfunctions} shows an example of +a local function definition. \pagebreak[3] \lstinputlisting[label=localfunctions, caption={Example for local functions.}]{calculateSines.m} @@ -408,11 +408,12 @@ advisable to adhere to these. Repeated tasks should (to be read as must) be delegated to functions. In cases in which a function is only locally applied and not of more global interest across projects consider to define it as -\codeterm[function!local]{local function} or -\codeterm[function!nested]{nested function}. Taking care to increase -readability and comprehensibility pays off, even to the author! -\footnote{Reading tip: Robert C. Martin: \textit{Clean Code: A Handbook of - Agile Software Craftmanship}, Prentice Hall} +\entermde[function!local]{Funktion!lokale}{local function} or +\entermde[function!nested]{Funktion!verschachtelte}{nested + function}. Taking care to increase readability and comprehensibility +pays off, even to the author! \footnote{Reading tip: Robert + C. Martin: \textit{Clean Code: A Handbook of Agile Software + Craftmanship}, Prentice Hall} \shortquote{Programs must be written for people to read, and only incidentally for machines to execute.}{Abelson / Sussman} diff --git a/debugging/lecture/debugging.tex b/debugging/lecture/debugging.tex index e7d213e..6c70fab 100644 --- a/debugging/lecture/debugging.tex +++ b/debugging/lecture/debugging.tex @@ -49,13 +49,14 @@ obscure logical errors! Take care when using the \codeterm{try-catch \end{important} -\subsection{\codeterm{Syntax errors}}\label{syntax_error} -The most common and easiest to fix type of error. A syntax error -violates the rules (spelling and grammar) of the programming -language. For example every opening parenthesis must be matched by a -closing one or every \code{for} loop has to be closed by an -\code{end}. Usually, the respective error messages are clear and -the editor will point out and highlight most \codeterm{syntax error}s. +\subsection{Syntax errors}\label{syntax_error} +The most common and easiest to fix type of error. 
A
+\entermde[error!syntax]{Fehler!Syntax\~}{syntax error} violates the
+rules (spelling and grammar) of the programming language. For example
+every opening parenthesis must be matched by a closing one or every
+\code{for} loop has to be closed by an \code{end}. Usually, the
+respective error messages are clear and the editor will point out and
+highlight most syntax errors.

\begin{lstlisting}[label=syntaxerror, caption={Unbalanced parenthesis error.}]
>> mean(random_numbers
@@ -66,8 +67,9 @@ Did you mean:
>> mean(random_numbers)
\end{lstlisting}

-\subsection{\codeterm{Indexing error}}\label{index_error}
-Second on the list of common errors are the indexing errors. Usually
+\subsection{Indexing error}\label{index_error}
+Second on the list of common errors are the
+\entermde[error!indexing]{Fehler!Index\~}{indexing errors}. Usually
\matlab{} gives rather precise information about the cause, once you
know what they mean. Consider the following code.

@@ -111,14 +113,16 @@ to a number and uses this number to address the element in
\varcode{my\_array}. The \codeterm{char} has the ASCII code 65 and
thus the 65th element of \varcode{my\_array} is returned.

-\subsection{\codeterm{Assignment error}}
-Related to the Indexing error, an assignment error occurs when we want
-to write data into a variable, that does not fit into it. Listing
-\ref{assignmenterror} shows the simple case for 1-d data but, of
-course, it extents to n-dimensional data. The data that is to be
-filled into a matrix hat to fit in all dimensions. The command in line
-7 works due to the fact, that matlab automatically extends the matrix,
-if you assign values to a range outside its bounds.
+\subsection{Assignment error}
+Related to the indexing error, an
+\entermde[error!assignment]{Fehler!Zuweisungs\~}{assignment error}
+occurs when we want to write data into a variable that cannot hold
+it. Listing \ref{assignmenterror} shows the simple case for 1-d
+data but, of course, it extends to n-dimensional data. The data that
+is to be filled into a matrix has to fit in all dimensions. The
+command in line 7 works because \matlab{} automatically extends the
+matrix if you assign values to a range outside its bounds.

\begin{lstlisting}[label=assignmenterror, caption={Assignment errors.}]
>> a = zeros(1, 100);
@@ -133,21 +137,20 @@ ans =
   110     1
\end{lstlisting}

-\subsection{\codeterm{Dimension mismatch error}}
+\subsection{Dimension mismatch error}
Similarly, some arithmetic operations are only valid if the variables
fulfill some size constraints. Consider the following commands
(listing\,\ref{dimensionmismatch}). The first one (line 3) fails
-because we are trying to do al elementwise add on two vectors that
-have different lengths, respectively sizes. The matrix multiplication
-in line 6 also fails since for this operations to succeed the inner
-matrix dimensions must agree (for more information on the
-matrixmultiplication see box\,\ref{matrixmultiplication} in
-chapter\,\ref{programming}). The elementwise multiplication issued in
-line 10 fails for the same reason as the addition we tried
-earlier. Sometimes, however, things apparently work but the result may
-be surprising. The last operation in listing\,\ref{dimensionmismatch}
-does not throw an error but the result is something else than the
-expected elementwise multiplication.
+because we are trying to add two vectors of different lengths
+elementwise.
The matrix multiplication in line 6 also fails since for
+this operation to succeed the inner matrix dimensions must agree (for
+more information on matrix multiplication see
+box\,\ref{matrixmultiplication} in chapter\,\ref{programming}). The
+elementwise multiplication issued in line 10 fails for the same reason
+as the addition we tried earlier. Sometimes, however, things
+apparently work but the result may be surprising. The last operation
+in listing\,\ref{dimensionmismatch} does not throw an error but the
+result is something other than the expected elementwise multiplication.

% XXX Some arithmetic operations make size constraints, violating them leads to dimension mismatch errors.
\begin{lstlisting}[label=dimensionmismatch, caption={Dimension mismatch errors.}]
@@ -174,7 +177,8 @@ expected elementwise multiplication.
\section{Logical error}
Sometimes a program runs smoothly and terminates without any
complaint. This, however, does not necessarily mean that the program
-is correct. We may have made a \codeterm{logical error}. Logical
+is correct. We may have made a
+\entermde[error!logical]{Fehler!logischer}{logical error}. Logical
errors are hard to find, \matlab{} has no chance to detect such
errors since they do not violate the syntax or cause the throwing of
an error. Thus, we are on our own to find and fix the bug. There are a
@@ -283,7 +287,7 @@ validity.

Matlab offers a unit testing framework in which small scripts are
written that test the features of the program. We will follow the
example given in the \matlab{} help and assume that there is a
-function \code{rightTriangle} (listing\,\ref{trianglelisting}).
+function \varcode{rightTriangle()} (listing\,\ref{trianglelisting}).

% XXX Slightly more readable version of the example given in the \matlab{} help system. Note: The variable name for the angles have been capitalized in order to not override the matlab defined functions \code{alpha, beta,} and \code{gamma}.
\begin{lstlisting}[label=trianglelisting, caption={Example function for unit testing.}]
@@ -308,7 +312,7 @@ folder that follows the following rules.
\item The name of the script file must start or end with the word
  'test', which is case-insensitive.
\item Each unit test should be placed in a separate section/cell of the script.
-\item After the \code{\%\%} that defines the cell, a name for the
+\item After the \mcode{\%\%} that defines the cell, a name for the
  particular unit test may be given.
\end{enumerate}

@@ -328,11 +332,11 @@ Further there are a few things that are different in tests compared to normal sc
  tests.
\end{enumerate}

-The test script for the \code{rightTrianlge} function
+The test script for the \varcode{rightTriangle()} function
(listing\,\ref{trianglelisting}) may look like in
listing\,\ref{testscript}.

-\begin{lstlisting}[label=testscript, caption={Unit test for the \code{rightTriangle} function stored in an m-file testRightTriangle.m}]
+\begin{lstlisting}[label=testscript, caption={Unit test for the \varcode{rightTriangle()} function stored in an m-file testRightTriangle.m}]
tolerance = 1e-10;

% preconditions
@@ -372,7 +376,7 @@ assert(abs(approx - smallAngle) <= tolerance, 'Problem with small angle approxim

In a test script we can execute any code. The actual test whether or
not the results match our predictions is done using the
-\code{assert()}{assert} function. This function basically expects a
+\code{assert()} function.
This function basically expects a boolean value and if this is not true, it raises an error that, in the context of the test does not lead to a termination of the program. In the tests above, the argument to assert is always a boolean expression @@ -392,7 +396,7 @@ result = runtests('testRightTriangle') During the run, \matlab{} will put out error messages onto the command line and a summary of the test results is then stored within the \varcode{result} variable. These can be displayed using the function -\code{table(result)} +\code[table()]{table(result)}. \begin{lstlisting}[label=testresults, caption={The test results.}, basicstyle=\ttfamily\scriptsize] table(result) @@ -431,7 +435,7 @@ that help to solve the problem. \item No idea what the error message is trying to say? Google it! \item Read the program line by line and understand what each line is doing. -\item Use \code{disp} to print out relevant information on the command +\item Use \code{disp()} to print out relevant information on the command line and compare the output with your expectations. Do this step by step and start at the beginning. \item Use the \matlab{} debugger to stop execution of the code at a diff --git a/header.tex b/header.tex index d626757..d33c7ac 100644 --- a/header.tex +++ b/header.tex @@ -217,7 +217,7 @@ % the english index. \newcommand{\enterm}[2][]{\textit{#2}\ifthenelse{\equal{#1}{}}{\protect\sindex[enterm]{#2}}{\protect\sindex[enterm]{#1}}} -% \endeterm[english index entry]{}{} +% \entermde[english index entry]{}{} % typeset the english term in italics and add it (or the first % optional argument) to the english index. In addition add the german % index entry to the german index without printing it. @@ -270,7 +270,7 @@ \newcommand{\pythonfun}[1]{(\tr{\python-function}{\python-Funktion} \varcode{#1})\protect\sindex[pcode]{#1}} % typeset '(matlab-function #1)' and add the function to the matlab index: -\newcommand{\matlabfun}[1]{(\tr{\matlab-function}{\matlab-Funktion} \varcode{#1})\protect\sindex[mcode]{#1}} +\newcommand{\matlabfun}[1]{(function \varcode{#1})\protect\sindex[mcode]{#1}} %%%%% shortquote and widequote commands: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% diff --git a/likelihood/lecture/likelihood.tex b/likelihood/lecture/likelihood.tex index 1994a8b..dcaafaa 100644 --- a/likelihood/lecture/likelihood.tex +++ b/likelihood/lecture/likelihood.tex @@ -26,15 +26,16 @@ parameters $\theta$. This could be the normal distribution defined by the mean $\mu$ and the standard deviation $\sigma$ as parameters $\theta$. If the $n$ independent observations of $x_1, x_2, \ldots x_n$ originate from the same probability density -distribution (they are \enterm{i.i.d.} independent and identically -distributed) then the conditional probability $p(x_1,x_2, \ldots +distribution (they are \enterm[i.i.d.|see{independent and identically +distributed}]{i.i.d.}, \enterm{independent and identically +distributed}) then the conditional probability $p(x_1,x_2, \ldots x_n|\theta)$ of observing $x_1, x_2, \ldots x_n$ given a specific $\theta$ is given by \begin{equation} p(x_1,x_2, \ldots x_n|\theta) = p(x_1|\theta) \cdot p(x_2|\theta) \ldots p(x_n|\theta) = \prod_{i=1}^n p(x_i|\theta) \; . \end{equation} -Vice versa, the \enterm{likelihood} of the parameters $\theta$ +Vice versa, the \entermde{Likelihood}{likelihood} of the parameters $\theta$ given the observed data $x_1, x_2, \ldots x_n$ is \begin{equation} {\cal L}(\theta|x_1,x_2, \ldots x_n) = p(x_1,x_2, \ldots x_n|\theta) \; . 
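A minimal numerical sketch of evaluating this product for normally distributed observations with known standard deviation (all names and values below are made up for illustration):

\begin{lstlisting}[caption={Sketch: likelihood of candidate parameter values for normally distributed data.}]
sigma = 2.0;                          % standard deviation, assumed known
x = sigma * randn(100, 1) + 3.0;      % 100 observations, true mean is 3
mus = -10:0.01:10;                    % candidate values of the parameter mu
lik = zeros(size(mus));
for i = 1:length(mus)
    % normal probability density of each observation given mu:
    p = exp(-(x - mus(i)).^2 / (2 * sigma^2)) / sqrt(2 * pi * sigma^2);
    lik(i) = prod(p);                 % product over the independent x_i
end
[~, imax] = max(lik);
mu_best = mus(imax)                   % maximum-likelihood estimate, close to mean(x)
\end{lstlisting}

For larger data sets this product quickly under- or overflows numerically, which is one reason for working with the logarithm of the likelihood introduced next.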
@@ -57,7 +58,7 @@ The position of a function's maximum does not change when the values of the function are transformed by a strictly monotonously rising function such as the logarithm. For numerical and reasons that we will discuss below, we commonly search for the maximum of the logarithm of -the likelihood (\enterm{log-likelihood}): +the likelihood (\entermde[likelihood!log-]{Likelihood!Log-}{log-likelihood}): \begin{eqnarray} \theta_{mle} & = & \text{argmax}_{\theta}\; {\cal L}(\theta|x_1,x_2, \ldots x_n) \nonumber \\ @@ -136,9 +137,10 @@ from the data. For non-Gaussian distributions (e.g. a Gamma-distribution), however, such simple analytical expressions for the parameters of the distribution do not exist, e.g. the shape parameter of a -\enterm{Gamma-distribution}. How do we fit such a distribution to -some data? That is, how should we compute the values of the parameters -of the distribution, given the data? +\entermde[distribution!Gamma-]{Verteilung!Gamma-}{Gamma-distribution}. How +do we fit such a distribution to some data? That is, how should we +compute the values of the parameters of the distribution, given the +data? A first guess could be to fit the probability density function by minimization of the squared difference to a histogram of the measured @@ -289,10 +291,10 @@ out of \eqnref{mleslope} and we get To see what this expression is, we need to standardize the data. We make the data mean free and normalize them to their standard deviation, i.e. $x \mapsto (x - \bar x)/\sigma_x$. The resulting -numbers are also called \enterm[z-values]{$z$-values} or $z$-scores and they -have the property $\bar x = 0$ and $\sigma_x = 1$. $z$-scores are -often used in Biology to make quantities that differ in their units -comparable. For standardized data the variance +numbers are also called \entermde[z-values]{z-Wert}{$z$-values} or +$z$-scores and they have the property $\bar x = 0$ and $\sigma_x = +1$. $z$-scores are often used in Biology to make quantities that +differ in their units comparable. For standardized data the variance \[ \sigma_x^2 = \frac{1}{n} \sum_{i=1}^n (x_i - \bar x)^2 = \frac{1}{n} \sum_{i=1}^n x_i^2 = 1 \] is given by the mean squared data and equals one. The covariance between $x$ and $y$ also simplifies to diff --git a/plotting/lecture/plotting.tex b/plotting/lecture/plotting.tex index e3d3280..ae531ba 100644 --- a/plotting/lecture/plotting.tex +++ b/plotting/lecture/plotting.tex @@ -112,16 +112,15 @@ the missing information ourselves. Thus, we need a second variable that contains the respective \varcode{x} values. The length of \varcode{x} and \varcode{y} must be the same otherwise the later call of the \varcode{plot} function will raise an error. The respective -call will expand to \code[plot()]{plot(x, y)}. The x-axis will be now -be scaled from the minimum in \varcode{x} to the maximum of -\varcode{x} and by default it will be plotted as a line plot with a -solid blue line of the linewidth 1pt. A second plot that is added to the -figure will be plotted in red using the same settings. The -order of the used colors depends on the \enterm{colormap} settings -which can be adjusted to personal taste or -need. Table\,\ref{plotlinestyles} shows some predefined values that -can be chosen for the line style, the marker, or the color. For -additional options consult the help. +call will expand to \code[plot()]{plot(x, y)}. 
The x-axis will now be +scaled from the minimum in \varcode{x} to the maximum of \varcode{x} +and by default it will be plotted as a line plot with a solid blue +line of the linewidth 1pt. A second plot that is added to the figure +will be plotted in red using the same settings. The order of the used +colors depends on the \enterm{colormap} settings which can be adjusted +to personal taste or need. Table\,\ref{plotlinestyles} shows some +predefined values that can be chosen for the line style, the marker, +or the color. For additional options consult the help. \begin{table}[htp] \titlecaption{Predefined line styles (left), colors (center) and @@ -184,8 +183,8 @@ chosen. \subsection{Changing the axes properties} The first thing a plot needs are axis labels with correct units. By -calling the functions \code[xlabel]{xlabel('Time [ms]')} and -\code[ylabel]{ylabel('Voltage [mV]')} these can be set. By default the +calling the functions \code[xlabel()]{xlabel('Time [ms]')} and +\code[ylabel()]{ylabel('Voltage [mV]')} these can be set. By default the axes will be scaled to show the full extent of the data. The extremes will be selected as the closest integer for small values or the next full multiple of tens, hundreds, thousands, etc.\ depending on the @@ -196,8 +195,8 @@ functions expect a single argument, that is a 2-element vector containing the minimum and maximum value. Table\,\ref{plotaxisprops} lists some of the commonly adjusted properties of an axis. To set these properties, we need to have the axes object which can either be -stored in a variable when calling \varcode{plot} (\code{axes = - plot(x,y);}) or can be retrieved using the \code[gca]{gca} function +stored in a variable when calling \varcode{plot} (\varcode{axes = + plot(x,y);}) or can be retrieved using the \code{gca()} function (gca stands for ``get current axes''). Changing the properties of the axes object will update the plot (listing\,\ref{niceplotlisting}). @@ -253,8 +252,8 @@ and the placement of the axes on the paper. Table\,\ref{plotfigureprops} lists commonly used properties. For a complete reference check the help. To change the figure's appearance, we need to change the properties of the figure -object which can be retrieved during creation of the figure (\code{fig - = figure();}) or by using the \code{gcf} (``get current figure'') +object which can be retrieved during creation of the figure (\code[figure()]{fig + = figure();}) or by using the \code{gcf()} (``get current figure'') command. The script shown in the listing\,\ref{niceplotlisting} exemplifies @@ -334,10 +333,10 @@ the last one defines the output format (box\,\ref{graphicsformatbox}). properties could be read and set using the functions \code[get()]{get} and \code[set()]{set}. The first argument these functions expect are valid figure or axis \emph{handles} which were - returned by the \code{figure} and \code{plot} functions, or could be - retrieved using \code[gcf()]{gcf} or \code[gca()]{gca} for the + returned by the \code{figure()} and \code{plot()} functions, or could be + retrieved using \code{gcf()} or \code{gca()} for the current figure or axis handle, respectively. Subsequent arguments - passed to \code{set} are pairs of a property's name and the desired + passed to \code{set()} are pairs of a property's name and the desired value. 
\begin{lstlisting}[caption={Using set to change figure and axis properties.}]
frequency = 5; % frequency of the sine wave in Hz
@@ -351,8 +350,8 @@ the last one defines the output format (box\,\ref{graphicsformatbox}).
  set(figure_handle, 'PaperSize', [5.5, 5.5], 'PaperUnit', 'centimeters', ...
                     'PaperPosition', [0, 0, 5.5, 5.5]);
\end{lstlisting}
- With newer versions the handles returned by \varcode{gcf} and
- \varcode{gca} are ``objects'' and setting properties became much
+ With newer versions the handles returned by \code{gcf()} and
+ \code{gca()} are ``objects'' and setting properties became much
 easier as it is used throughout this chapter. For downward
 compatibility with older versions set and get still work in current
 versions of \matlab{}.
@@ -371,7 +370,7 @@ For some types of plots we present examples in the following sections.

\subsection{Scatter}
For displaying events or pairs of x-y coordinates the standard line
-plot is not optimal. Rather, we use \code[scatter()]{scatter} for this
+plot is not optimal. Rather, we use \code{scatter()} for this
purpose. For example, we have a number of measurements of a system's
response to a certain stimulus intensity. There is no dependency
between the data points, drawing them with a line-plot would be
@@ -417,8 +416,8 @@ A very common scenario is to combine several plots in the same
figure. To do this we create so-called subplots
figures\,\ref{regularsubplotsfig},\,\ref{irregularsubplotsfig}. The
\code[subplot()]{subplot()} command allows to place multiple axes onto
-a single sheet of paper. Generally, \varcode{subplot} expects three argument
-defining the number of rows, column, and the currently active
+a single sheet of paper. Generally, \code{subplot()} expects three
+arguments defining the number of rows, columns, and the currently active
plot. The currently active plot number starts with 1 and goes up to
$rows \cdot columns$ (numbers in the subplots in
figures\,\ref{regularsubplotsfig}, \ref{irregularsubplotsfig}).
@@ -439,7 +438,7 @@ figures\,\ref{regularsubplotsfig}, \ref{irregularsubplotsfig}).
By default, all subplots have the same size, if something else is
desired, e.g.\ one subplot should span a whole row, while two others
are smaller and should be placed side by side in the same row, the
-third argument of \varcode{subplot} can be a vector or numbers that
+third argument of \code{subplot()} can be a vector of numbers that
should be joined. These have, of course, to be adjacent numbers
(\figref{irregularsubplotsfig},
listing\,\ref{irregularsubplotslisting}).
@@ -457,7 +456,7 @@ columns, need to be used in a plot. If you want to create something
more elaborate, or have more spacing between the subplots one can
create a grid with larger numbers of columns and rows, and specify the
used cells of the grid by passing a vector as the third argument to
-\varcode{subplot}.
+\code{subplot()}.

\lstinputlisting[caption={Script for creating subplots of different
    sizes \figref{irregularsubplotsfig}.},
@@ -498,12 +497,12 @@ more apt. Accordingly, four arguments are needed (line 12 in listing
\ref{errorbarlisting}). The first two arguments are the same, the next
two represent the positive and negative deflections.

-By default the \code{errorbar} function does not draw a marker. In the
+By default the \code{errorbar()} function does not draw a marker. In the
examples shown here we provide extra arguments to define that a circle
is used for that purpose. The line connecting the average values can
be removed by passing additional arguments.
The properties of the errorbars themselves (linestyle, linewidth, capsize, etc.) can be -changed by taking the return argument of \code{errorbar} and changing +changed by taking the return argument of \code{errorbar()} and changing its properties. See the \matlab{} help for more information. \begin{figure}[ht] @@ -530,18 +529,18 @@ areas instead of errorbars: In case you have a lot of data points with respective errorbars such that they would merge in the figure it is cleaner and probably easier to read and handle if one uses an error area instead. To achieve an illustration as shown in -figure\,\ref{errorbarplot} C, we use the \code{fill} command in +figure\,\ref{errorbarplot} C, we use the \code{fill()} command in combination with a standard line plot. The original purpose of -\code{fill} is to draw a filled polygon. We hence have to provide it +\code{fill()} is to draw a filled polygon. We hence have to provide it with the vertex points of the polygon. For each x-value we now have two y-values (average minus error and average plus error). Further, we want the vertices to be connected in a defined order. One can achieve this by going back and forth on the x-axis; we append a reversed -version of the x-values to the original x-values using \code{cat} and -\code{fliplr} for concatenation and inversion, respectively (line 3 in +version of the x-values to the original x-values using \code{cat()} and +\code{fliplr()} for concatenation and inversion, respectively (line 3 in listing \ref{errorbarlisting2}; Depending on the layout of your data you may need concatenate along a different dimension of the data and -use \code{flipud} instead). The y-coordinates of the polygon vertices +use \code{flipud()} instead). The y-coordinates of the polygon vertices are concatenated in a similar way (line 4). In the example shown here we accept the polygon object that is returned by fill (variable p) and use it to change a few properties of the polygon. The \emph{FaceAlpha} @@ -561,9 +560,9 @@ connecting the average values (line 12). The \code[text()]{text()} or \code[annotation()]{annotation()} are used for highlighting certain parts of a plot or simply adding an annotation that does not fit or does not belong into the legend. -While \varcode{text} simply prints out the given text string at the +While \code{text()} simply prints out the given text string at the defined position (for example line in -listing\,\ref{regularsubplotlisting}) the \varcode{annotation} +listing\,\ref{regularsubplotlisting}) the \code{annotation()} function allows to add some more advanced highlights like arrows, lines, ellipses, or rectangles. Figure\,\ref{annotationsplot} shows some examples, the respective code can be found in @@ -583,9 +582,9 @@ listing\,\ref{annotationsplotlisting}. For more options consult the \begin{important}[Positions in data or figure coordinates.] A very confusing pitfall are the different coordinate systems used - by \varcode{text} and \varcode{annotation}. While \varcode{text} + by \varcode{text()} and \varcode{annotation()}. While \varcode{text()} expects the positions to be in data coordinates, i.e.\,in the limits - of the x- and y-axis, \varcode{annotation} requires the positions to + of the x- and y-axis, \varcode{annotation()} requires the positions to be given in normalized figure coordinates. Normalized means that the width and height of the figure are expressed by numbers in the range 0 to 1. The bottom/left corner then has the coordinates $(0,0)$ and @@ -624,9 +623,9 @@ Lissajous figure. 
The basic steps are: is created and opened for writing. This also implies that is has to be closed after the whole process (line 31). \item For each frame of the video, we plot the appropriate data (we - use \code[scatter]{scatter} for this purpose, line 20) and ``grab'' + use \code{scatter()} for this purpose, line 20) and ``grab'' the frame (line 28). Grabbing is similar to making a screenshot of - the figure. The \code{drawnow}{drawnow} command (line 27) is used to + the figure. The \code{drawnow()} command (line 27) is used to stop the excution of the for loop until the drawing process is finished. \item Write the frame to file (line 29). diff --git a/pointprocesses/lecture/pointprocesses.tex b/pointprocesses/lecture/pointprocesses.tex index a8fae40..e658f8f 100644 --- a/pointprocesses/lecture/pointprocesses.tex +++ b/pointprocesses/lecture/pointprocesses.tex @@ -73,10 +73,10 @@ number of observed events within a certain time window $n_i$ (\figref{pointprocessscetchfig}). \begin{exercise}{rasterplot.m}{} - Implement a function \code{rasterplot()} that displays the times of - action potentials within the first \code{tmax} seconds in a raster + Implement a function \varcode{rasterplot()} that displays the times of + action potentials within the first \varcode{tmax} seconds in a raster plot. The spike times (in seconds) recorded in the individual trials - are stored as vectors of times within a \codeterm{cell-array}. + are stored as vectors of times within a \codeterm{cell array}. \end{exercise} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% @@ -95,10 +95,10 @@ describing the statistics of stochastic real-valued variables: \end{figure} \begin{exercise}{isis.m}{} - Implement a function \code{isis()} that calculates the interspike + Implement a function \varcode{isis()} that calculates the interspike intervals from several spike trains. The function should return a single vector of intervals. The spike times (in seconds) of each - trial are stored as vectors within a \codeterm{cell-array}. + trial are stored as vectors within a cell-array. \end{exercise} %\subsection{First order interval statistics} @@ -117,7 +117,7 @@ describing the statistics of stochastic real-valued variables: \end{itemize} \begin{exercise}{isihist.m}{} - Implement a function \code{isiHist()} that calculates the normalized + Implement a function \varcode{isiHist()} that calculates the normalized interspike interval histogram. The function should take two input arguments; (i) a vector of interspike intervals and (ii) the width of the bins used for the histogram. It further returns the @@ -126,7 +126,7 @@ describing the statistics of stochastic real-valued variables: \begin{exercise}{plotisihist.m}{} Implement a function that takes the return values of - \code{isiHist()} as input arguments and then plots the data. The + \varcode{isiHist()} as input arguments and then plots the data. The plot should show the histogram with the x-axis scaled to milliseconds and should be annotated with the average ISI, the standard deviation and the coefficient of variation. @@ -167,7 +167,7 @@ $\rho_k$ is usually plotted against the lag $k$ with itself and is always 1. \begin{exercise}{isiserialcorr.m}{} - Implement a function \code{isiserialcorr()} that takes a vector of + Implement a function \varcode{isiserialcorr()} that takes a vector of interspike intervals as input argument and calculates the serial correlation. The function should further plot the serial correlation. 
\pagebreak[4] @@ -213,12 +213,12 @@ time interval , \determ{Feuerrate}) that is given in Hertz % \end{figure} \begin{exercise}{counthist.m}{} - Implement a function \code{counthist()} that calculates and plots + Implement a function \varcode{counthist()} that calculates and plots the distribution of spike counts observed in a certain time window. The function should take two input arguments: (i) a - \codeterm{cell-array} of vectors containing the spike times in - seconds observed in a number of trials, and (ii) the duration of the - time window that is used to evaluate the counts.\pagebreak[4] + cell-array of vectors containing the spike times in seconds observed + in a number of trials, and (ii) the duration of the time window that + is used to evaluate the counts.\pagebreak[4] \end{exercise} @@ -244,7 +244,7 @@ In an \enterm[Poisson process!inhomogeneous]{inhomogeneous Poisson \lambda(t)$. \begin{exercise}{poissonspikes.m}{} - Implement a function \code{poissonspikes()} that uses a homogeneous + Implement a function \varcode{poissonspikes()} that uses a homogeneous Poisson process to generate events at a given rate for a certain duration and a number of trials. The rate should be given in Hertz and the duration of the trials is given in seconds. The function @@ -293,7 +293,7 @@ The homogeneous Poisson process has the following properties: \end{itemize} \begin{exercise}{hompoissonspikes.m}{} - Implement a function \code{hompoissonspikes()} that uses a + Implement a function \varcode{hompoissonspikes()} that uses a homogeneous Poisson process to generate spike events at a given rate for a certain duration and a number of trials. The rate should be given in Hertz and the duration of the trials is given in @@ -422,7 +422,7 @@ potentials (\figref{binpsthfig} top). The resulting histogram is then normalized with the bin width $W$ to yield the firing rate shown in the bottom trace of figure \ref{binpsthfig}. The above sketched process is equivalent to estimating the probability density. It is -possible to estimate the PSTH using the \code{hist()} method +possible to estimate the PSTH using the \code{hist()} function \sindex[term]{Feuerrate!Binningmethode} The estimated firing rate is valid for the total duration of each diff --git a/programming/lecture/programming.tex b/programming/lecture/programming.tex index 3e27b35..1137832 100644 --- a/programming/lecture/programming.tex +++ b/programming/lecture/programming.tex @@ -112,7 +112,7 @@ x y z \begin{important}[Naming conventions] There are a few rules regarding variable names. \matlab{} is - case-sensitive, i.e. \code{x} and \code{X} are two different + case-sensitive, i.e. \varcode{x} and \varcode{X} are two different names. Names must begin with an alphabetic character. German (or other) umlauts, special characters and spaces are forbidden in variable names. @@ -689,9 +689,11 @@ then compare it to the elements on each page, and so on. An alternative way is to make use of the so called \emph{linear indexing} in which each element of the matrix is addressed by a single number. The linear index thus ranges from 1 to -\code{numel(matrix)}. The linear index increases first along the 1st, -2nd, 3rd etc. dimension (figure~\ref{matrixlinearindexingfig}). It is -not as intuitive since one would need to know the shape of the matrix and perform a remapping, but can be really helpful +\code[numel()]{numel(matrix)}. The linear index increases first along +the 1st, 2nd, 3rd etc. dimension +(figure~\ref{matrixlinearindexingfig}). 
It is not as intuitive since +one would need to know the shape of the matrix and perform a +remapping, but can be really helpful (listing~\ref{matrixLinearIndexing}). @@ -882,10 +884,11 @@ table~\ref{logicaloperators}) which are introduced in the following sections. \subsection{Relational operators} -With \codeterm[Operator!relational]{relational operators} (table~\ref{relationaloperators}) -we can ask questions such as: ''Is the value of variable \code{a} -larger than the value of \code{b}?'' or ``Is the value in \code{a} -equal to the one stored in variable \code{b}?''. +With \codeterm[Operator!relational]{relational operators} +(table~\ref{relationaloperators}) we can ask questions such as: ''Is +the value of variable \varcode{a} larger than the value of +\varcode{b}?'' or ``Is the value in \varcode{a} equal to the one +stored in variable \varcode{b}?''. \begin{table}[h!] \titlecaption{\label{relationaloperators} @@ -930,13 +933,13 @@ Testing the relations between numbers and scalar variables is straight forward. When comparing vectors, the relational operator will be applied element-wise and compare the respective elements of the left-hand-side and right-hand-side vectors. Note: vectors must have -the same length and orientation. The result of \code{[2 0 0 5 0] == [1 +the same length and orientation. The result of \varcode{[2 0 0 5 0] == [1 0 3 2 0]'} in which the second vector is transposed to give a column vector is a matrix! \subsection{Logical operators} With the relational operators we could for example test whether a -number is greater than a certain threshold (\code{x > 0.25}). But what +number is greater than a certain threshold (\varcode{x > 0.25}). But what if we wanted to check whether the number falls into the range greater than 0.25 but less than 0.75? Numbers that fall into this range must satisfy the one and the other condition. With @@ -1068,13 +1071,13 @@ values stored in a vector or matrix. It is very powerful and, once understood, very intuitive. The basic concept is that applying a Boolean operation on a vector -results in a \code{logical} vector of the same size (see +results in a \codeterm{logical} vector of the same size (see listing~\ref{logicaldatatype}). This logical vector is then used to select only those values for which the logical vector is true. Line 14 in listing~\ref{logicalindexing1} can be read: ``Select all those elements of \varcode{x} where the Boolean expression \varcode{x < 0} evaluates to true and store the result in the variable -\emph{x\_smaller\_zero}''. +\varcode{x\_smaller\_zero}''. \begin{lstlisting}[caption={Logical indexing.}, label=logicalindexing1] >> x = randn(1, 6) % a vector with 6 random numbers @@ -1154,13 +1157,14 @@ segment of data of a certain time span (the stimulus was on, data and metadata in a single variable. \textbf{Cell arrays} Arrays of variables that contain different - types. Unlike structures, the entries of a \codeterm{Cell array} are - not named. Indexing in \codeterm{Cell arrays} requires a special - operator the \code{\{\}}. \matlab{} uses \codeterm{Cell arrays} for - example when strings of different lengths should be stored in the - same variable: \varcode{months = \{'Januar', 'February', 'March', - 'April', 'May', 'Jun'\};}. Note the curly braces that are used to - create the array and are also used for indexing. + types. Unlike structures, the entries of a \codeterm{cell array} are + not named. Indexing in \codeterm[cell array]{cell arrays} requires a + special operator the \code{\{\}}. 
\matlab{} uses \codeterm[cell + array]{cell arrays} for example when strings of different lengths + should be stored in the same variable: \varcode{months = \{'Januar', + 'February', 'March', 'April', 'May', 'Jun'\};}. Note the curly + braces that are used to create the array and are also used for + indexing. \textbf{Tables} Tabular structure that allows to have columns of varying type combined with a header (much like a spreadsheet). @@ -1170,8 +1174,8 @@ segment of data of a certain time span (the stimulus was on, irregular intervals togehter with the measurement time in a single variable. Without the \codeterm{Timetable} data type at least two variables (one storing the time, the other the measurement) would be - required. \codeterm{Timetables} offer specific convenience functions - to work with timestamps. + required. \codeterm[Timetable]{Timetables} offer specific + convenience functions to work with timestamps. \textbf{Maps} In a \codeterm{map} a \codeterm{value} is associated with an arbitrary \codeterm{key}. The \codeterm{key} is not @@ -1246,7 +1250,7 @@ All imperative programming languages offer a solution: the loop. It is used whenever the same commands have to be repeated. -\subsubsection{The \code{for} --- loop} +\subsubsection{The \varcode{for} --- loop} The most common type of loop is the \codeterm{for-loop}. It consists of a \codeterm[Loop!head]{head} and the \codeterm[Loop!body]{body}. The head defines how often the code in the @@ -1258,7 +1262,7 @@ next value of this vector. In the body of the loop any code can be executed which may or may not use the running variable for a certain purpose. The \code{for} loop is closed with the keyword \code{end}. Listing~\ref{looplisting} shows a simple version of such a -\code{for} loop. +\codeterm{for-loop}. \begin{lstlisting}[caption={Example of a \varcode{for}-loop.}, label=looplisting] >> for x = 1:3 % head @@ -1273,15 +1277,15 @@ purpose. The \code{for} loop is closed with the keyword \begin{exercise}{factorialLoop.m}{factorialLoop.out} - Can we solve the factorial with a for-loop? Implement a for loop that - calculates the factorial of a number \varcode{n}. + Can we solve the factorial with a \varcode{for}-loop? Implement a + for loop that calculates the factorial of a number \varcode{n}. \end{exercise} \subsubsection{The \varcode{while} --- loop} -The \code{while}--loop is the second type of loop that is available in -almost all programming languages. Other, than the \code{for} -- loop, +The \codeterm{while-loop} is the second type of loop that is available in +almost all programming languages. Other, than the \codeterm{for-loop}, that iterates with the running variable over a vector, the while loop uses a Boolean expression to determine when to execute the code in it's body. The head of the loop starts with the keyword \code{while} @@ -1289,22 +1293,22 @@ that is followed by a Boolean expression. If this can be evaluated to true, the code in the body is executed. The loop is closed with an \code{end}. -\begin{lstlisting}[caption={Basic structure of a \code{while} loop.}, label=whileloop] +\begin{lstlisting}[caption={Basic structure of a \varcode{while} loop.}, label=whileloop] while x == true % head with a Boolean expression % execute this code if the expression yields true end \end{lstlisting} \begin{exercise}{factorialWhileLoop.m}{} - Implement the factorial of a number \varcode{n} using a \code{while} - -- loop. + Implement the factorial of a number \varcode{n} using a \varcode{while}-loop. 
\end{exercise} \begin{exercise}{neverendingWhile.m}{} - Implement a \code{while}--loop that is never-ending. Hint: the body - is executed as long as the Boolean expression in the head is - \code{true}. You can escape the loop by pressing \keycode{Ctrl+C}. + Implement a \varcode{while}-loop that is never-ending. Hint: the + body is executed as long as the Boolean expression in the head is + \varcode{true}. You can escape the loop by pressing + \keycode{Ctrl+C}. \end{exercise} @@ -1312,15 +1316,15 @@ end \begin{itemize} \item Both execute the code in the body iterative. -\item When using a \code{for} -- loop the body of the loop is executed +\item When using a \code{for}-loop the body of the loop is executed at least once (except when the vector used in the head is empty). -\item In a \code{while} -- loop, the body is not necessarily +\item In a \code{while}-loop, the body is not necessarily executed. It is entered only if the Boolean expression in the head yields true. -\item The \code{for} -- loop is best suited for cases in which the +\item The \code{for}-loop is best suited for cases in which the elements of a vector have to be used for a computation or when the number of iterations is known. -\item The \code{while} -- loop is best suited for cases when it is not +\item The \code{while}-loop is best suited for cases when it is not known in advance how often a certain piece of code has to be executed. \item Any problem that can be solved with one type can also be solve @@ -1336,8 +1340,8 @@ is only executed under a certain condition. \subsubsection{The \varcode{if} -- statement} The most prominent representative of the conditional expressions is -the \code{if} statement (sometimes also called \code{if - else} -statement). It constitutes a kind of branching point. It allows to +the \codeterm{if statement} (sometimes also called \codeterm{if - else +statement}). It constitutes a kind of branching point. It allows to control which branch of the code is executed. Again, the statement consists of the head and the body. The head @@ -1346,11 +1350,11 @@ that controls whether or not the body is entered. Optionally, the body can be either ended by the \code{end} keyword or followed by additional statements \code{elseif}, which allows to add another Boolean expression and to catch another condition or the \code{else} -the provide a default case. The last body of the \code{if - elseif - +the provide a default case. The last body of the \varcode{if - elseif - else} statement has to be finished with the \code{end} (listing~\ref{ifelselisting}). -\begin{lstlisting}[label=ifelselisting, caption={Structure of an \code{if} statement.}] +\begin{lstlisting}[label=ifelselisting, caption={Structure of an \varcode{if} statement.}] if x < y % head % body I, executed only if x < y elseif x > y @@ -1361,7 +1365,7 @@ end \end{lstlisting} \begin{exercise}{ifelse.m}{} - Draw a random number and check with an appropriate \code{if} + Draw a random number and check with an appropriate \varcode{if} statement whether it is \begin{enumerate} \item less than 0.5. @@ -1373,9 +1377,9 @@ end \subsubsection{The \varcode{switch} -- statement} -The \code{switch} statement is used whenever a set of conditions +The \codeterm{switch statement} is used whenever a set of conditions requires separate treatment. The statement is initialized with the -\code{switch} keyword that is followed by \emph{switch expression} (a +\code{switch} keyword that is followed by a \emph{switch expression} (a number or string). 
It is followed by a set of \emph{case expressions} which start with the keyword \code{case} followed by the condition that defines against which the \emph{switch expression} is tested. It @@ -1412,7 +1416,7 @@ end \end{itemize} -\subsection{The keywords \code{break} and \code{continue}} +\subsection{The keywords \varcode{break} and \varcode{continue}} Whenever the execution of a loop should be ended or if you want to skip the execution of the body under certain circumstances, one can @@ -1458,7 +1462,7 @@ end has passed between the calls of \code{tic} and \code{toc}. \begin{enumerate} - \item Use a \code{for} loop to select matching values. + \item Use a \varcode{for} loop to select matching values. \item Use logical indexing. \end{enumerate} \end{exercise} @@ -1486,12 +1490,12 @@ and executed line-by-line from top to bottom. \matlab{} knows three types of programs: \begin{enumerate} -\item \codeterm[Script]{Scripts} -\item \codeterm[Function]{Functions} -\item \codeterm[Object]{Objects} (not covered here) +\item \entermde[script]{Skripte}{Scripts} +\item \entermde[function]{Funktion}{Functions} +\item \entermde[Object]{Objekte}{Objects} (not covered here) \end{enumerate} -Programs are stored in so called \codeterm{m-files} +Programs are stored in so called \codeterm[m-file]{m-files} (e.g. \file{myProgram.m}). To use them they have to be \emph{called} from the command line or from within another program. Storing your code in programs increases the re-usability. So far we have used @@ -1507,13 +1511,13 @@ and if it now wants to read the previously stored variable, it will contain a different value than expected. Bugs like this are hard to find since each of the programs alone is perfectly fine and works as intended. A solution for this problem are the -\codeterm[Function]{functions}. +\entermde[function]{Funktion}{functions}. \subsection{Functions} Functions in \matlab{} are similar to mathematical functions \[ y = f(x) \] Here, the mathematical function has the name $f$ and it -has one \codeterm{argument} $x$ that is transformed into the +has one \entermde{Argument}{argument} $x$ that is transformed into the function's output value $y$. In \matlab{} the syntax of a function declaration is very similar (listing~\ref{functiondefinitionlisting}). @@ -1524,12 +1528,12 @@ function [y] = functionName(arg_1, arg_2) \end{lstlisting} The keyword \code{function} is followed by the return value(s) (it can -be a list \code{[]} of values), the function name and the +be a list \varcode{[]} of values), the function name and the argument(s). The function head is then followed by the function's body. A function is ended by and \code{end} (this is in fact optional but we will stick to this). Each function that should be directly used by the user (or called from other programs) should reside in an -individual \code{m-file} that has the same name as the function. By +individual \codeterm{m-file} that has the same name as the function. By using functions instead of scripts we gain several advantages: \begin{itemize} \item Encapsulation of program code that solves a certain task. It can @@ -1566,10 +1570,9 @@ function myFirstFunction() % function head end \end{lstlisting} -\code{myFirstFunction} (listing~\ref{badsinewavelisting}) is a +\varcode{myFirstFunction} (listing~\ref{badsinewavelisting}) is a prime-example of a bad function. There are several issues with it's design: - \begin{itemize} \item The function's name does not tell anything about it's purpose. 
\item The function is made for exactly one use-case (frequency of
@@ -1594,7 +1597,7 @@ defined:
  (e.g. the user of another program that calls a function)?
\end{enumerate}

-As indicated above the \code{myFirstFunction} does three things at
+As indicated above, the \varcode{myFirstFunction} does three things at
once, it seems natural that the task should be split up into three
parts. (i) Calculation of the individual sine waves defined by the
frequency and the amplitudes (ii) graphical display of the data and
@@ -1607,17 +1610,18 @@ define (i) how to name the function, (ii) which information it needs
(arguments), and (iii) what it should return to the caller.

\begin{enumerate}
-\item \codeterm[Function!Name]{Name}: the name should be descriptive
+\item \entermde[function!name]{Funktion!-sname}{Name}: the name should be descriptive
  of the function's purpose, i.e. the calculation of a sine wave. An
-  appropriate name might be \code{sinewave()}.
-\item \codeterm[Function!Arguments]{Arguments}: What information does
-  the function need to do the calculation? There are obviously the
-  frequency as well as the amplitude. Further we may want to be able
-  to define the duration of the sine wave and the temporal
-  resolution. We thus need four arguments which should also named to
-  describe their content: \code{amplitude, frequency, t\_max,} and
-  \code{t\_step} might be good names.
-\item \codeterm[Function!Return values]{Return values}: For a correct
+  appropriate name might be \varcode{sinewave()}.
+\item \entermde[function!arguments]{Funktion!-sargument}{Arguments}:
+  What information does the function need to do the calculation? There
+  are obviously the frequency as well as the amplitude. Further we may
+  want to be able to define the duration of the sine wave and the
+  temporal resolution. We thus need four arguments which should also be
+  named to describe their content: \varcode{amplitude},
+  \varcode{frequency}, \varcode{t\_max}, and \varcode{t\_step} might
+  be good names.
+\item \entermde[function!return values]{Funktion!R\"uckgabewerte}{Return values}: For a correct
  display of the data we need two vectors. The time, and the sine wave
  itself. We just need two return values: \varcode{time}, \varcode{sine}
\end{enumerate}
@@ -1657,11 +1661,11 @@ specification of the function:

\begin{enumerate}
\item It should plot a single sine wave. But it is not limited to sine
-  waves. It's name is thus: \code{plotFunction()}.
+  waves. Its name is thus: \varcode{plotFunction()}.
\item What information does it need to solve the task? The
-  to-be-plotted data as there is the values \code{y\_data} and the
-  corresponding \code{x\_data}. As we want to plot series of sine
-  waves we might want to have a \code{name} for each function to be
+  to-be-plotted data, that is, the values \varcode{y\_data} and the
+  corresponding \varcode{x\_data}. As we want to plot series of sine
+  waves we might want to have a \varcode{name} for each function to be
  displayed in the figure legend.
\item Are there any return values? No, this function is just made for
  plotting, we do not need to return anything.
@@ -1699,11 +1703,11 @@ Again, we need to specify what needs to be done:
  appropriate name for the script (that is the name of the m-file)
  might be \file{plotMultipleSinewaves.m}.
\item What information do we need? We need to define the
-  \code{frequency}, the range of \code{amplitudes}, the
-  \code{duration} of the sine waves, and the temporal resolution given
-  as the time between to points in time, i.e. the \code{stepsize}.
+  \varcode{frequency}, the range of \varcode{amplitudes}, the
+  \varcode{duration} of the sine waves, and the temporal resolution given
+  as the time between two points in time, i.e. the \varcode{stepsize}.
\item We then need to create an empty figure, and work through the
-  rang of \code{amplitudes}. We must not forget to switch \code{hold
+  range of \varcode{amplitudes}. We must not forget to switch \varcode{hold
    on} if we want to see all the sine waves in one plot.
\end{enumerate}

diff --git a/regression/lecture/regression.tex b/regression/lecture/regression.tex
index 74c1392..7e020ae 100644
--- a/regression/lecture/regression.tex
+++ b/regression/lecture/regression.tex
@@ -33,7 +33,7 @@ fitting approaches.
We will apply this method to find the combination of slope and
intercept that best describes the system.

-\section{The error function --- mean square error}
+\section{The error function --- mean squared error}

Before the optimization can be done we need to specify what is
considered an optimal fit. In our example we search the parameter
@@ -57,25 +57,23 @@ $\sum_{i=1}^N |y_i - y^{est}_i|$. The total error can only be small if
all deviations are indeed small no matter if they are above or below
the predicted line. Instead of the sum we could also ask for the
\emph{average}
-
\begin{equation}
  \label{meanabserror}
  f_{dist}(\{(x_i, y_i)\}|\{y^{est}_i\}) = \frac{1}{N} \sum_{i=1}^N |y_i - y^{est}_i|
\end{equation}
should be small. Commonly, the \enterm{mean squared distance} or
-\enterm{mean squared error}
+\enterm[square error!mean]{mean square error}
(\determ[quadratischer Fehler!mittlerer]{mittlerer quadratischer
  Fehler})
\begin{equation}
  \label{meansquarederror}
  f_{mse}(\{(x_i, y_i)\}|\{y^{est}_i\}) = \frac{1}{N} \sum_{i=1}^N (y_i - y^{est}_i)^2
\end{equation}
-
is used (\figref{leastsquareerrorfig}). Similar to the absolute
distance, the square of the error ($(y_i - y_i^{est})^2$) is always
positive and error values do not cancel out. The square further punishes
large deviations.

\begin{exercise}{meanSquareError.m}{}\label{mseexercise}%
-  Implement a function \code{meanSquareError()}, that calculates the
+  Implement a function \varcode{meanSquareError()} that calculates the
  \emph{mean square distance} between a vector of observations ($y$)
  and respective predictions ($y^{est}$).
\end{exercise}

@@ -84,18 +82,19 @@ large deviations.
\section{\tr{Objective function}{Zielfunktion}}

$f_{cost}(\{(x_i, y_i)\}|\{y^{est}_i\})$ is a so called
-\enterm{objective function} or \enterm{cost function}. We aim to adapt
-the model parameters to minimize the error (mean square error) and
-thus the \emph{objective function}. In Chapter~\ref{maximumlikelihoodchapter}
-we will show that the minimization of the mean square error is
-equivalent to maximizing the likelihood that the observations
-originate from the model (assuming a normal distribution of the data
-around the model prediction).
+\enterm{objective function} or \enterm{cost function}
+(\determ{Kostenfunktion}). We aim to adapt the model parameters to
+minimize the error (mean square error) and thus the \emph{objective
+  function}. In Chapter~\ref{maximumlikelihoodchapter} we will show
+that the minimization of the mean square error is equivalent to
+maximizing the likelihood that the observations originate from the
+model (assuming a normal distribution of the data around the model
+prediction).
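As a minimal illustration of this objective function (the data and parameter values below are made up), the mean squared error of a candidate line can be computed directly:

\begin{lstlisting}[caption={Sketch: mean squared error of a candidate straight line.}]
x = 0:0.1:10;                   % made-up x-values
y = 2.0 * x + randn(size(x));   % made-up noisy observations
m = 1.8; b = 0.5;               % candidate slope and intercept
y_est = m * x + b;              % prediction of the linear model
mse = mean((y - y_est).^2)      % mean squared error of this parameter pair
\end{lstlisting}

Evaluating such values over a grid of slopes and intercepts yields the error surface discussed below.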
 \begin{figure}[t] \includegraphics[width=1\textwidth]{linear_least_squares} \titlecaption{Estimating the \emph{mean square error}.} {The - deviation (\enterm{error}, orange) between the prediction (red + deviation (error, orange) between the prediction (red line) and the observations (blue dots) is calculated for each data point (left). Then the deviations are squared and the average is calculated (right).} @@ -119,11 +118,13 @@ Replacing $y^{est}$ with the linear equation (the model) in That is, the mean square error is given by the pairs $(x_i, y_i)$ and the parameters $m$ and $b$ of the linear equation. The optimization -process will not try to optimize $m$ and $b$ to lead to the smallest -error, the method of the \enterm{least square error}. +process tries to optimize $m$ and $b$ such that the error is +minimized, the method of the \enterm[square error!least]{least square + error} (\determ[quadratischer Fehler!kleinster]{Methode der + kleinsten Quadrate}). \begin{exercise}{lsqError.m}{} - Implement the objective function \code{lsqError()} that applies the + Implement the objective function \varcode{lsqError()} that applies the linear equation as a model. \begin{itemize} \item The function takes three arguments. The first is a 2-element @@ -131,7 +132,7 @@ error, the method of the \enterm{least square error}. \varcode{b}. The second is a vector of x-values, the third contains the measurements for each value of $x$, the respective $y$-values. \item The function returns the mean square error \eqnref{mseline}. - \item The function should call the function \code{meanSquareError()} + \item The function should call the function \varcode{meanSquareError()} defined in the previous exercise to calculate the error. \end{itemize} \end{exercise} @@ -165,7 +166,7 @@ third dimension is used to indicate the error value \varcode{y}). Implement a script \file{errorSurface.m} that calculates the mean square error between data and a linear model and illustrates the error surface using the \code{surf()} function - (consult the help to find out how to use \code{surf}.). + (consult the help to find out how to use \code{surf()}.). \end{exercise} By looking at the error surface we can directly see the position of @@ -257,7 +258,7 @@ way to the minimum of the objective function. The ball will always follow the steepest slope. Thus we need to figure out the direction of the steepest slope at the position of the ball. -The \enterm{gradient} (Box~\ref{partialderivativebox}) of the +The \entermde{Gradient}{gradient} (Box~\ref{partialderivativebox}) of the objective function is the vector \[ \nabla f_{cost}(m,b) = \left( \frac{\partial f(m,b)}{\partial m}, @@ -296,7 +297,7 @@ choose the opposite direction. \end{figure} \begin{exercise}{lsqGradient.m}{}\label{gradientexercise}% - Implement a function \code{lsqGradient()}, that takes the set of + Implement a function \varcode{lsqGradient()} that takes the set of parameters $(m, b)$ of the linear equation as a two-element vector and the $x$- and $y$-data as input arguments. The function should return the gradient at that position. @@ -316,8 +317,8 @@ choose the opposite direction. Finally, we are able to implement the optimization itself. By now it should be obvious why it is called the gradient descent method. All ingredients are already there. We need: 1. The error function -(\code{meanSquareError}), 2. 
the objective function +(\varcode{lsqError()}), and 3. the gradient (\varcode{lsqGradient()}). The algorithm of the gradient descent is: \begin{enumerate} diff --git a/scientificcomputing-script.tex b/scientificcomputing-script.tex index 35d78fc..65f1a22 100644 --- a/scientificcomputing-script.tex +++ b/scientificcomputing-script.tex @@ -9,6 +9,8 @@ \include{#1/lecture/#1}% } +%\includeonly{regression/lecture/regression} + %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \begin{document} diff --git a/statistics/lecture/statistics.tex b/statistics/lecture/statistics.tex index 642aa49..1af9ef6 100644 --- a/statistics/lecture/statistics.tex +++ b/statistics/lecture/statistics.tex @@ -7,17 +7,16 @@ Descriptive statistics characterizes data sets by means of a few measures. In addition to histograms that estimate the full distribution of the data, the following measures are used for characterizing univariate data: \begin{description} -\item[Location, central tendency] (``Lagema{\ss}e''): - arithmetic mean, median, mode. -\item[Spread, dispersion] (``Streuungsma{\ss}e''): variance, - standard deviation, inter-quartile range,\linebreak coefficient of variation - (``Variationskoeffizient''). -\item[Shape]: skewness (``Schiefe''), kurtosis (``W\"olbung''). +\item[Location, central tendency] (\determ{Lagema{\ss}e}): + \entermde[mean!arithmetic]{Mittel!arithmetisches}{arithmetic mean}, \entermde{Median}{median}, \enterm{mode}. +\item[Spread, dispersion] (\determ{Streuungsma{\ss}e}): \entermde{Varianz}{variance}, + \entermde{Standardabweichung}{standard deviation}, inter-quartile range,\linebreak \enterm{coefficient of variation} (\determ{Variationskoeffizient}). +\item[Shape]: \enterm{skewness} (\determ{Schiefe}), \enterm{kurtosis} (\determ{W\"olbung}). \end{description} For bivariate and multivariate data sets we can also analyse their \begin{description} -\item[Dependence, association] (``Zusammenhangsma{\ss}e''): Pearson's correlation coefficient, - Spearman's rank correlation coefficient. +\item[Dependence, association] (\determ{Zusammenhangsma{\ss}e}): \entermde[correlation!coefficient!Pearson's]{Korrelation!Pearson}{Pearson's correlation coefficient}, + \entermde[correlation!coefficient!Spearman's rank]{{Rangkorrelationskoeffizient!Spearman'scher}}{Spearman's rank correlation coefficient}. \end{description} The following is in no way a complete introduction to descriptive @@ -26,15 +25,16 @@ daily data-analysis problems. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \section{Mean, variance, and standard deviation} -The \enterm{arithmetic mean} is a measure of location. For $n$ data values -$x_i$ the arithmetic mean is computed by +The \entermde[mean!arithmetic]{Mittel!arithmetisches}{arithmetic mean} +is a measure of location. For $n$ data values $x_i$ the arithmetic +mean is computed by \[ \bar x = \langle x \rangle = \frac{1}{N}\sum_{i=1}^n x_i \; . \] This computation (summing up all elements of a vector and dividing by the length of the vector) is provided by the function \mcode{mean()}. The mean has the same unit as the data values. The dispersion of the data values around the mean is quantified by -their \enterm{variance} +their \entermde{Varianz}{variance} \[ \sigma^2_x = \langle (x-\langle x \rangle)^2 \rangle = \frac{1}{N}\sum_{i=1}^n (x_i - \bar x)^2 \; . \] The variance is computed by the function \mcode{var()}. 
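As a quick sanity check of these definitions, here is a small sketch with made-up data that computes the mean and the variance by hand and compares them to the built-in functions. Note that \mcode{var(x, 1)} normalizes by $N$ as in the definition above, whereas plain \mcode{var(x)} normalizes by $N-1$:

\begin{lstlisting}[caption={Mean and variance computed by hand on example data.}]
x = randn(100, 1);                      % some example data
mu = sum(x) / length(x);                % arithmetic mean, same as mean(x)
sigma2 = sum((x - mu).^2) / length(x);  % variance as defined above, same as var(x, 1)
% var(x) without the second argument divides by N-1 instead of N
\end{lstlisting}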
The unit of the variance is the unit of the data values squared. @@ -42,14 +42,15 @@ Therefore, variances cannot be compared to the mean or the data values themselves. In particular, variances cannot be used for plotting error bars along with the mean. -The standard deviation -\[ \sigma_x = \sqrt{\sigma^2_x} \; , \] -as computed by the function \mcode{std()}, however, has the same unit -as the data values and can (and should) be used to display the -dispersion of the data together with their mean. +In contrast to the variance, the +\entermde{Standardabweichung}{standard deviation} +\[ \sigma_x = \sqrt{\sigma^2_x} \; , \] +as computed by the function \mcode{std()} has the same unit as the +data values and can (and should) be used to display the dispersion of +the data together with their mean. The mean of a data set can be displayed by a bar-plot -\matlabfun{bar()}. Additional errorbars \matlabfun{errobar()} can be +\matlabfun{bar()}. Additional errorbars \matlabfun{errorbar()} can be used to illustrate the standard deviation of the data (\figref{displayunivariatedatafig} (2)). @@ -90,18 +91,18 @@ used to illustrate the standard deviation of the data identical with the mode.} \end{figure} -The \enterm{mode} is the most frequent value, i.e. the position of the maximum of the probability distribution. +The \enterm{mode} (\determ{Modus}) is the most frequent value, +i.e. the position of the maximum of the probability distribution. -The \enterm{median} separates a list of data values into two halves -such that one half of the data is not greater and the other half is -not smaller than the median (\figref{medianfig}). +The \entermde{Median}{median} separates a list of data values into two +halves such that one half of the data is not greater and the other +half is not smaller than the median (\figref{medianfig}). The +function \mcode{median()} computes the median. \begin{exercise}{mymedian.m}{} Write a function \varcode{mymedian()} that computes the median of a vector. \end{exercise} -\matlab{} provides the function \code{median()} for computing the median. - \begin{exercise}{checkmymedian.m}{} Write a script that tests whether your median function really returns a median above which are the same number of data than @@ -122,9 +123,9 @@ not smaller than the median (\figref{medianfig}). \end{figure} The distribution of data can be further characterized by the position -of its \enterm[quartile]{quartiles}. Neighboring quartiles are +of its \entermde[quartile]{Quartil}{quartiles}. Neighboring quartiles are separated by 25\,\% of the data (\figref{quartilefig}). -\enterm[percentile]{Percentiles} allow to characterize the +\entermde[percentile]{Perzentil}{Percentiles} allow to characterize the distribution of the data in more detail. The 3$^{\rm rd}$ quartile corresponds to the 75$^{\rm th}$ percentile, because 75\,\% of the data are smaller than the 3$^{\rm rd}$ quartile. @@ -147,11 +148,12 @@ data are smaller than the 3$^{\rm rd}$ quartile. % from a normal distribution.} % \end{figure} -\enterm[box-whisker plots]{Box-whisker plots} are commonly used to -visualize and compare the distribution of unimodal data. A box is -drawn around the median that extends from the 1$^{\rm st}$ to the -3$^{\rm rd}$ quartile. The whiskers mark the minimum and maximum value -of the data set (\figref{displayunivariatedatafig} (3)). +\entermde[box-whisker plots]{Box-Whisker-Plot}{Box-whisker plots}, or +\entermde{Box-Plot}{box plot} are commonly used to visualize and +compare the distribution of unimodal data. 
A box is drawn around the +median that extends from the 1$^{\rm st}$ to the 3$^{\rm rd}$ +quartile. The whiskers mark the minimum and maximum value of the data +set (\figref{displayunivariatedatafig} (3)). \begin{exercise}{univariatedata.m}{} Generate 40 normally distributed random numbers with a mean of 2 and @@ -170,13 +172,14 @@ of the data set (\figref{displayunivariatedatafig} (3)). % \end{exercise} \section{Distributions} -The distribution of values in a data set is estimated by histograms -(\figref{displayunivariatedatafig} (4)). +The \enterm{distribution} (\determ{Verteilung}) of values in a data +set is estimated by histograms (\figref{displayunivariatedatafig} +(4)). \subsection{Histograms} -\enterm[histogram]{Histograms} count the frequency $n_i$ of -$N=\sum_{i=1}^M n_i$ measurements in each of $M$ bins $i$ +\entermde[histogram]{Histogramm}{Histograms} count the frequency $n_i$ +of $N=\sum_{i=1}^M n_i$ measurements in each of $M$ bins $i$ (\figref{diehistogramsfig} left). The bins tile the data range usually into intervals of the same size. The width of the bins is called the bin width. The frequencies $n_i$ plotted against the @@ -194,13 +197,14 @@ categories $i$ is the \enterm{histogram}, or the \enterm{frequency \end{figure} Histograms are often used to estimate the -\enterm[probability!distribution]{probability distribution} of the -data values. +\enterm[probability!distribution]{probability distribution} +(\determ[Wahrscheinlichkeits!-verteilung]{Wahrscheinlichkeitsverteilung}) of the data values. \subsection{Probabilities} -In the frequentist interpretation of probability, the probability of -an event (e.g. getting a six when rolling a die) is the relative -occurrence of this event in the limit of a large number of trials. +In the frequentist interpretation of probability, the +\enterm{probability} (\determ{Wahrscheinlichkeit}) of an event +(e.g. getting a six when rolling a die) is the relative occurrence of +this event in the limit of a large number of trials. For a finite number of trials $N$ where the event $i$ occurred $n_i$ times, the probability $P_i$ of this event is estimated by @@ -212,15 +216,16 @@ the sum of the probabilities of all possible events is one: i.e. the probability of getting any event is one. -\subsection{Probability distributions of categorial data} +\subsection{Probability distributions of categorical data} -For categorial data values (e.g. the faces of a die (as integer -numbers or as colors)) a bin can be defined for each category $i$. -The histogram is normalized by the total number of measurements to -make it independent of the size of the data set -(\figref{diehistogramsfig}). After this normalization the height of -each histogram bar is an estimate of the probability $P_i$ of the -category $i$, i.e. of getting a data value in the $i$-th bin. +For \entermde[data!categorical]{Daten!kategorische}{categorical} data +values (e.g. the faces of a die (as integer numbers or as colors)) a +bin can be defined for each category $i$. The histogram is normalized +by the total number of measurements to make it independent of the size +of the data set (\figref{diehistogramsfig}). After this normalization +the height of each histogram bar is an estimate of the probability +$P_i$ of the category $i$, i.e. of getting a data value in the $i$-th +bin. \begin{exercise}{rollthedie.m}{} Write a function that simulates rolling a die $n$ times. @@ -236,12 +241,14 @@ category $i$, i.e. of getting a data value in the $i$-th bin. 
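To illustrate this normalization, here is a short sketch along the lines of the die-rolling exercise above (the variable names and the use of \code{hist()} are just one way to do it):

\begin{lstlisting}[caption={Estimating the probabilities of the six die faces from a normalized histogram.}]
n = 10000;
rolls = randi(6, n, 1);      % simulate n rolls of a fair die
counts = hist(rolls, 1:6);   % frequency n_i of each face i
P = counts / n;              % normalize: estimates the probabilities P_i
bar(1:6, P);                 % each bar should be close to 1/6
\end{lstlisting}

Increasing \varcode{n} makes the estimated probabilities converge towards the true value of $1/6$.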
 \subsection{Probability density functions} -In cases where we deal with data sets of measurements of a real -quantity (e.g. lengths of snakes, weights of elephants, times -between succeeding spikes) there is no natural bin width for computing -a histogram. In addition, the probability of measuring a data value that -equals exactly a specific real number like, e.g., 0.123456789 is zero, because -there are uncountable many real numbers. +In cases where we deal with +\entermde[data!continuous]{Daten!kontinuierliche}{continuous data} +(measurements of real-valued quantities, e.g. lengths of snakes, +weights of elephants, times between succeeding spikes) there is no +natural bin width for computing a histogram. In addition, the +probability of measuring a data value that equals exactly a specific +real number like, e.g., 0.123456789 is zero, because there are +uncountably many real numbers. We can only ask for the probability of getting a measurement value in some range. For example, we can ask for the probability $P(1.2