diff --git a/Makefile b/Makefile index a5054d5..a743d0a 100644 --- a/Makefile +++ b/Makefile @@ -22,9 +22,10 @@ $(BASENAME).pdf : $(BASENAME).tex header.tex $(SUBTEXS) splitindex $(BASENAME).idx index : - pdflatex -interaction=scrollmode $(BASENAME).tex + pdflatex $(BASENAME).tex splitindex $(BASENAME).idx - pdflatex -interaction=scrollmode $(BASENAME).tex | tee /dev/stderr | fgrep -q "Rerun to get cross-references right" && pdflatex $(BASENAME).tex || true + pdflatex $(BASENAME).tex + pdflatex $(BASENAME).tex again : pdflatex $(BASENAME).tex diff --git a/bootstrap/lecture/bootstrap.tex b/bootstrap/lecture/bootstrap.tex index f23f28d..b01b227 100644 --- a/bootstrap/lecture/bootstrap.tex +++ b/bootstrap/lecture/bootstrap.tex @@ -171,7 +171,7 @@ A good example for the application of a assessment of \entermde[correlation]{Korrelation}{correlations}. Given are measured pairs of data points $(x_i, y_i)$. By calculating the \entermde[correlation!correlation -coefficient]{Korrelation!Korrelationskoeffizient}{correlation +coefficient]{Korrelation!-skoeffizient}{correlation coefficient} we can quantify how strongly $y$ depends on $x$. The correlation coefficient alone, however, does not tell whether the correlation is significantly different from a random correlation. The diff --git a/codestyle/lecture/codestyle.tex b/codestyle/lecture/codestyle.tex index 9811d2f..73d5817 100644 --- a/codestyle/lecture/codestyle.tex +++ b/codestyle/lecture/codestyle.tex @@ -1,4 +1,4 @@ -\chapter{\tr{Code style}{Programmierstil}} +\chapter{Code style} \shortquote{Any code of your own that you haven't looked at for six or more months might as well have been written by someone @@ -33,7 +33,7 @@ by calling functions that work on the data and managing the results. Applying this structure makes it easy to understand the flow of the program but two questions remain: (i) How to organize the files on the file system and (ii) how to name them that the controlling -script is easily identified among the other \codeterm{m-files}. +script is easily identified among the other \codeterm[m-file]{m-files}. Upon installation \matlab{} creates a folder called \file{MATLAB} in the user space (Windows: My files, Linux: Documents, MacOS: @@ -43,7 +43,7 @@ moment. Of course, any other location can specified as well. Generally it is of great advantage to store related scripts and functions within the same folder on the hard drive. An easy approach is to create a project-specific folder structure that contains sub-folders for each -task (analysis) and to store all related \codeterm{m-files} +task (analysis) and to store all related \codeterm[m-file]{m-files} (screenshot \ref{fileorganizationfig}). In these task-related folders one may consider to create a further sub-folder to store results (created figures, result data). On the project level a single script @@ -307,8 +307,8 @@ and to briefly explain what they do. Whenever one feels tempted to do this, one could also consider to delegate the respective task to a function. In most cases this is preferable. -Not delegating the tasks leads to very long \codeterm{m-files} which -can be confusing. Sometimes such a code is called ``spaghetti +Not delegating the tasks leads to very long \codeterm[m-file]{m-files} +which can be confusing. Sometimes such a code is called ``spaghetti code''. It is high time to think about delegation of tasks to functions. @@ -323,17 +323,17 @@ functions. 
\end{important} \subsection{Local and nested functions} -Generally, functions live in their own \codeterm{m-files} that have -the same name as the function itself. Delegating tasks to functions -thus leads to a large set of \codeterm{m-files} which increases -complexity and may lead to confusion. If the delegated functionality -is used in multiple instances, it is still advisable to do so. On the -other hand, when the delegated functionality is only used within the -context of another function \matlab{} allows to define -\codeterm[function!local]{local functions} and -\codeterm[function!nested]{nested functions} within the same -file. Listing \ref{localfunctions} shows an example of a local -function definition. +Generally, functions live in their own \codeterm[m-file]{m-files} that +have the same name as the function itself. Delegating tasks to +functions thus leads to a large set of \codeterm[m-file]{m-files} +which increases complexity and may lead to confusion. If the delegated +functionality is used in multiple instances, it is still advisable to +do so. On the other hand, when the delegated functionality is only +used within the context of another function \matlab{} allows to define +\entermde[function!local]{Funktion!lokale}{local functions} and +\entermde[function!nested]{Funktion!verschachtelte}{nested functions} +within the same file. Listing \ref{localfunctions} shows an example of +a local function definition. \pagebreak[3] \lstinputlisting[label=localfunctions, caption={Example for local functions.}]{calculateSines.m} @@ -408,11 +408,12 @@ advisable to adhere to these. Repeated tasks should (to be read as must) be delegated to functions. In cases in which a function is only locally applied and not of more global interest across projects consider to define it as -\codeterm[function!local]{local function} or -\codeterm[function!nested]{nested function}. Taking care to increase -readability and comprehensibility pays off, even to the author! -\footnote{Reading tip: Robert C. Martin: \textit{Clean Code: A Handbook of - Agile Software Craftmanship}, Prentice Hall} +\entermde[function!local]{Funktion!lokale}{local function} or +\entermde[function!nested]{Funktion!verschachtelte}{nested + function}. Taking care to increase readability and comprehensibility +pays off, even to the author! \footnote{Reading tip: Robert + C. Martin: \textit{Clean Code: A Handbook of Agile Software + Craftmanship}, Prentice Hall} \shortquote{Programs must be written for people to read, and only incidentally for machines to execute.}{Abelson / Sussman} diff --git a/debugging/lecture/debugging.tex b/debugging/lecture/debugging.tex index e7d213e..6c70fab 100644 --- a/debugging/lecture/debugging.tex +++ b/debugging/lecture/debugging.tex @@ -49,13 +49,14 @@ obscure logical errors! Take care when using the \codeterm{try-catch \end{important} -\subsection{\codeterm{Syntax errors}}\label{syntax_error} -The most common and easiest to fix type of error. A syntax error -violates the rules (spelling and grammar) of the programming -language. For example every opening parenthesis must be matched by a -closing one or every \code{for} loop has to be closed by an -\code{end}. Usually, the respective error messages are clear and -the editor will point out and highlight most \codeterm{syntax error}s. +\subsection{Syntax errors}\label{syntax_error} +The most common and easiest to fix type of error. 
A
+\entermde[error!syntax]{Fehler!Syntax\~}{syntax error} violates the
+rules (spelling and grammar) of the programming language. For example
+every opening parenthesis must be matched by a closing one or every
+\code{for} loop has to be closed by an \code{end}. Usually, the
+respective error messages are clear and the editor will point out and
+highlight most syntax errors.

\begin{lstlisting}[label=syntaxerror, caption={Unbalanced parenthesis error.}]
>> mean(random_numbers
@@ -66,8 +67,9 @@ Did you mean:
>> mean(random_numbers)
\end{lstlisting}

-\subsection{\codeterm{Indexing error}}\label{index_error}
-Second on the list of common errors are the indexing errors. Usually
+\subsection{Indexing error}\label{index_error}
+Second on the list of common errors are the
+\entermde[error!indexing]{Fehler!Index\~}{indexing errors}. Usually
\matlab{} gives rather precise information about the cause, once you
know what they mean. Consider the following code.

@@ -111,14 +113,16 @@ to a number and uses this number to address the element in
\varcode{my\_array}. The \codeterm{char} has the ASCII code 65 and
thus the 65th element of \varcode{my\_array} is returned.

-\subsection{\codeterm{Assignment error}}
-Related to the Indexing error, an assignment error occurs when we want
-to write data into a variable, that does not fit into it. Listing
-\ref{assignmenterror} shows the simple case for 1-d data but, of
-course, it extents to n-dimensional data. The data that is to be
-filled into a matrix hat to fit in all dimensions. The command in line
-7 works due to the fact, that matlab automatically extends the matrix,
-if you assign values to a range outside its bounds.
+\subsection{Assignment error}
+Related to the indexing error, an
+\entermde[error!assignment]{Fehler!Zuweisungs\~}{assignment error}
+occurs when we want to write data into a variable that cannot hold
+it. Listing \ref{assignmenterror} shows the simple case for 1-d
+data but, of course, it extends to n-dimensional data. The data that
+is to be filled into a matrix has to fit in all dimensions. The
+command in line 7 works because \matlab{} automatically extends the
+matrix if you assign values to a range outside its bounds.

\begin{lstlisting}[label=assignmenterror, caption={Assignment errors.}]
>> a = zeros(1, 100);
@@ -133,21 +137,20 @@ ans =
   110     1
\end{lstlisting}

-\subsection{\codeterm{Dimension mismatch error}}
+\subsection{Dimension mismatch error}
Similarly, some arithmetic operations are only valid if the variables
fulfill some size constraints. Consider the following commands
(listing\,\ref{dimensionmismatch}). The first one (line 3) fails
-because we are trying to do al elementwise add on two vectors that
-have different lengths, respectively sizes. The matrix multiplication
-in line 6 also fails since for this operations to succeed the inner
-matrix dimensions must agree (for more information on the
-matrixmultiplication see box\,\ref{matrixmultiplication} in
-chapter\,\ref{programming}). The elementwise multiplication issued in
-line 10 fails for the same reason as the addition we tried
-earlier. Sometimes, however, things apparently work but the result may
-be surprising. The last operation in listing\,\ref{dimensionmismatch}
-does not throw an error but the result is something else than the
-expected elementwise multiplication.
+because we are trying to add two vectors of different lengths
+elementwise.
The matrix multiplication in line 6 also fails since for
+this operation to succeed the inner matrix dimensions must agree (for
+more information on matrix multiplication see
+box\,\ref{matrixmultiplication} in chapter\,\ref{programming}). The
+elementwise multiplication issued in line 10 fails for the same reason
+as the addition we tried earlier. Sometimes, however, things
+apparently work but the result may be surprising. The last operation
+in listing\,\ref{dimensionmismatch} does not throw an error but the
+result is something other than the expected elementwise multiplication.

% XXX Some arithmetic operations make size constraints, violating them leads to dimension mismatch errors.
\begin{lstlisting}[label=dimensionmismatch, caption={Dimension mismatch errors.}]
@@ -174,7 +177,8 @@ expected elementwise multiplication.
\section{Logical error}
Sometimes a program runs smoothly and terminates without any
complaint. This, however, does not necessarily mean that the program
-is correct. We may have made a \codeterm{logical error}. Logical
+is correct. We may have made a
+\entermde[error!logical]{Fehler!logischer}{logical error}. Logical
errors are hard to find, \matlab{} has no chance to detect such
errors since they do not violate the syntax or cause the throwing of
an error. Thus, we are on our own to find and fix the bug. There are a
@@ -283,7 +287,7 @@ validity.

Matlab offers a unit testing framework in which small scripts are
written that test the features of the program. We will follow the
example given in the \matlab{} help and assume that there is a
-function \code{rightTriangle} (listing\,\ref{trianglelisting}).
+function \varcode{rightTriangle()} (listing\,\ref{trianglelisting}).

% XXX Slightly more readable version of the example given in the \matlab{} help system. Note: The variable name for the angles have been capitalized in order to not override the matlab defined functions \code{alpha, beta,} and \code{gamma}.
\begin{lstlisting}[label=trianglelisting, caption={Example function for unit testing.}]
@@ -308,7 +312,7 @@ folder that follows the following rules.
\item The name of the script file must start or end with the word
  'test', which is case-insensitive.
\item Each unit test should be placed in a separate section/cell of the script.
-\item After the \code{\%\%} that defines the cell, a name for the
+\item After the \mcode{\%\%} that defines the cell, a name for the
  particular unit test may be given.
\end{enumerate}

@@ -328,11 +332,11 @@ Further there are a few things that are different in tests compared to normal sc
  tests.
\end{enumerate}

-The test script for the \code{rightTrianlge} function
+The test script for the \varcode{rightTriangle()} function
(listing\,\ref{trianglelisting}) may look like in
listing\,\ref{testscript}.

-\begin{lstlisting}[label=testscript, caption={Unit test for the \code{rightTriangle} function stored in an m-file testRightTriangle.m}]
+\begin{lstlisting}[label=testscript, caption={Unit test for the \varcode{rightTriangle()} function stored in an m-file testRightTriangle.m}]
tolerance = 1e-10;

% preconditions
@@ -372,7 +376,7 @@ assert(abs(approx - smallAngle) <= tolerance, 'Problem with small angle approxim

In a test script we can execute any code. The actual test whether or
not the results match our predictions is done using the
-\code{assert()}{assert} function. This function basically expects a
+\code{assert()} function.
This function basically expects a boolean value and if this is not true, it raises an error that, in the context of the test does not lead to a termination of the program. In the tests above, the argument to assert is always a boolean expression @@ -392,7 +396,7 @@ result = runtests('testRightTriangle') During the run, \matlab{} will put out error messages onto the command line and a summary of the test results is then stored within the \varcode{result} variable. These can be displayed using the function -\code{table(result)} +\code[table()]{table(result)}. \begin{lstlisting}[label=testresults, caption={The test results.}, basicstyle=\ttfamily\scriptsize] table(result) @@ -431,7 +435,7 @@ that help to solve the problem. \item No idea what the error message is trying to say? Google it! \item Read the program line by line and understand what each line is doing. -\item Use \code{disp} to print out relevant information on the command +\item Use \code{disp()} to print out relevant information on the command line and compare the output with your expectations. Do this step by step and start at the beginning. \item Use the \matlab{} debugger to stop execution of the code at a diff --git a/header.tex b/header.tex index d626757..d33c7ac 100644 --- a/header.tex +++ b/header.tex @@ -217,7 +217,7 @@ % the english index. \newcommand{\enterm}[2][]{\textit{#2}\ifthenelse{\equal{#1}{}}{\protect\sindex[enterm]{#2}}{\protect\sindex[enterm]{#1}}} -% \endeterm[english index entry]{}{} +% \entermde[english index entry]{}{} % typeset the english term in italics and add it (or the first % optional argument) to the english index. In addition add the german % index entry to the german index without printing it. @@ -270,7 +270,7 @@ \newcommand{\pythonfun}[1]{(\tr{\python-function}{\python-Funktion} \varcode{#1})\protect\sindex[pcode]{#1}} % typeset '(matlab-function #1)' and add the function to the matlab index: -\newcommand{\matlabfun}[1]{(\tr{\matlab-function}{\matlab-Funktion} \varcode{#1})\protect\sindex[mcode]{#1}} +\newcommand{\matlabfun}[1]{(function \varcode{#1})\protect\sindex[mcode]{#1}} %%%%% shortquote and widequote commands: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% diff --git a/likelihood/lecture/likelihood.tex b/likelihood/lecture/likelihood.tex index 1994a8b..dcaafaa 100644 --- a/likelihood/lecture/likelihood.tex +++ b/likelihood/lecture/likelihood.tex @@ -26,15 +26,16 @@ parameters $\theta$. This could be the normal distribution defined by the mean $\mu$ and the standard deviation $\sigma$ as parameters $\theta$. If the $n$ independent observations of $x_1, x_2, \ldots x_n$ originate from the same probability density -distribution (they are \enterm{i.i.d.} independent and identically -distributed) then the conditional probability $p(x_1,x_2, \ldots +distribution (they are \enterm[i.i.d.|see{independent and identically +distributed}]{i.i.d.}, \enterm{independent and identically +distributed}) then the conditional probability $p(x_1,x_2, \ldots x_n|\theta)$ of observing $x_1, x_2, \ldots x_n$ given a specific $\theta$ is given by \begin{equation} p(x_1,x_2, \ldots x_n|\theta) = p(x_1|\theta) \cdot p(x_2|\theta) \ldots p(x_n|\theta) = \prod_{i=1}^n p(x_i|\theta) \; . \end{equation} -Vice versa, the \enterm{likelihood} of the parameters $\theta$ +Vice versa, the \entermde{Likelihood}{likelihood} of the parameters $\theta$ given the observed data $x_1, x_2, \ldots x_n$ is \begin{equation} {\cal L}(\theta|x_1,x_2, \ldots x_n) = p(x_1,x_2, \ldots x_n|\theta) \; . 
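A minimal numerical sketch of evaluating this product for normally distributed observations with known standard deviation (all names and values below are made up for illustration):

\begin{lstlisting}[caption={Sketch: likelihood of candidate parameter values for normally distributed data.}]
sigma = 2.0;                          % standard deviation, assumed known
x = sigma * randn(100, 1) + 3.0;      % 100 observations, true mean is 3
mus = -10:0.01:10;                    % candidate values of the parameter mu
lik = zeros(size(mus));
for i = 1:length(mus)
    % normal probability density of each observation given mu:
    p = exp(-(x - mus(i)).^2 / (2 * sigma^2)) / sqrt(2 * pi * sigma^2);
    lik(i) = prod(p);                 % product over the independent x_i
end
[~, imax] = max(lik);
mu_best = mus(imax)                   % maximum-likelihood estimate, close to mean(x)
\end{lstlisting}

For larger data sets this product quickly under- or overflows numerically, which is one reason for working with the logarithm of the likelihood introduced next.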
@@ -57,7 +58,7 @@ The position of a function's maximum does not change when the values of the function are transformed by a strictly monotonously rising function such as the logarithm. For numerical and reasons that we will discuss below, we commonly search for the maximum of the logarithm of -the likelihood (\enterm{log-likelihood}): +the likelihood (\entermde[likelihood!log-]{Likelihood!Log-}{log-likelihood}): \begin{eqnarray} \theta_{mle} & = & \text{argmax}_{\theta}\; {\cal L}(\theta|x_1,x_2, \ldots x_n) \nonumber \\ @@ -136,9 +137,10 @@ from the data. For non-Gaussian distributions (e.g. a Gamma-distribution), however, such simple analytical expressions for the parameters of the distribution do not exist, e.g. the shape parameter of a -\enterm{Gamma-distribution}. How do we fit such a distribution to -some data? That is, how should we compute the values of the parameters -of the distribution, given the data? +\entermde[distribution!Gamma-]{Verteilung!Gamma-}{Gamma-distribution}. How +do we fit such a distribution to some data? That is, how should we +compute the values of the parameters of the distribution, given the +data? A first guess could be to fit the probability density function by minimization of the squared difference to a histogram of the measured @@ -289,10 +291,10 @@ out of \eqnref{mleslope} and we get To see what this expression is, we need to standardize the data. We make the data mean free and normalize them to their standard deviation, i.e. $x \mapsto (x - \bar x)/\sigma_x$. The resulting -numbers are also called \enterm[z-values]{$z$-values} or $z$-scores and they -have the property $\bar x = 0$ and $\sigma_x = 1$. $z$-scores are -often used in Biology to make quantities that differ in their units -comparable. For standardized data the variance +numbers are also called \entermde[z-values]{z-Wert}{$z$-values} or +$z$-scores and they have the property $\bar x = 0$ and $\sigma_x = +1$. $z$-scores are often used in Biology to make quantities that +differ in their units comparable. For standardized data the variance \[ \sigma_x^2 = \frac{1}{n} \sum_{i=1}^n (x_i - \bar x)^2 = \frac{1}{n} \sum_{i=1}^n x_i^2 = 1 \] is given by the mean squared data and equals one. The covariance between $x$ and $y$ also simplifies to diff --git a/plotting/lecture/plotting.tex b/plotting/lecture/plotting.tex index e3d3280..ae531ba 100644 --- a/plotting/lecture/plotting.tex +++ b/plotting/lecture/plotting.tex @@ -112,16 +112,15 @@ the missing information ourselves. Thus, we need a second variable that contains the respective \varcode{x} values. The length of \varcode{x} and \varcode{y} must be the same otherwise the later call of the \varcode{plot} function will raise an error. The respective -call will expand to \code[plot()]{plot(x, y)}. The x-axis will be now -be scaled from the minimum in \varcode{x} to the maximum of -\varcode{x} and by default it will be plotted as a line plot with a -solid blue line of the linewidth 1pt. A second plot that is added to the -figure will be plotted in red using the same settings. The -order of the used colors depends on the \enterm{colormap} settings -which can be adjusted to personal taste or -need. Table\,\ref{plotlinestyles} shows some predefined values that -can be chosen for the line style, the marker, or the color. For -additional options consult the help. +call will expand to \code[plot()]{plot(x, y)}. 
The x-axis will now be +scaled from the minimum in \varcode{x} to the maximum of \varcode{x} +and by default it will be plotted as a line plot with a solid blue +line of the linewidth 1pt. A second plot that is added to the figure +will be plotted in red using the same settings. The order of the used +colors depends on the \enterm{colormap} settings which can be adjusted +to personal taste or need. Table\,\ref{plotlinestyles} shows some +predefined values that can be chosen for the line style, the marker, +or the color. For additional options consult the help. \begin{table}[htp] \titlecaption{Predefined line styles (left), colors (center) and @@ -184,8 +183,8 @@ chosen. \subsection{Changing the axes properties} The first thing a plot needs are axis labels with correct units. By -calling the functions \code[xlabel]{xlabel('Time [ms]')} and -\code[ylabel]{ylabel('Voltage [mV]')} these can be set. By default the +calling the functions \code[xlabel()]{xlabel('Time [ms]')} and +\code[ylabel()]{ylabel('Voltage [mV]')} these can be set. By default the axes will be scaled to show the full extent of the data. The extremes will be selected as the closest integer for small values or the next full multiple of tens, hundreds, thousands, etc.\ depending on the @@ -196,8 +195,8 @@ functions expect a single argument, that is a 2-element vector containing the minimum and maximum value. Table\,\ref{plotaxisprops} lists some of the commonly adjusted properties of an axis. To set these properties, we need to have the axes object which can either be -stored in a variable when calling \varcode{plot} (\code{axes = - plot(x,y);}) or can be retrieved using the \code[gca]{gca} function +stored in a variable when calling \varcode{plot} (\varcode{axes = + plot(x,y);}) or can be retrieved using the \code{gca()} function (gca stands for ``get current axes''). Changing the properties of the axes object will update the plot (listing\,\ref{niceplotlisting}). @@ -253,8 +252,8 @@ and the placement of the axes on the paper. Table\,\ref{plotfigureprops} lists commonly used properties. For a complete reference check the help. To change the figure's appearance, we need to change the properties of the figure -object which can be retrieved during creation of the figure (\code{fig - = figure();}) or by using the \code{gcf} (``get current figure'') +object which can be retrieved during creation of the figure (\code[figure()]{fig + = figure();}) or by using the \code{gcf()} (``get current figure'') command. The script shown in the listing\,\ref{niceplotlisting} exemplifies @@ -334,10 +333,10 @@ the last one defines the output format (box\,\ref{graphicsformatbox}). properties could be read and set using the functions \code[get()]{get} and \code[set()]{set}. The first argument these functions expect are valid figure or axis \emph{handles} which were - returned by the \code{figure} and \code{plot} functions, or could be - retrieved using \code[gcf()]{gcf} or \code[gca()]{gca} for the + returned by the \code{figure()} and \code{plot()} functions, or could be + retrieved using \code{gcf()} or \code{gca()} for the current figure or axis handle, respectively. Subsequent arguments - passed to \code{set} are pairs of a property's name and the desired + passed to \code{set()} are pairs of a property's name and the desired value. 
\begin{lstlisting}[caption={Using set to change figure and axis properties.}]
frequency = 5; % frequency of the sine wave in Hz
@@ -351,8 +350,8 @@ the last one defines the output format (box\,\ref{graphicsformatbox}).
  set(figure_handle, 'PaperSize', [5.5, 5.5], 'PaperUnit', 'centimeters', ...
                     'PaperPosition', [0, 0, 5.5, 5.5]);
\end{lstlisting}
- With newer versions the handles returned by \varcode{gcf} and
- \varcode{gca} are ``objects'' and setting properties became much
+ With newer versions the handles returned by \code{gcf()} and
+ \code{gca()} are ``objects'' and setting properties became much
 easier as it is used throughout this chapter. For downward
 compatibility with older versions set and get still work in current
 versions of \matlab{}.
@@ -371,7 +370,7 @@ For some types of plots we present examples in the following sections.

\subsection{Scatter}
For displaying events or pairs of x-y coordinates the standard line
-plot is not optimal. Rather, we use \code[scatter()]{scatter} for this
+plot is not optimal. Rather, we use \code{scatter()} for this
purpose. For example, we have a number of measurements of a system's
response to a certain stimulus intensity. There is no dependency
between the data points, drawing them with a line-plot would be
@@ -417,8 +416,8 @@ A very common scenario is to combine several plots in the same
figure. To do this we create so-called subplots
figures\,\ref{regularsubplotsfig},\,\ref{irregularsubplotsfig}. The
\code[subplot()]{subplot()} command allows to place multiple axes onto
-a single sheet of paper. Generally, \varcode{subplot} expects three argument
-defining the number of rows, column, and the currently active
+a single sheet of paper. Generally, \code{subplot()} expects three
+arguments defining the number of rows, columns, and the currently active
plot. The currently active plot number starts with 1 and goes up to
$rows \cdot columns$ (numbers in the subplots in
figures\,\ref{regularsubplotsfig}, \ref{irregularsubplotsfig}).
@@ -439,7 +438,7 @@ figures\,\ref{regularsubplotsfig}, \ref{irregularsubplotsfig}).
By default, all subplots have the same size, if something else is
desired, e.g.\ one subplot should span a whole row, while two others
are smaller and should be placed side by side in the same row, the
-third argument of \varcode{subplot} can be a vector or numbers that
+third argument of \code{subplot()} can be a vector of numbers that
should be joined. These have, of course, to be adjacent numbers
(\figref{irregularsubplotsfig},
listing\,\ref{irregularsubplotslisting}).
@@ -457,7 +456,7 @@ columns, need to be used in a plot. If you want to create something
more elaborate, or have more spacing between the subplots one can
create a grid with larger numbers of columns and rows, and specify the
used cells of the grid by passing a vector as the third argument to
-\varcode{subplot}.
+\code{subplot()}.

\lstinputlisting[caption={Script for creating subplots of different
    sizes \figref{irregularsubplotsfig}.},
@@ -498,12 +497,12 @@ more apt. Accordingly, four arguments are needed (line 12 in listing
\ref{errorbarlisting}). The first two arguments are the same, the next
two represent the positive and negative deflections.

-By default the \code{errorbar} function does not draw a marker. In the
+By default the \code{errorbar()} function does not draw a marker. In the
examples shown here we provide extra arguments to define that a circle
is used for that purpose. The line connecting the average values can
be removed by passing additional arguments.
The properties of the errorbars themselves (linestyle, linewidth, capsize, etc.) can be -changed by taking the return argument of \code{errorbar} and changing +changed by taking the return argument of \code{errorbar()} and changing its properties. See the \matlab{} help for more information. \begin{figure}[ht] @@ -530,18 +529,18 @@ areas instead of errorbars: In case you have a lot of data points with respective errorbars such that they would merge in the figure it is cleaner and probably easier to read and handle if one uses an error area instead. To achieve an illustration as shown in -figure\,\ref{errorbarplot} C, we use the \code{fill} command in +figure\,\ref{errorbarplot} C, we use the \code{fill()} command in combination with a standard line plot. The original purpose of -\code{fill} is to draw a filled polygon. We hence have to provide it +\code{fill()} is to draw a filled polygon. We hence have to provide it with the vertex points of the polygon. For each x-value we now have two y-values (average minus error and average plus error). Further, we want the vertices to be connected in a defined order. One can achieve this by going back and forth on the x-axis; we append a reversed -version of the x-values to the original x-values using \code{cat} and -\code{fliplr} for concatenation and inversion, respectively (line 3 in +version of the x-values to the original x-values using \code{cat()} and +\code{fliplr()} for concatenation and inversion, respectively (line 3 in listing \ref{errorbarlisting2}; Depending on the layout of your data you may need concatenate along a different dimension of the data and -use \code{flipud} instead). The y-coordinates of the polygon vertices +use \code{flipud()} instead). The y-coordinates of the polygon vertices are concatenated in a similar way (line 4). In the example shown here we accept the polygon object that is returned by fill (variable p) and use it to change a few properties of the polygon. The \emph{FaceAlpha} @@ -561,9 +560,9 @@ connecting the average values (line 12). The \code[text()]{text()} or \code[annotation()]{annotation()} are used for highlighting certain parts of a plot or simply adding an annotation that does not fit or does not belong into the legend. -While \varcode{text} simply prints out the given text string at the +While \code{text()} simply prints out the given text string at the defined position (for example line in -listing\,\ref{regularsubplotlisting}) the \varcode{annotation} +listing\,\ref{regularsubplotlisting}) the \code{annotation()} function allows to add some more advanced highlights like arrows, lines, ellipses, or rectangles. Figure\,\ref{annotationsplot} shows some examples, the respective code can be found in @@ -583,9 +582,9 @@ listing\,\ref{annotationsplotlisting}. For more options consult the \begin{important}[Positions in data or figure coordinates.] A very confusing pitfall are the different coordinate systems used - by \varcode{text} and \varcode{annotation}. While \varcode{text} + by \varcode{text()} and \varcode{annotation()}. While \varcode{text()} expects the positions to be in data coordinates, i.e.\,in the limits - of the x- and y-axis, \varcode{annotation} requires the positions to + of the x- and y-axis, \varcode{annotation()} requires the positions to be given in normalized figure coordinates. Normalized means that the width and height of the figure are expressed by numbers in the range 0 to 1. The bottom/left corner then has the coordinates $(0,0)$ and @@ -624,9 +623,9 @@ Lissajous figure. 
The basic steps are: is created and opened for writing. This also implies that is has to be closed after the whole process (line 31). \item For each frame of the video, we plot the appropriate data (we - use \code[scatter]{scatter} for this purpose, line 20) and ``grab'' + use \code{scatter()} for this purpose, line 20) and ``grab'' the frame (line 28). Grabbing is similar to making a screenshot of - the figure. The \code{drawnow}{drawnow} command (line 27) is used to + the figure. The \code{drawnow()} command (line 27) is used to stop the excution of the for loop until the drawing process is finished. \item Write the frame to file (line 29). diff --git a/pointprocesses/lecture/pointprocesses.tex b/pointprocesses/lecture/pointprocesses.tex index a8fae40..e658f8f 100644 --- a/pointprocesses/lecture/pointprocesses.tex +++ b/pointprocesses/lecture/pointprocesses.tex @@ -73,10 +73,10 @@ number of observed events within a certain time window $n_i$ (\figref{pointprocessscetchfig}). \begin{exercise}{rasterplot.m}{} - Implement a function \code{rasterplot()} that displays the times of - action potentials within the first \code{tmax} seconds in a raster + Implement a function \varcode{rasterplot()} that displays the times of + action potentials within the first \varcode{tmax} seconds in a raster plot. The spike times (in seconds) recorded in the individual trials - are stored as vectors of times within a \codeterm{cell-array}. + are stored as vectors of times within a \codeterm{cell array}. \end{exercise} %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% @@ -95,10 +95,10 @@ describing the statistics of stochastic real-valued variables: \end{figure} \begin{exercise}{isis.m}{} - Implement a function \code{isis()} that calculates the interspike + Implement a function \varcode{isis()} that calculates the interspike intervals from several spike trains. The function should return a single vector of intervals. The spike times (in seconds) of each - trial are stored as vectors within a \codeterm{cell-array}. + trial are stored as vectors within a cell-array. \end{exercise} %\subsection{First order interval statistics} @@ -117,7 +117,7 @@ describing the statistics of stochastic real-valued variables: \end{itemize} \begin{exercise}{isihist.m}{} - Implement a function \code{isiHist()} that calculates the normalized + Implement a function \varcode{isiHist()} that calculates the normalized interspike interval histogram. The function should take two input arguments; (i) a vector of interspike intervals and (ii) the width of the bins used for the histogram. It further returns the @@ -126,7 +126,7 @@ describing the statistics of stochastic real-valued variables: \begin{exercise}{plotisihist.m}{} Implement a function that takes the return values of - \code{isiHist()} as input arguments and then plots the data. The + \varcode{isiHist()} as input arguments and then plots the data. The plot should show the histogram with the x-axis scaled to milliseconds and should be annotated with the average ISI, the standard deviation and the coefficient of variation. @@ -167,7 +167,7 @@ $\rho_k$ is usually plotted against the lag $k$ with itself and is always 1. \begin{exercise}{isiserialcorr.m}{} - Implement a function \code{isiserialcorr()} that takes a vector of + Implement a function \varcode{isiserialcorr()} that takes a vector of interspike intervals as input argument and calculates the serial correlation. The function should further plot the serial correlation. 
\pagebreak[4] @@ -213,12 +213,12 @@ time interval , \determ{Feuerrate}) that is given in Hertz % \end{figure} \begin{exercise}{counthist.m}{} - Implement a function \code{counthist()} that calculates and plots + Implement a function \varcode{counthist()} that calculates and plots the distribution of spike counts observed in a certain time window. The function should take two input arguments: (i) a - \codeterm{cell-array} of vectors containing the spike times in - seconds observed in a number of trials, and (ii) the duration of the - time window that is used to evaluate the counts.\pagebreak[4] + cell-array of vectors containing the spike times in seconds observed + in a number of trials, and (ii) the duration of the time window that + is used to evaluate the counts.\pagebreak[4] \end{exercise} @@ -244,7 +244,7 @@ In an \enterm[Poisson process!inhomogeneous]{inhomogeneous Poisson \lambda(t)$. \begin{exercise}{poissonspikes.m}{} - Implement a function \code{poissonspikes()} that uses a homogeneous + Implement a function \varcode{poissonspikes()} that uses a homogeneous Poisson process to generate events at a given rate for a certain duration and a number of trials. The rate should be given in Hertz and the duration of the trials is given in seconds. The function @@ -293,7 +293,7 @@ The homogeneous Poisson process has the following properties: \end{itemize} \begin{exercise}{hompoissonspikes.m}{} - Implement a function \code{hompoissonspikes()} that uses a + Implement a function \varcode{hompoissonspikes()} that uses a homogeneous Poisson process to generate spike events at a given rate for a certain duration and a number of trials. The rate should be given in Hertz and the duration of the trials is given in @@ -422,7 +422,7 @@ potentials (\figref{binpsthfig} top). The resulting histogram is then normalized with the bin width $W$ to yield the firing rate shown in the bottom trace of figure \ref{binpsthfig}. The above sketched process is equivalent to estimating the probability density. It is -possible to estimate the PSTH using the \code{hist()} method +possible to estimate the PSTH using the \code{hist()} function \sindex[term]{Feuerrate!Binningmethode} The estimated firing rate is valid for the total duration of each diff --git a/programming/lecture/programming.tex b/programming/lecture/programming.tex index 3e27b35..1137832 100644 --- a/programming/lecture/programming.tex +++ b/programming/lecture/programming.tex @@ -112,7 +112,7 @@ x y z \begin{important}[Naming conventions] There are a few rules regarding variable names. \matlab{} is - case-sensitive, i.e. \code{x} and \code{X} are two different + case-sensitive, i.e. \varcode{x} and \varcode{X} are two different names. Names must begin with an alphabetic character. German (or other) umlauts, special characters and spaces are forbidden in variable names. @@ -689,9 +689,11 @@ then compare it to the elements on each page, and so on. An alternative way is to make use of the so called \emph{linear indexing} in which each element of the matrix is addressed by a single number. The linear index thus ranges from 1 to -\code{numel(matrix)}. The linear index increases first along the 1st, -2nd, 3rd etc. dimension (figure~\ref{matrixlinearindexingfig}). It is -not as intuitive since one would need to know the shape of the matrix and perform a remapping, but can be really helpful +\code[numel()]{numel(matrix)}. The linear index increases first along +the 1st, 2nd, 3rd etc. dimension +(figure~\ref{matrixlinearindexingfig}). 
It is not as intuitive since +one would need to know the shape of the matrix and perform a +remapping, but can be really helpful (listing~\ref{matrixLinearIndexing}). @@ -882,10 +884,11 @@ table~\ref{logicaloperators}) which are introduced in the following sections. \subsection{Relational operators} -With \codeterm[Operator!relational]{relational operators} (table~\ref{relationaloperators}) -we can ask questions such as: ''Is the value of variable \code{a} -larger than the value of \code{b}?'' or ``Is the value in \code{a} -equal to the one stored in variable \code{b}?''. +With \codeterm[Operator!relational]{relational operators} +(table~\ref{relationaloperators}) we can ask questions such as: ''Is +the value of variable \varcode{a} larger than the value of +\varcode{b}?'' or ``Is the value in \varcode{a} equal to the one +stored in variable \varcode{b}?''. \begin{table}[h!] \titlecaption{\label{relationaloperators} @@ -930,13 +933,13 @@ Testing the relations between numbers and scalar variables is straight forward. When comparing vectors, the relational operator will be applied element-wise and compare the respective elements of the left-hand-side and right-hand-side vectors. Note: vectors must have -the same length and orientation. The result of \code{[2 0 0 5 0] == [1 +the same length and orientation. The result of \varcode{[2 0 0 5 0] == [1 0 3 2 0]'} in which the second vector is transposed to give a column vector is a matrix! \subsection{Logical operators} With the relational operators we could for example test whether a -number is greater than a certain threshold (\code{x > 0.25}). But what +number is greater than a certain threshold (\varcode{x > 0.25}). But what if we wanted to check whether the number falls into the range greater than 0.25 but less than 0.75? Numbers that fall into this range must satisfy the one and the other condition. With @@ -1068,13 +1071,13 @@ values stored in a vector or matrix. It is very powerful and, once understood, very intuitive. The basic concept is that applying a Boolean operation on a vector -results in a \code{logical} vector of the same size (see +results in a \codeterm{logical} vector of the same size (see listing~\ref{logicaldatatype}). This logical vector is then used to select only those values for which the logical vector is true. Line 14 in listing~\ref{logicalindexing1} can be read: ``Select all those elements of \varcode{x} where the Boolean expression \varcode{x < 0} evaluates to true and store the result in the variable -\emph{x\_smaller\_zero}''. +\varcode{x\_smaller\_zero}''. \begin{lstlisting}[caption={Logical indexing.}, label=logicalindexing1] >> x = randn(1, 6) % a vector with 6 random numbers @@ -1154,13 +1157,14 @@ segment of data of a certain time span (the stimulus was on, data and metadata in a single variable. \textbf{Cell arrays} Arrays of variables that contain different - types. Unlike structures, the entries of a \codeterm{Cell array} are - not named. Indexing in \codeterm{Cell arrays} requires a special - operator the \code{\{\}}. \matlab{} uses \codeterm{Cell arrays} for - example when strings of different lengths should be stored in the - same variable: \varcode{months = \{'Januar', 'February', 'March', - 'April', 'May', 'Jun'\};}. Note the curly braces that are used to - create the array and are also used for indexing. + types. Unlike structures, the entries of a \codeterm{cell array} are + not named. Indexing in \codeterm[cell array]{cell arrays} requires a + special operator the \code{\{\}}. 
\matlab{} uses \codeterm[cell + array]{cell arrays} for example when strings of different lengths + should be stored in the same variable: \varcode{months = \{'Januar', + 'February', 'March', 'April', 'May', 'Jun'\};}. Note the curly + braces that are used to create the array and are also used for + indexing. \textbf{Tables} Tabular structure that allows to have columns of varying type combined with a header (much like a spreadsheet). @@ -1170,8 +1174,8 @@ segment of data of a certain time span (the stimulus was on, irregular intervals togehter with the measurement time in a single variable. Without the \codeterm{Timetable} data type at least two variables (one storing the time, the other the measurement) would be - required. \codeterm{Timetables} offer specific convenience functions - to work with timestamps. + required. \codeterm[Timetable]{Timetables} offer specific + convenience functions to work with timestamps. \textbf{Maps} In a \codeterm{map} a \codeterm{value} is associated with an arbitrary \codeterm{key}. The \codeterm{key} is not @@ -1246,7 +1250,7 @@ All imperative programming languages offer a solution: the loop. It is used whenever the same commands have to be repeated. -\subsubsection{The \code{for} --- loop} +\subsubsection{The \varcode{for} --- loop} The most common type of loop is the \codeterm{for-loop}. It consists of a \codeterm[Loop!head]{head} and the \codeterm[Loop!body]{body}. The head defines how often the code in the @@ -1258,7 +1262,7 @@ next value of this vector. In the body of the loop any code can be executed which may or may not use the running variable for a certain purpose. The \code{for} loop is closed with the keyword \code{end}. Listing~\ref{looplisting} shows a simple version of such a -\code{for} loop. +\codeterm{for-loop}. \begin{lstlisting}[caption={Example of a \varcode{for}-loop.}, label=looplisting] >> for x = 1:3 % head @@ -1273,15 +1277,15 @@ purpose. The \code{for} loop is closed with the keyword \begin{exercise}{factorialLoop.m}{factorialLoop.out} - Can we solve the factorial with a for-loop? Implement a for loop that - calculates the factorial of a number \varcode{n}. + Can we solve the factorial with a \varcode{for}-loop? Implement a + for loop that calculates the factorial of a number \varcode{n}. \end{exercise} \subsubsection{The \varcode{while} --- loop} -The \code{while}--loop is the second type of loop that is available in -almost all programming languages. Other, than the \code{for} -- loop, +The \codeterm{while-loop} is the second type of loop that is available in +almost all programming languages. Other, than the \codeterm{for-loop}, that iterates with the running variable over a vector, the while loop uses a Boolean expression to determine when to execute the code in it's body. The head of the loop starts with the keyword \code{while} @@ -1289,22 +1293,22 @@ that is followed by a Boolean expression. If this can be evaluated to true, the code in the body is executed. The loop is closed with an \code{end}. -\begin{lstlisting}[caption={Basic structure of a \code{while} loop.}, label=whileloop] +\begin{lstlisting}[caption={Basic structure of a \varcode{while} loop.}, label=whileloop] while x == true % head with a Boolean expression % execute this code if the expression yields true end \end{lstlisting} \begin{exercise}{factorialWhileLoop.m}{} - Implement the factorial of a number \varcode{n} using a \code{while} - -- loop. + Implement the factorial of a number \varcode{n} using a \varcode{while}-loop. 
\end{exercise} \begin{exercise}{neverendingWhile.m}{} - Implement a \code{while}--loop that is never-ending. Hint: the body - is executed as long as the Boolean expression in the head is - \code{true}. You can escape the loop by pressing \keycode{Ctrl+C}. + Implement a \varcode{while}-loop that is never-ending. Hint: the + body is executed as long as the Boolean expression in the head is + \varcode{true}. You can escape the loop by pressing + \keycode{Ctrl+C}. \end{exercise} @@ -1312,15 +1316,15 @@ end \begin{itemize} \item Both execute the code in the body iterative. -\item When using a \code{for} -- loop the body of the loop is executed +\item When using a \code{for}-loop the body of the loop is executed at least once (except when the vector used in the head is empty). -\item In a \code{while} -- loop, the body is not necessarily +\item In a \code{while}-loop, the body is not necessarily executed. It is entered only if the Boolean expression in the head yields true. -\item The \code{for} -- loop is best suited for cases in which the +\item The \code{for}-loop is best suited for cases in which the elements of a vector have to be used for a computation or when the number of iterations is known. -\item The \code{while} -- loop is best suited for cases when it is not +\item The \code{while}-loop is best suited for cases when it is not known in advance how often a certain piece of code has to be executed. \item Any problem that can be solved with one type can also be solve @@ -1336,8 +1340,8 @@ is only executed under a certain condition. \subsubsection{The \varcode{if} -- statement} The most prominent representative of the conditional expressions is -the \code{if} statement (sometimes also called \code{if - else} -statement). It constitutes a kind of branching point. It allows to +the \codeterm{if statement} (sometimes also called \codeterm{if - else +statement}). It constitutes a kind of branching point. It allows to control which branch of the code is executed. Again, the statement consists of the head and the body. The head @@ -1346,11 +1350,11 @@ that controls whether or not the body is entered. Optionally, the body can be either ended by the \code{end} keyword or followed by additional statements \code{elseif}, which allows to add another Boolean expression and to catch another condition or the \code{else} -the provide a default case. The last body of the \code{if - elseif - +the provide a default case. The last body of the \varcode{if - elseif - else} statement has to be finished with the \code{end} (listing~\ref{ifelselisting}). -\begin{lstlisting}[label=ifelselisting, caption={Structure of an \code{if} statement.}] +\begin{lstlisting}[label=ifelselisting, caption={Structure of an \varcode{if} statement.}] if x < y % head % body I, executed only if x < y elseif x > y @@ -1361,7 +1365,7 @@ end \end{lstlisting} \begin{exercise}{ifelse.m}{} - Draw a random number and check with an appropriate \code{if} + Draw a random number and check with an appropriate \varcode{if} statement whether it is \begin{enumerate} \item less than 0.5. @@ -1373,9 +1377,9 @@ end \subsubsection{The \varcode{switch} -- statement} -The \code{switch} statement is used whenever a set of conditions +The \codeterm{switch statement} is used whenever a set of conditions requires separate treatment. The statement is initialized with the -\code{switch} keyword that is followed by \emph{switch expression} (a +\code{switch} keyword that is followed by a \emph{switch expression} (a number or string). 
It is followed by a set of \emph{case expressions} which start with the keyword \code{case} followed by the condition that defines against which the \emph{switch expression} is tested. It @@ -1412,7 +1416,7 @@ end \end{itemize} -\subsection{The keywords \code{break} and \code{continue}} +\subsection{The keywords \varcode{break} and \varcode{continue}} Whenever the execution of a loop should be ended or if you want to skip the execution of the body under certain circumstances, one can @@ -1458,7 +1462,7 @@ end has passed between the calls of \code{tic} and \code{toc}. \begin{enumerate} - \item Use a \code{for} loop to select matching values. + \item Use a \varcode{for} loop to select matching values. \item Use logical indexing. \end{enumerate} \end{exercise} @@ -1486,12 +1490,12 @@ and executed line-by-line from top to bottom. \matlab{} knows three types of programs: \begin{enumerate} -\item \codeterm[Script]{Scripts} -\item \codeterm[Function]{Functions} -\item \codeterm[Object]{Objects} (not covered here) +\item \entermde[script]{Skripte}{Scripts} +\item \entermde[function]{Funktion}{Functions} +\item \entermde[Object]{Objekte}{Objects} (not covered here) \end{enumerate} -Programs are stored in so called \codeterm{m-files} +Programs are stored in so called \codeterm[m-file]{m-files} (e.g. \file{myProgram.m}). To use them they have to be \emph{called} from the command line or from within another program. Storing your code in programs increases the re-usability. So far we have used @@ -1507,13 +1511,13 @@ and if it now wants to read the previously stored variable, it will contain a different value than expected. Bugs like this are hard to find since each of the programs alone is perfectly fine and works as intended. A solution for this problem are the -\codeterm[Function]{functions}. +\entermde[function]{Funktion}{functions}. \subsection{Functions} Functions in \matlab{} are similar to mathematical functions \[ y = f(x) \] Here, the mathematical function has the name $f$ and it -has one \codeterm{argument} $x$ that is transformed into the +has one \entermde{Argument}{argument} $x$ that is transformed into the function's output value $y$. In \matlab{} the syntax of a function declaration is very similar (listing~\ref{functiondefinitionlisting}). @@ -1524,12 +1528,12 @@ function [y] = functionName(arg_1, arg_2) \end{lstlisting} The keyword \code{function} is followed by the return value(s) (it can -be a list \code{[]} of values), the function name and the +be a list \varcode{[]} of values), the function name and the argument(s). The function head is then followed by the function's body. A function is ended by and \code{end} (this is in fact optional but we will stick to this). Each function that should be directly used by the user (or called from other programs) should reside in an -individual \code{m-file} that has the same name as the function. By +individual \codeterm{m-file} that has the same name as the function. By using functions instead of scripts we gain several advantages: \begin{itemize} \item Encapsulation of program code that solves a certain task. It can @@ -1566,10 +1570,9 @@ function myFirstFunction() % function head end \end{lstlisting} -\code{myFirstFunction} (listing~\ref{badsinewavelisting}) is a +\varcode{myFirstFunction} (listing~\ref{badsinewavelisting}) is a prime-example of a bad function. There are several issues with it's design: - \begin{itemize} \item The function's name does not tell anything about it's purpose. 
\item The function is made for exactly one use-case (frequency of
@@ -1594,7 +1597,7 @@ defined:
  (e.g. the user of another program that calls a function)?
\end{enumerate}

-As indicated above the \code{myFirstFunction} does three things at
+As indicated above, the \varcode{myFirstFunction} does three things at
once, it seems natural that the task should be split up into three
parts. (i) Calculation of the individual sine waves defined by the
frequency and the amplitudes (ii) graphical display of the data and
@@ -1607,17 +1610,18 @@ define (i) how to name the function, (ii) which information it needs
(arguments), and (iii) what it should return to the caller.

\begin{enumerate}
-\item \codeterm[Function!Name]{Name}: the name should be descriptive
+\item \entermde[function!name]{Funktion!-sname}{Name}: the name should be descriptive
  of the function's purpose, i.e. the calculation of a sine wave. An
-  appropriate name might be \code{sinewave()}.
-\item \codeterm[Function!Arguments]{Arguments}: What information does
-  the function need to do the calculation? There are obviously the
-  frequency as well as the amplitude. Further we may want to be able
-  to define the duration of the sine wave and the temporal
-  resolution. We thus need four arguments which should also named to
-  describe their content: \code{amplitude, frequency, t\_max,} and
-  \code{t\_step} might be good names.
-\item \codeterm[Function!Return values]{Return values}: For a correct
+  appropriate name might be \varcode{sinewave()}.
+\item \entermde[function!arguments]{Funktion!-sargument}{Arguments}:
+  What information does the function need to do the calculation? There
+  are obviously the frequency as well as the amplitude. Further we may
+  want to be able to define the duration of the sine wave and the
+  temporal resolution. We thus need four arguments which should also be
+  named to describe their content: \varcode{amplitude},
+  \varcode{frequency}, \varcode{t\_max}, and \varcode{t\_step} might
+  be good names.
+\item \entermde[function!return values]{Funktion!R\"uckgabewerte}{Return values}: For a correct
  display of the data we need two vectors. The time, and the sine wave
  itself. We just need two return values: \varcode{time}, \varcode{sine}
\end{enumerate}
@@ -1657,11 +1661,11 @@ specification of the function:

\begin{enumerate}
\item It should plot a single sine wave. But it is not limited to sine
-  waves. It's name is thus: \code{plotFunction()}.
+  waves. Its name is thus: \varcode{plotFunction()}.
\item What information does it need to solve the task? The
-  to-be-plotted data as there is the values \code{y\_data} and the
-  corresponding \code{x\_data}. As we want to plot series of sine
-  waves we might want to have a \code{name} for each function to be
+  to-be-plotted data, that is, the values \varcode{y\_data} and the
+  corresponding \varcode{x\_data}. As we want to plot series of sine
+  waves we might want to have a \varcode{name} for each function to be
  displayed in the figure legend.
\item Are there any return values? No, this function is just made for
  plotting, we do not need to return anything.
@@ -1699,11 +1703,11 @@ Again, we need to specify what needs to be done:
  appropriate name for the script (that is the name of the m-file)
  might be \file{plotMultipleSinewaves.m}.
\item What information do we need? We need to define the
-  \code{frequency}, the range of \code{amplitudes}, the
-  \code{duration} of the sine waves, and the temporal resolution given
-  as the time between to points in time, i.e. the \code{stepsize}.
+  \varcode{frequency}, the range of \varcode{amplitudes}, the
+  \varcode{duration} of the sine waves, and the temporal resolution given
+  as the time between two points in time, i.e. the \varcode{stepsize}.
\item We then need to create an empty figure, and work through the
-  rang of \code{amplitudes}. We must not forget to switch \code{hold
+  range of \varcode{amplitudes}. We must not forget to switch \varcode{hold
    on} if we want to see all the sine waves in one plot.
\end{enumerate}

diff --git a/regression/lecture/regression.tex b/regression/lecture/regression.tex
index 74c1392..7e020ae 100644
--- a/regression/lecture/regression.tex
+++ b/regression/lecture/regression.tex
@@ -33,7 +33,7 @@ fitting approaches.
We will apply this method to find the combination of slope and
intercept that best describes the system.

-\section{The error function --- mean square error}
+\section{The error function --- mean squared error}

Before the optimization can be done we need to specify what is
considered an optimal fit. In our example we search the parameter
@@ -57,25 +57,23 @@ $\sum_{i=1}^N |y_i - y^{est}_i|$. The total error can only be small if
all deviations are indeed small no matter if they are above or below
the predicted line. Instead of the sum we could also ask for the
\emph{average}
-
\begin{equation}
  \label{meanabserror}
  f_{dist}(\{(x_i, y_i)\}|\{y^{est}_i\}) = \frac{1}{N} \sum_{i=1}^N |y_i - y^{est}_i|
\end{equation}
should be small. Commonly, the \enterm{mean squared distance} or
-\enterm{mean squared error}
+\enterm[square error!mean]{mean square error}
(\determ[quadratischer Fehler!mittlerer]{mittlerer quadratischer
  Fehler})
\begin{equation}
  \label{meansquarederror}
  f_{mse}(\{(x_i, y_i)\}|\{y^{est}_i\}) = \frac{1}{N} \sum_{i=1}^N (y_i - y^{est}_i)^2
\end{equation}
-
is used (\figref{leastsquareerrorfig}). Similar to the absolute
distance, the square of the error ($(y_i - y_i^{est})^2$) is always
positive and error values do not cancel out. The square further punishes
large deviations.

\begin{exercise}{meanSquareError.m}{}\label{mseexercise}%
-  Implement a function \code{meanSquareError()}, that calculates the
+  Implement a function \varcode{meanSquareError()} that calculates the
  \emph{mean square distance} between a vector of observations ($y$)
  and respective predictions ($y^{est}$).
\end{exercise}

@@ -84,18 +82,19 @@ large deviations.
\section{\tr{Objective function}{Zielfunktion}}

$f_{cost}(\{(x_i, y_i)\}|\{y^{est}_i\})$ is a so called
-\enterm{objective function} or \enterm{cost function}. We aim to adapt
-the model parameters to minimize the error (mean square error) and
-thus the \emph{objective function}. In Chapter~\ref{maximumlikelihoodchapter}
-we will show that the minimization of the mean square error is
-equivalent to maximizing the likelihood that the observations
-originate from the model (assuming a normal distribution of the data
-around the model prediction).
+\enterm{objective function} or \enterm{cost function}
+(\determ{Kostenfunktion}). We aim to adapt the model parameters to
+minimize the error (mean square error) and thus the \emph{objective
+  function}. In Chapter~\ref{maximumlikelihoodchapter} we will show
+that the minimization of the mean square error is equivalent to
+maximizing the likelihood that the observations originate from the
+model (assuming a normal distribution of the data around the model
+prediction).
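As a minimal illustration of this objective function (the data and parameter values below are made up), the mean squared error of a candidate line can be computed directly:

\begin{lstlisting}[caption={Sketch: mean squared error of a candidate straight line.}]
x = 0:0.1:10;                   % made-up x-values
y = 2.0 * x + randn(size(x));   % made-up noisy observations
m = 1.8; b = 0.5;               % candidate slope and intercept
y_est = m * x + b;              % prediction of the linear model
mse = mean((y - y_est).^2)      % mean squared error of this parameter pair
\end{lstlisting}

Evaluating such values over a grid of slopes and intercepts yields the error surface discussed below.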
 \begin{figure}[t] \includegraphics[width=1\textwidth]{linear_least_squares} \titlecaption{Estimating the \emph{mean square error}.} {The - deviation (\enterm{error}, orange) between the prediction (red + deviation (error, orange) between the prediction (red line) and the observations (blue dots) is calculated for each data point (left). Then the deviations are squared and the average is calculated (right).} @@ -119,11 +118,13 @@ Replacing $y^{est}$ with the linear equation (the model) in That is, the mean square error is given by the pairs $(x_i, y_i)$ and the parameters $m$ and $b$ of the linear equation. The optimization -process will not try to optimize $m$ and $b$ to lead to the smallest -error, the method of the \enterm{least square error}. +process tries to optimize $m$ and $b$ such that the error is +minimized, the method of the \enterm[square error!least]{least square + error} (\determ[quadratischer Fehler!kleinster]{Methode der + kleinsten Quadrate}). \begin{exercise}{lsqError.m}{} - Implement the objective function \code{lsqError()} that applies the + Implement the objective function \varcode{lsqError()} that applies the linear equation as a model. \begin{itemize} \item The function takes three arguments. The first is a 2-element @@ -131,7 +132,7 @@ error, the method of the \enterm{least square error}. \varcode{b}. The second is a vector of x-values, the third contains the measurements for each value of $x$, the respective $y$-values. \item The function returns the mean square error \eqnref{mseline}. - \item The function should call the function \code{meanSquareError()} + \item The function should call the function \varcode{meanSquareError()} defined in the previous exercise to calculate the error. \end{itemize} \end{exercise} @@ -165,7 +166,7 @@ third dimension is used to indicate the error value \varcode{y}). Implement a script \file{errorSurface.m} that calculates the mean square error between data and a linear model and illustrates the error surface using the \code{surf()} function - (consult the help to find out how to use \code{surf}.). + (consult the help to find out how to use \code{surf()}.). \end{exercise} By looking at the error surface we can directly see the position of @@ -257,7 +258,7 @@ way to the minimum of the objective function. The ball will always follow the steepest slope. Thus we need to figure out the direction of the steepest slope at the position of the ball. -The \enterm{gradient} (Box~\ref{partialderivativebox}) of the +The \entermde{Gradient}{gradient} (Box~\ref{partialderivativebox}) of the objective function is the vector \[ \nabla f_{cost}(m,b) = \left( \frac{\partial f(m,b)}{\partial m}, @@ -296,7 +297,7 @@ choose the opposite direction. \end{figure} \begin{exercise}{lsqGradient.m}{}\label{gradientexercise}% - Implement a function \code{lsqGradient()}, that takes the set of + Implement a function \varcode{lsqGradient()} that takes the set of parameters $(m, b)$ of the linear equation as a two-element vector and the $x$- and $y$-data as input arguments. The function should return the gradient at that position. @@ -316,8 +317,8 @@ choose the opposite direction. Finally, we are able to implement the optimization itself. By now it should be obvious why it is called the gradient descent method. All ingredients are already there. We need: 1. The error function -(\code{meanSquareError}), 2. 
the objective function +(\varcode{lsqError()}), and 3. the gradient (\varcode{lsqGradient()}). The algorithm of the gradient descent is: \begin{enumerate} diff --git a/scientificcomputing-script.tex b/scientificcomputing-script.tex index 35d78fc..65f1a22 100644 --- a/scientificcomputing-script.tex +++ b/scientificcomputing-script.tex @@ -9,6 +9,8 @@ \include{#1/lecture/#1}% } +%\includeonly{regression/lecture/regression} + %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \begin{document} diff --git a/statistics/lecture/statistics.tex b/statistics/lecture/statistics.tex index 642aa49..1af9ef6 100644 --- a/statistics/lecture/statistics.tex +++ b/statistics/lecture/statistics.tex @@ -7,17 +7,16 @@ Descriptive statistics characterizes data sets by means of a few measures. In addition to histograms that estimate the full distribution of the data, the following measures are used for characterizing univariate data: \begin{description} -\item[Location, central tendency] (``Lagema{\ss}e''): - arithmetic mean, median, mode. -\item[Spread, dispersion] (``Streuungsma{\ss}e''): variance, - standard deviation, inter-quartile range,\linebreak coefficient of variation - (``Variationskoeffizient''). -\item[Shape]: skewness (``Schiefe''), kurtosis (``W\"olbung''). +\item[Location, central tendency] (\determ{Lagema{\ss}e}): + \entermde[mean!arithmetic]{Mittel!arithmetisches}{arithmetic mean}, \entermde{Median}{median}, \enterm{mode}. +\item[Spread, dispersion] (\determ{Streuungsma{\ss}e}): \entermde{Varianz}{variance}, + \entermde{Standardabweichung}{standard deviation}, inter-quartile range,\linebreak \enterm{coefficient of variation} (\determ{Variationskoeffizient}). +\item[Shape]: \enterm{skewness} (\determ{Schiefe}), \enterm{kurtosis} (\determ{W\"olbung}). \end{description} For bivariate and multivariate data sets we can also analyse their \begin{description} -\item[Dependence, association] (``Zusammenhangsma{\ss}e''): Pearson's correlation coefficient, - Spearman's rank correlation coefficient. +\item[Dependence, association] (\determ{Zusammenhangsma{\ss}e}): \entermde[correlation!coefficient!Pearson's]{Korrelation!Pearson}{Pearson's correlation coefficient}, + \entermde[correlation!coefficient!Spearman's rank]{{Rangkorrelationskoeffizient!Spearman'scher}}{Spearman's rank correlation coefficient}. \end{description} The following is in no way a complete introduction to descriptive @@ -26,15 +25,16 @@ daily data-analysis problems. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \section{Mean, variance, and standard deviation} -The \enterm{arithmetic mean} is a measure of location. For $n$ data values -$x_i$ the arithmetic mean is computed by +The \entermde[mean!arithmetic]{Mittel!arithmetisches}{arithmetic mean} +is a measure of location. For $n$ data values $x_i$ the arithmetic +mean is computed by \[ \bar x = \langle x \rangle = \frac{1}{N}\sum_{i=1}^n x_i \; . \] This computation (summing up all elements of a vector and dividing by the length of the vector) is provided by the function \mcode{mean()}. The mean has the same unit as the data values. The dispersion of the data values around the mean is quantified by -their \enterm{variance} +their \entermde{Varianz}{variance} \[ \sigma^2_x = \langle (x-\langle x \rangle)^2 \rangle = \frac{1}{N}\sum_{i=1}^n (x_i - \bar x)^2 \; . \] The variance is computed by the function \mcode{var()}. 
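As a quick sanity check of these definitions, here is a small sketch with made-up data that computes the mean and the variance by hand and compares them to the built-in functions. Note that \mcode{var(x, 1)} normalizes by $N$ as in the definition above, whereas plain \mcode{var(x)} normalizes by $N-1$:

\begin{lstlisting}[caption={Mean and variance computed by hand on example data.}]
x = randn(100, 1);                      % some example data
mu = sum(x) / length(x);                % arithmetic mean, same as mean(x)
sigma2 = sum((x - mu).^2) / length(x);  % variance as defined above, same as var(x, 1)
% var(x) without the second argument divides by N-1 instead of N
\end{lstlisting}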
The unit of the variance is the unit of the data values squared. @@ -42,14 +42,15 @@ Therefore, variances cannot be compared to the mean or the data values themselves. In particular, variances cannot be used for plotting error bars along with the mean. -The standard deviation -\[ \sigma_x = \sqrt{\sigma^2_x} \; , \] -as computed by the function \mcode{std()}, however, has the same unit -as the data values and can (and should) be used to display the -dispersion of the data together with their mean. +In contrast to the variance, the +\entermde{Standardabweichung}{standard deviation} +\[ \sigma_x = \sqrt{\sigma^2_x} \; , \] +as computed by the function \mcode{std()} has the same unit as the +data values and can (and should) be used to display the dispersion of +the data together with their mean. The mean of a data set can be displayed by a bar-plot -\matlabfun{bar()}. Additional errorbars \matlabfun{errobar()} can be +\matlabfun{bar()}. Additional errorbars \matlabfun{errorbar()} can be used to illustrate the standard deviation of the data (\figref{displayunivariatedatafig} (2)). @@ -90,18 +91,18 @@ used to illustrate the standard deviation of the data identical with the mode.} \end{figure} -The \enterm{mode} is the most frequent value, i.e. the position of the maximum of the probability distribution. +The \enterm{mode} (\determ{Modus}) is the most frequent value, +i.e. the position of the maximum of the probability distribution. -The \enterm{median} separates a list of data values into two halves -such that one half of the data is not greater and the other half is -not smaller than the median (\figref{medianfig}). +The \entermde{Median}{median} separates a list of data values into two +halves such that one half of the data is not greater and the other +half is not smaller than the median (\figref{medianfig}). The +function \mcode{median()} computes the median. \begin{exercise}{mymedian.m}{} Write a function \varcode{mymedian()} that computes the median of a vector. \end{exercise} -\matlab{} provides the function \code{median()} for computing the median. - \begin{exercise}{checkmymedian.m}{} Write a script that tests whether your median function really returns a median above which are the same number of data than @@ -122,9 +123,9 @@ not smaller than the median (\figref{medianfig}). \end{figure} The distribution of data can be further characterized by the position -of its \enterm[quartile]{quartiles}. Neighboring quartiles are +of its \entermde[quartile]{Quartil}{quartiles}. Neighboring quartiles are separated by 25\,\% of the data (\figref{quartilefig}). -\enterm[percentile]{Percentiles} allow to characterize the +\entermde[percentile]{Perzentil}{Percentiles} allow to characterize the distribution of the data in more detail. The 3$^{\rm rd}$ quartile corresponds to the 75$^{\rm th}$ percentile, because 75\,\% of the data are smaller than the 3$^{\rm rd}$ quartile. @@ -147,11 +148,12 @@ data are smaller than the 3$^{\rm rd}$ quartile. % from a normal distribution.} % \end{figure} -\enterm[box-whisker plots]{Box-whisker plots} are commonly used to -visualize and compare the distribution of unimodal data. A box is -drawn around the median that extends from the 1$^{\rm st}$ to the -3$^{\rm rd}$ quartile. The whiskers mark the minimum and maximum value -of the data set (\figref{displayunivariatedatafig} (3)). +\entermde[box-whisker plots]{Box-Whisker-Plot}{Box-whisker plots}, or +\entermde{Box-Plot}{box plot} are commonly used to visualize and +compare the distribution of unimodal data. 
A box is drawn around the +median that extends from the 1$^{\rm st}$ to the 3$^{\rm rd}$ +quartile. The whiskers mark the minimum and maximum value of the data +set (\figref{displayunivariatedatafig} (3)). \begin{exercise}{univariatedata.m}{} Generate 40 normally distributed random numbers with a mean of 2 and @@ -170,13 +172,14 @@ of the data set (\figref{displayunivariatedatafig} (3)). % \end{exercise} \section{Distributions} -The distribution of values in a data set is estimated by histograms -(\figref{displayunivariatedatafig} (4)). +The \enterm{distribution} (\determ{Verteilung}) of values in a data +set is estimated by histograms (\figref{displayunivariatedatafig} +(4)). \subsection{Histograms} -\enterm[histogram]{Histograms} count the frequency $n_i$ of -$N=\sum_{i=1}^M n_i$ measurements in each of $M$ bins $i$ +\entermde[histogram]{Histogramm}{Histograms} count the frequency $n_i$ +of $N=\sum_{i=1}^M n_i$ measurements in each of $M$ bins $i$ (\figref{diehistogramsfig} left). The bins tile the data range usually into intervals of the same size. The width of the bins is called the bin width. The frequencies $n_i$ plotted against the @@ -194,13 +197,14 @@ categories $i$ is the \enterm{histogram}, or the \enterm{frequency \end{figure} Histograms are often used to estimate the -\enterm[probability!distribution]{probability distribution} of the -data values. +\enterm[probability!distribution]{probability distribution} +(\determ[Wahrscheinlichkeits!-verteilung]{Wahrscheinlichkeitsverteilung}) of the data values. \subsection{Probabilities} -In the frequentist interpretation of probability, the probability of -an event (e.g. getting a six when rolling a die) is the relative -occurrence of this event in the limit of a large number of trials. +In the frequentist interpretation of probability, the +\enterm{probability} (\determ{Wahrscheinlichkeit}) of an event +(e.g. getting a six when rolling a die) is the relative occurrence of +this event in the limit of a large number of trials. For a finite number of trials $N$ where the event $i$ occurred $n_i$ times, the probability $P_i$ of this event is estimated by @@ -212,15 +216,16 @@ the sum of the probabilities of all possible events is one: i.e. the probability of getting any event is one. -\subsection{Probability distributions of categorial data} +\subsection{Probability distributions of categorical data} -For categorial data values (e.g. the faces of a die (as integer -numbers or as colors)) a bin can be defined for each category $i$. -The histogram is normalized by the total number of measurements to -make it independent of the size of the data set -(\figref{diehistogramsfig}). After this normalization the height of -each histogram bar is an estimate of the probability $P_i$ of the -category $i$, i.e. of getting a data value in the $i$-th bin. +For \entermde[data!categorical]{Daten!kategorische}{categorical} data +values (e.g. the faces of a die (as integer numbers or as colors)) a +bin can be defined for each category $i$. The histogram is normalized +by the total number of measurements to make it independent of the size +of the data set (\figref{diehistogramsfig}). After this normalization +the height of each histogram bar is an estimate of the probability +$P_i$ of the category $i$, i.e. of getting a data value in the $i$-th +bin. \begin{exercise}{rollthedie.m}{} Write a function that simulates rolling a die $n$ times. @@ -236,12 +241,14 @@ category $i$, i.e. of getting a data value in the $i$-th bin. 
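To illustrate this normalization, here is a short sketch along the lines of the die-rolling exercise above (the variable names and the use of \code{hist()} are just one way to do it):

\begin{lstlisting}[caption={Estimating the probabilities of the six die faces from a normalized histogram.}]
n = 10000;
rolls = randi(6, n, 1);      % simulate n rolls of a fair die
counts = hist(rolls, 1:6);   % frequency n_i of each face i
P = counts / n;              % normalize: estimates the probabilities P_i
bar(1:6, P);                 % each bar should be close to 1/6
\end{lstlisting}

Increasing \varcode{n} makes the estimated probabilities converge towards the true value of $1/6$.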
 \subsection{Probability density functions} -In cases where we deal with data sets of measurements of a real -quantity (e.g. lengths of snakes, weights of elephants, times -between succeeding spikes) there is no natural bin width for computing -a histogram. In addition, the probability of measuring a data value that -equals exactly a specific real number like, e.g., 0.123456789 is zero, because -there are uncountable many real numbers. +In cases where we deal with +\entermde[data!continuous]{Daten!kontinuierliche}{continuous data} +(measurements of real-valued quantities, e.g. lengths of snakes, +weights of elephants, times between succeeding spikes) there is no +natural bin width for computing a histogram. In addition, the +probability of measuring a data value that equals exactly a specific +real number like, e.g., 0.123456789 is zero, because there are +uncountably many real numbers. We can only ask for the probability of getting a measurement value in some range. For example, we can ask for the probability $P(1.2