fixed many index entries
parent
f24c14e6f5
commit
bf52536b7b
5
Makefile
@ -22,9 +22,10 @@ $(BASENAME).pdf : $(BASENAME).tex header.tex $(SUBTEXS)
|
||||
splitindex $(BASENAME).idx
|
||||
|
||||
index :
|
||||
pdflatex -interaction=scrollmode $(BASENAME).tex
|
||||
pdflatex $(BASENAME).tex
|
||||
splitindex $(BASENAME).idx
|
||||
pdflatex -interaction=scrollmode $(BASENAME).tex | tee /dev/stderr | fgrep -q "Rerun to get cross-references right" && pdflatex $(BASENAME).tex || true
|
||||
pdflatex $(BASENAME).tex
|
||||
pdflatex $(BASENAME).tex
|
||||
|
||||
again :
|
||||
pdflatex $(BASENAME).tex
|
||||
|
@ -171,7 +171,7 @@ A good example for the application of a
|
||||
assessment of \entermde[correlation]{Korrelation}{correlations}. Given
|
||||
are measured pairs of data points $(x_i, y_i)$. By calculating the
|
||||
\entermde[correlation!correlation
|
||||
coefficient]{Korrelation!Korrelationskoeffizient}{correlation
|
||||
coefficient]{Korrelation!-skoeffizient}{correlation
|
||||
coefficient} we can quantify how strongly $y$ depends on $x$. The
|
||||
correlation coefficient alone, however, does not tell whether the
|
||||
correlation is significantly different from a random correlation. The
|
||||
|
@ -1,4 +1,4 @@
|
||||
\chapter{\tr{Code style}{Programmierstil}}
|
||||
\chapter{Code style}
|
||||
|
||||
\shortquote{Any code of your own that you haven't looked at for six or
|
||||
more months might as well have been written by someone
|
||||
@ -33,7 +33,7 @@ by calling functions that work on the data and managing the
|
||||
results. Applying this structure makes it easy to understand the flow
|
||||
of the program but two questions remain: (i) How to organize the files
|
||||
on the file system and (ii) how to name them so that the controlling
|
||||
script is easily identified among the other \codeterm{m-files}.
|
||||
script is easily identified among the other \codeterm[m-file]{m-files}.
|
||||
|
||||
Upon installation \matlab{} creates a folder called \file{MATLAB} in
|
||||
the user space (Windows: My files, Linux: Documents, MacOS:
|
||||
@ -43,7 +43,7 @@ moment. Of course, any other location can specified as well. Generally
|
||||
it is of great advantage to store related scripts and functions within
|
||||
the same folder on the hard drive. An easy approach is to create a
|
||||
project-specific folder structure that contains sub-folders for each
|
||||
task (analysis) and to store all related \codeterm{m-files}
|
||||
task (analysis) and to store all related \codeterm[m-file]{m-files}
|
||||
(screenshot \ref{fileorganizationfig}). In these task-related folders
|
||||
one may consider creating a further sub-folder to store results
|
||||
(created figures, result data). On the project level a single script
|
||||
@ -307,8 +307,8 @@ and to briefly explain what they do. Whenever one feels tempted to do
|
||||
this, one could also consider to delegate the respective task to a
|
||||
function. In most cases this is preferable.
|
||||
|
||||
Not delegating the tasks leads to very long \codeterm{m-files} which
|
||||
can be confusing. Sometimes such a code is called ``spaghetti
|
||||
Not delegating the tasks leads to very long \codeterm[m-file]{m-files}
|
||||
which can be confusing. Sometimes such a code is called ``spaghetti
|
||||
code''. It is high time to think about delegation of tasks to
|
||||
functions.
|
||||
|
||||
@ -323,17 +323,17 @@ functions.
|
||||
\end{important}
|
||||
|
||||
\subsection{Local and nested functions}
|
||||
Generally, functions live in their own \codeterm{m-files} that have
|
||||
the same name as the function itself. Delegating tasks to functions
|
||||
thus leads to a large set of \codeterm{m-files} which increases
|
||||
complexity and may lead to confusion. If the delegated functionality
|
||||
is used in multiple instances, it is still advisable to do so. On the
|
||||
other hand, when the delegated functionality is only used within the
|
||||
context of another function \matlab{} allows to define
|
||||
\codeterm[function!local]{local functions} and
|
||||
\codeterm[function!nested]{nested functions} within the same
|
||||
file. Listing \ref{localfunctions} shows an example of a local
|
||||
function definition.
|
||||
Generally, functions live in their own \codeterm[m-file]{m-files} that
|
||||
have the same name as the function itself. Delegating tasks to
|
||||
functions thus leads to a large set of \codeterm[m-file]{m-files}
|
||||
which increases complexity and may lead to confusion. If the delegated
|
||||
functionality is used in multiple instances, it is still advisable to
|
||||
do so. On the other hand, when the delegated functionality is only
|
||||
used within the context of another function \matlab{} allows to define
|
||||
\entermde[function!local]{Funktion!lokale}{local functions} and
|
||||
\entermde[function!nested]{Funktion!verschachtelte}{nested functions}
|
||||
within the same file. Listing \ref{localfunctions} shows an example of
|
||||
a local function definition.
|
||||
|
||||
\pagebreak[3] \lstinputlisting[label=localfunctions, caption={Example
|
||||
for local functions.}]{calculateSines.m}
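
The following is only a minimal sketch of the local-function pattern
and not the content of \file{calculateSines.m}. The function names
\varcode{sineSketch()} and \varcode{singleSine()} are made up for
illustration. Both functions are assumed to live in a single file
\file{sineSketch.m}; only the first function is visible from outside
the file.

\begin{lstlisting}[caption={Hypothetical sketch of a main function with one local function.}]
function sines = sineSketch(frequencies, time)
% Main function: the only one that can be called from outside this file.
% Returns one sine wave per row, one row per frequency.
    sines = zeros(length(frequencies), length(time));
    for i = 1:length(frequencies)
        sines(i, :) = singleSine(frequencies(i), time);
    end
end

function y = singleSine(frequency, time)
% Local function: only visible to other functions within this file.
    y = sin(2 * pi * frequency * time);
end
\end{lstlisting}

Calling \varcode{sineSketch([1, 2], 0:0.001:1)} from the command line
works, whereas a direct call of \varcode{singleSine()} from outside
the file fails because the local function is not visible there.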
|
||||
@ -408,11 +408,12 @@ advisable to adhere to these.
|
||||
Repeated tasks should (to be read as must) be delegated to
|
||||
functions. In cases in which a function is only locally applied and
|
||||
not of more global interest across projects consider to define it as
|
||||
\codeterm[function!local]{local function} or
|
||||
\codeterm[function!nested]{nested function}. Taking care to increase
|
||||
readability and comprehensibility pays off, even to the author!
|
||||
\footnote{Reading tip: Robert C. Martin: \textit{Clean Code: A Handbook of
|
||||
Agile Software Craftsmanship}, Prentice Hall}
|
||||
\entermde[function!local]{Funktion!lokale}{local function} or
|
||||
\entermde[function!nested]{Funktion!verschachtelte}{nested
|
||||
function}. Taking care to increase readability and comprehensibility
|
||||
pays off, even to the author! \footnote{Reading tip: Robert
|
||||
C. Martin: \textit{Clean Code: A Handbook of Agile Software
|
||||
Craftsmanship}, Prentice Hall}
|
||||
|
||||
\shortquote{Programs must be written for people to read, and only
|
||||
incidentally for machines to execute.}{Abelson / Sussman}
|
||||
|
@ -49,13 +49,14 @@ obscure logical errors! Take care when using the \codeterm{try-catch
|
||||
\end{important}
|
||||
|
||||
|
||||
\subsection{\codeterm{Syntax errors}}\label{syntax_error}
|
||||
The most common and easiest to fix type of error. A syntax error
|
||||
violates the rules (spelling and grammar) of the programming
|
||||
language. For example every opening parenthesis must be matched by a
|
||||
closing one or every \code{for} loop has to be closed by an
|
||||
\code{end}. Usually, the respective error messages are clear and
|
||||
the editor will point out and highlight most \codeterm{syntax error}s.
|
||||
\subsection{Syntax errors}\label{syntax_error}
|
||||
The most common and easiest to fix type of error. A
|
||||
\entermde[error!syntax]{Fehler!Syntax\~}{syntax error} violates the
|
||||
rules (spelling and grammar) of the programming language. For example
|
||||
every opening parenthesis must be matched by a closing one or every
|
||||
\code{for} loop has to be closed by an \code{end}. Usually, the
|
||||
respective error messages are clear and the editor will point out and
|
||||
highlight most syntax errors.
|
||||
|
||||
\begin{lstlisting}[label=syntaxerror, caption={Unbalanced parenthesis error.}]
|
||||
>> mean(random_numbers
|
||||
@ -66,8 +67,9 @@ Did you mean:
|
||||
>> mean(random_numbers)
|
||||
\end{lstlisting}
|
||||
|
||||
\subsection{\codeterm{Indexing error}}\label{index_error}
|
||||
Second on the list of common errors are the indexing errors. Usually
|
||||
\subsection{Indexing error}\label{index_error}
|
||||
Second on the list of common errors are the
|
||||
\entermde[error!indexing]{Fehler!Index\~}{indexing errors}. Usually
|
||||
\matlab{} gives rather precise information about the cause, once you
|
||||
know what they mean. Consider the following code.
|
||||
|
||||
@ -111,14 +113,16 @@ to a number and uses this number to address the element in
|
||||
\varcode{my\_array}. The \codeterm{char} has the ASCII code 65 and
|
||||
thus the 65th element of \varcode{my\_array} is returned.
|
||||
|
||||
\subsection{\codeterm{Assignment error}}
|
||||
Related to the Indexing error, an assignment error occurs when we want
|
||||
to write data into a variable, that does not fit into it. Listing
|
||||
\ref{assignmenterror} shows the simple case for 1-d data but, of
|
||||
course, it extends to n-dimensional data. The data that is to be
|
||||
filled into a matrix has to fit in all dimensions. The command in line
|
||||
7 works due to the fact, that matlab automatically extends the matrix,
|
||||
if you assign values to a range outside its bounds.
|
||||
\subsection{Assignment error}
|
||||
Related to the indexing error, an
|
||||
\entermde[error!assignment]{Fehler!Zuweisungs\~}{assignment error}
|
||||
occurs when we want to write data into a variable, that does not fit
|
||||
into it. Listing \ref{assignmenterror} shows the simple case for 1-d
|
||||
data but, of course, it extends to n-dimensional data. The data that
|
||||
is to be filled into a matrix has to fit in all dimensions. The
|
||||
command in line 7 works due to the fact, that matlab automatically
|
||||
extends the matrix, if you assign values to a range outside its
|
||||
bounds.
|
||||
|
||||
\begin{lstlisting}[label=assignmenterror, caption={Assignment errors.}]
|
||||
>> a = zeros(1, 100);
|
||||
@ -133,21 +137,20 @@ ans =
|
||||
110 1
|
||||
\end{lstlisting}
|
||||
|
||||
\subsection{\codeterm{Dimension mismatch error}}
|
||||
\subsection{Dimension mismatch error}
|
||||
Similarly, some arithmetic operations are only valid if the variables
|
||||
fulfill some size constraints. Consider the following commands
|
||||
(listing\,\ref{dimensionmismatch}). The first one (line 3) fails
|
||||
because we are trying to do al elementwise add on two vectors that
|
||||
have different lengths, respectively sizes. The matrix multiplication
|
||||
in line 6 also fails since for this operation to succeed the inner
|
||||
matrix dimensions must agree (for more information on the
|
||||
matrix multiplication see box\,\ref{matrixmultiplication} in
|
||||
chapter\,\ref{programming}). The elementwise multiplication issued in
|
||||
line 10 fails for the same reason as the addition we tried
|
||||
earlier. Sometimes, however, things apparently work but the result may
|
||||
be surprising. The last operation in listing\,\ref{dimensionmismatch}
|
||||
does not throw an error but the result is something else than the
|
||||
expected elementwise multiplication.
|
||||
because we are trying to add two vectors of different lengths
|
||||
elementwise. The matrix multiplication in line 6 also fails since for
|
||||
this operation to succeed the inner matrix dimensions must agree (for
|
||||
more information on the matrix multiplication see
|
||||
box\,\ref{matrixmultiplication} in chapter\,\ref{programming}). The
|
||||
elementwise multiplication issued in line 10 fails for the same reason
|
||||
as the addition we tried earlier. Sometimes, however, things
|
||||
apparently work but the result may be surprising. The last operation
|
||||
in listing\,\ref{dimensionmismatch} does not throw an error but the
|
||||
result is something else than the expected elementwise multiplication.
|
||||
|
||||
% XXX Some arithmetic operations make size constraints, violating them leads to dimension mismatch errors.
|
||||
\begin{lstlisting}[label=dimensionmismatch, caption={Dimension mismatch errors.}]
|
||||
@ -174,7 +177,8 @@ expected elementwise multiplication.
|
||||
\section{Logical error}
|
||||
Sometimes a program runs smoothly and terminates without any
|
||||
complaint. This, however, does not necessarily mean that the program
|
||||
is correct. We may have made a \codeterm{logical error}. Logical
|
||||
is correct. We may have made a
|
||||
\entermde[error!logical]{Fehler!logischer}{logical error}. Logical
|
||||
errors are hard to find, \matlab{} has no chance to detect such errors
|
||||
since they do not violate the syntax or cause the throwing of an
|
||||
error. Thus, we are on our own to find and fix the bug. There are a
|
||||
@ -283,7 +287,7 @@ validity.
|
||||
\matlab{} offers a unit testing framework in which small scripts are
|
||||
written that test the features of the program. We will follow the
|
||||
example given in the \matlab{} help and assume that there is a
|
||||
function \code{rightTriangle} (listing\,\ref{trianglelisting}).
|
||||
function \varcode{rightTriangle()} (listing\,\ref{trianglelisting}).
|
||||
|
||||
% XXX Slightly more readable version of the example given in the \matlab{} help system. Note: The variable name for the angles have been capitalized in order to not override the matlab defined functions \code{alpha, beta,} and \code{gamma}.
|
||||
\begin{lstlisting}[label=trianglelisting, caption={Example function for unit testing.}]
|
||||
@ -308,7 +312,7 @@ folder that follows the following rules.
|
||||
\item The name of the script file must start or end with the word
|
||||
'test', which is case-insensitive.
|
||||
\item Each unit test should be placed in a separate section/cell of the script.
|
||||
\item After the \code{\%\%} that defines the cell, a name for the
|
||||
\item After the \mcode{\%\%} that defines the cell, a name for the
|
||||
particular unit test may be given.
|
||||
\end{enumerate}
|
||||
|
||||
@ -328,11 +332,11 @@ Further there are a few things that are different in tests compared to normal sc
|
||||
tests.
|
||||
\end{enumerate}
|
||||
|
||||
The test script for the \code{rightTrianlge} function
|
||||
The test script for the \varcode{rightTriangle()} function
|
||||
(listing\,\ref{trianglelisting}) may look like in
|
||||
listing\,\ref{testscript}.
|
||||
|
||||
\begin{lstlisting}[label=testscript, caption={Unit test for the \code{rightTriangle} function stored in an m-file testRightTriangle.m}]
|
||||
\begin{lstlisting}[label=testscript, caption={Unit test for the \varcode{rightTriangle()} function stored in an m-file testRightTriangle.m}]
|
||||
tolerance = 1e-10;
|
||||
|
||||
% preconditions
|
||||
@ -372,7 +376,7 @@ assert(abs(approx - smallAngle) <= tolerance, 'Problem with small angle approxim
|
||||
|
||||
In a test script we can execute any code. The actual test whether or
|
||||
not the results match our predictions is done using the
|
||||
\code{assert()}{assert} function. This function basically expects a
|
||||
\code{assert()} function. This function basically expects a
|
||||
boolean value and if this is not true, it raises an error that, in the
|
||||
context of the test does not lead to a termination of the program. In
|
||||
the tests above, the argument to assert is always a boolean expression
|
||||
@ -392,7 +396,7 @@ result = runtests('testRightTriangle')
|
||||
During the run, \matlab{} will put out error messages onto the command
|
||||
line and a summary of the test results is then stored within the
|
||||
\varcode{result} variable. These can be displayed using the function
|
||||
\code{table(result)}
|
||||
\code[table()]{table(result)}.
|
||||
|
||||
\begin{lstlisting}[label=testresults, caption={The test results.}, basicstyle=\ttfamily\scriptsize]
|
||||
table(result)
|
||||
@ -431,7 +435,7 @@ that help to solve the problem.
|
||||
\item No idea what the error message is trying to say? Google it!
|
||||
\item Read the program line by line and understand what each line is
|
||||
doing.
|
||||
\item Use \code{disp} to print out relevant information on the command
|
||||
\item Use \code{disp()} to print out relevant information on the command
|
||||
line and compare the output with your expectations. Do this step by
|
||||
step and start at the beginning.
|
||||
\item Use the \matlab{} debugger to stop execution of the code at a
|
||||
|
@ -217,7 +217,7 @@
|
||||
% the english index.
|
||||
\newcommand{\enterm}[2][]{\textit{#2}\ifthenelse{\equal{#1}{}}{\protect\sindex[enterm]{#2}}{\protect\sindex[enterm]{#1}}}
|
||||
|
||||
% \endeterm[english index entry]{<german index entry>}{<english term>}
|
||||
% \entermde[english index entry]{<german index entry>}{<english term>}
|
||||
% typeset the english term in italics and add it (or the first
|
||||
% optional argument) to the english index. In addition add the german
|
||||
% index entry to the german index without printing it.
|
||||
@ -270,7 +270,7 @@
|
||||
\newcommand{\pythonfun}[1]{(\tr{\python-function}{\python-Funktion} \varcode{#1})\protect\sindex[pcode]{#1}}
|
||||
|
||||
% typeset '(matlab-function #1)' and add the function to the matlab index:
|
||||
\newcommand{\matlabfun}[1]{(\tr{\matlab-function}{\matlab-Funktion} \varcode{#1})\protect\sindex[mcode]{#1}}
|
||||
\newcommand{\matlabfun}[1]{(function \varcode{#1})\protect\sindex[mcode]{#1}}
|
||||
|
||||
|
||||
%%%%% shortquote and widequote commands: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
|
@ -26,15 +26,16 @@ parameters $\theta$. This could be the normal distribution
|
||||
defined by the mean $\mu$ and the standard deviation $\sigma$ as
|
||||
parameters $\theta$. If the $n$ independent observations of $x_1,
|
||||
x_2, \ldots x_n$ originate from the same probability density
|
||||
distribution (they are \enterm{i.i.d.} independent and identically
|
||||
distributed) then the conditional probability $p(x_1,x_2, \ldots
|
||||
distribution (they are \enterm[i.i.d.|see{independent and identically
|
||||
distributed}]{i.i.d.}, \enterm{independent and identically
|
||||
distributed}) then the conditional probability $p(x_1,x_2, \ldots
|
||||
x_n|\theta)$ of observing $x_1, x_2, \ldots x_n$ given a specific
|
||||
$\theta$ is given by
|
||||
\begin{equation}
|
||||
p(x_1,x_2, \ldots x_n|\theta) = p(x_1|\theta) \cdot p(x_2|\theta)
|
||||
\ldots p(x_n|\theta) = \prod_{i=1}^n p(x_i|\theta) \; .
|
||||
\end{equation}
|
||||
Vice versa, the \enterm{likelihood} of the parameters $\theta$
|
||||
Vice versa, the \entermde{Likelihood}{likelihood} of the parameters $\theta$
|
||||
given the observed data $x_1, x_2, \ldots x_n$ is
|
||||
\begin{equation}
|
||||
{\cal L}(\theta|x_1,x_2, \ldots x_n) = p(x_1,x_2, \ldots x_n|\theta) \; .
|
||||
@ -57,7 +58,7 @@ The position of a function's maximum does not change when the values
|
||||
of the function are transformed by a strictly monotonically increasing
|
||||
function such as the logarithm. For numerical reasons, and for reasons that we will
|
||||
discuss below, we commonly search for the maximum of the logarithm of
|
||||
the likelihood (\enterm{log-likelihood}):
|
||||
the likelihood (\entermde[likelihood!log-]{Likelihood!Log-}{log-likelihood}):
|
||||
|
||||
\begin{eqnarray}
|
||||
\theta_{mle} & = & \text{argmax}_{\theta}\; {\cal L}(\theta|x_1,x_2, \ldots x_n) \nonumber \\
|
||||
@ -136,9 +137,10 @@ from the data.
|
||||
For non-Gaussian distributions (e.g. a Gamma-distribution), however,
|
||||
such simple analytical expressions for the parameters of the
|
||||
distribution do not exist, e.g. the shape parameter of a
|
||||
\enterm{Gamma-distribution}. How do we fit such a distribution to
|
||||
some data? That is, how should we compute the values of the parameters
|
||||
of the distribution, given the data?
|
||||
\entermde[distribution!Gamma-]{Verteilung!Gamma-}{Gamma-distribution}. How
|
||||
do we fit such a distribution to some data? That is, how should we
|
||||
compute the values of the parameters of the distribution, given the
|
||||
data?
|
||||
|
||||
A first guess could be to fit the probability density function by
|
||||
minimization of the squared difference to a histogram of the measured
|
||||
@ -289,10 +291,10 @@ out of \eqnref{mleslope} and we get
|
||||
To see what this expression is, we need to standardize the data. We
|
||||
make the data mean free and normalize them to their standard
|
||||
deviation, i.e. $x \mapsto (x - \bar x)/\sigma_x$. The resulting
|
||||
numbers are also called \enterm[z-values]{$z$-values} or $z$-scores and they
|
||||
have the property $\bar x = 0$ and $\sigma_x = 1$. $z$-scores are
|
||||
often used in Biology to make quantities that differ in their units
|
||||
comparable. For standardized data the variance
|
||||
numbers are also called \entermde[z-values]{z-Wert}{$z$-values} or
|
||||
$z$-scores and they have the property $\bar x = 0$ and $\sigma_x =
|
||||
1$. $z$-scores are often used in Biology to make quantities that
|
||||
differ in their units comparable. For standardized data the variance
|
||||
\[ \sigma_x^2 = \frac{1}{n} \sum_{i=1}^n (x_i - \bar x)^2 = \frac{1}{n} \sum_{i=1}^n x_i^2 = 1 \]
|
||||
is given by the mean squared data and equals one.
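
As a quick numerical check of this statement, the following sketch
standardizes a small, made-up data vector \varcode{x} (the numbers are
purely illustrative). Note the assumption that the standard deviation
is normalized by $n$, as in the variance formula above; this is what
the second argument of \varcode{std(x, 1)} selects.

\begin{lstlisting}[caption={Sketch: standardizing made-up data to z-values.}]
x = [2.0, 4.0, 6.0, 8.0];        % made-up example data
z = (x - mean(x)) / std(x, 1);   % std(x, 1) normalizes by n, not n-1
meanz = mean(z);                 % zero up to rounding errors
varz = mean(z.^2);               % mean squared data, one up to rounding errors
\end{lstlisting}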
|
||||
The covariance between $x$ and $y$ also simplifies to
|
||||
|
@ -112,16 +112,15 @@ the missing information ourselves. Thus, we need a second variable
|
||||
that contains the respective \varcode{x} values. The length of
|
||||
\varcode{x} and \varcode{y} must be the same otherwise the later call
|
||||
of the \varcode{plot} function will raise an error. The respective
|
||||
call will expand to \code[plot()]{plot(x, y)}. The x-axis will be now
|
||||
be scaled from the minimum in \varcode{x} to the maximum of
|
||||
\varcode{x} and by default it will be plotted as a line plot with a
|
||||
solid blue line of the linewidth 1pt. A second plot that is added to the
|
||||
figure will be plotted in red using the same settings. The
|
||||
order of the used colors depends on the \enterm{colormap} settings
|
||||
which can be adjusted to personal taste or
|
||||
need. Table\,\ref{plotlinestyles} shows some predefined values that
|
||||
can be chosen for the line style, the marker, or the color. For
|
||||
additional options consult the help.
|
||||
call will expand to \code[plot()]{plot(x, y)}. The x-axis will now be
|
||||
scaled from the minimum in \varcode{x} to the maximum of \varcode{x}
|
||||
and by default it will be plotted as a line plot with a solid blue
|
||||
line of the linewidth 1pt. A second plot that is added to the figure
|
||||
will be plotted in red using the same settings. The order of the used
|
||||
colors depends on the \enterm{colormap} settings which can be adjusted
|
||||
to personal taste or need. Table\,\ref{plotlinestyles} shows some
|
||||
predefined values that can be chosen for the line style, the marker,
|
||||
or the color. For additional options consult the help.
|
||||
|
||||
\begin{table}[htp]
|
||||
\titlecaption{Predefined line styles (left), colors (center) and
|
||||
@ -184,8 +183,8 @@ chosen.
|
||||
\subsection{Changing the axes properties}
|
||||
|
||||
The first thing a plot needs are axis labels with correct units. By
|
||||
calling the functions \code[xlabel]{xlabel('Time [ms]')} and
|
||||
\code[ylabel]{ylabel('Voltage [mV]')} these can be set. By default the
|
||||
calling the functions \code[xlabel()]{xlabel('Time [ms]')} and
|
||||
\code[ylabel()]{ylabel('Voltage [mV]')} these can be set. By default the
|
||||
axes will be scaled to show the full extent of the data. The extremes
|
||||
will be selected as the closest integer for small values or the next
|
||||
full multiple of tens, hundreds, thousands, etc.\ depending on the
|
||||
@ -196,8 +195,8 @@ functions expect a single argument, that is a 2-element vector
|
||||
containing the minimum and maximum value. Table\,\ref{plotaxisprops}
|
||||
lists some of the commonly adjusted properties of an axis. To set
|
||||
these properties, we need to have the axes object which can either be
|
||||
stored in a variable when calling \varcode{plot} (\code{axes =
|
||||
plot(x,y);}) or can be retrieved using the \code[gca]{gca} function
|
||||
stored in a variable when calling \varcode{plot} (\varcode{axes =
|
||||
plot(x,y);}) or can be retrieved using the \code{gca()} function
|
||||
(gca stands for ``get current axes''). Changing the properties of the axes
|
||||
object will update the plot (listing\,\ref{niceplotlisting}).
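
A minimal sketch of this workflow, independent of
listing\,\ref{niceplotlisting}, could look as follows. It retrieves
the axes object with \code{gca()} and assumes a \matlab{} version in
which graphics handles are objects (R2014b or newer); the data and the
chosen property values are made up for illustration.

\begin{lstlisting}[caption={Sketch: changing axes properties via the axes object.}]
x = 0:0.1:10;                % made-up example data
y = sin(x);
plot(x, y);
ax = gca();                  % get the current axes object
ax.XLim = [0, 10];           % limits of the x-axis
ax.FontSize = 12;            % font size of tick and axis labels
ax.Box = 'off';              % no box around the plot area
xlabel('Time [ms]');
ylabel('Voltage [mV]');
\end{lstlisting}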
|
||||
|
||||
@ -253,8 +252,8 @@ and the placement of the axes on the
|
||||
paper. Table\,\ref{plotfigureprops} lists commonly used
|
||||
properties. For a complete reference check the help. To change the
|
||||
figure's appearance, we need to change the properties of the figure
|
||||
object which can be retrieved during creation of the figure (\code{fig
|
||||
= figure();}) or by using the \code{gcf} (``get current figure'')
|
||||
object which can be retrieved during creation of the figure (\code[figure()]{fig
|
||||
= figure();}) or by using the \code{gcf()} (``get current figure'')
|
||||
command.
|
||||
|
||||
The script shown in the listing\,\ref{niceplotlisting} exemplifies
|
||||
@ -334,10 +333,10 @@ the last one defines the output format (box\,\ref{graphicsformatbox}).
|
||||
properties could be read and set using the functions
|
||||
\code[get()]{get} and \code[set()]{set}. The first argument these
|
||||
functions expect are valid figure or axis \emph{handles} which were
|
||||
returned by the \code{figure} and \code{plot} functions, or could be
|
||||
retrieved using \code[gcf()]{gcf} or \code[gca()]{gca} for the
|
||||
returned by the \code{figure()} and \code{plot()} functions, or could be
|
||||
retrieved using \code{gcf()} or \code{gca()} for the
|
||||
current figure or axis handle, respectively. Subsequent arguments
|
||||
passed to \code{set} are pairs of a property's name and the desired
|
||||
passed to \code{set()} are pairs of a property's name and the desired
|
||||
value.
|
||||
\begin{lstlisting}[caption={Using set to change figure and axis properties.}]
|
||||
frequency = 5; % frequency of the sine wave in Hz
|
||||
@ -351,8 +350,8 @@ the last one defines the output format (box\,\ref{graphicsformatbox}).
|
||||
set(figure_handle, 'PaperSize', [5.5, 5.5], 'PaperUnit', 'centimeters', ...
|
||||
'PaperPosition', [0, 0, 5.5, 5.5]);
|
||||
\end{lstlisting}
|
||||
With newer versions the handles returned by \varcode{gcf} and
|
||||
\varcode{gca} are ``objects'' and setting properties became much
|
||||
With newer versions the handles returned by \code{gcf()} and
|
||||
\code{gca()} are ``objects'' and setting properties became much
|
||||
easier as it is used throughout this chapter. For backward
|
||||
compatibility with older versions set and get still work in current
|
||||
versions of \matlab{}.
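
The two styles can be compared side by side in a short sketch. The
property values are arbitrary examples, and the object-based lines
assume a \matlab{} version in which handles are objects (R2014b or
newer).

\begin{lstlisting}[caption={Sketch: function-based versus object-based property access.}]
fig = figure();
ax = gca();

% function-based style, also works in older versions:
set(fig, 'Color', 'white');
set(ax, 'FontSize', 12);
fontsize = get(ax, 'FontSize');

% object-based style of newer versions, same effect:
fig.Color = 'white';
ax.FontSize = 12;
fontsize = ax.FontSize;
\end{lstlisting}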
|
||||
@ -371,7 +370,7 @@ For some types of plots we present examples in the following sections.
|
||||
\subsection{Scatter}
|
||||
|
||||
For displaying events or pairs of x-y coordinates the standard line
|
||||
plot is not optimal. Rather, we use \code[scatter()]{scatter} for this
|
||||
plot is not optimal. Rather, we use \code{scatter()} for this
|
||||
purpose. For example, we have a number of measurements of a system's
|
||||
response to a certain stimulus intensity. There is no dependency
|
||||
between the data points, drawing them with a line-plot would be
|
||||
@ -417,8 +416,8 @@ A very common scenario is to combine several plots in the same
|
||||
figure. To do this we create so-called subplots
|
||||
figures\,\ref{regularsubplotsfig},\,\ref{irregularsubplotsfig}. The
|
||||
\code[subplot()]{subplot()} command allows to place multiple axes onto
|
||||
a single sheet of paper. Generally, \varcode{subplot} expects three arguments
|
||||
defining the number of rows, columns, and the currently active
|
||||
a single sheet of paper. Generally, \code{subplot()} expects three
|
||||
arguments defining the number of rows, columns, and the currently active
|
||||
plot. The currently active plot number starts with 1 and goes up to
|
||||
$rows \cdot columns$ (numbers in the subplots in
|
||||
figures\,\ref{regularsubplotsfig}, \ref{irregularsubplotsfig}).
|
||||
@ -439,7 +438,7 @@ figures\,\ref{regularsubplotsfig}, \ref{irregularsubplotsfig}).
|
||||
By default, all subplots have the same size, if something else is
|
||||
desired, e.g.\ one subplot should span a whole row, while two others
|
||||
are smaller and should be placed side by side in the same row, the
|
||||
third argument of \varcode{subplot} can be a vector of numbers that
|
||||
third argument of \code{subplot()} can be a vector of numbers that
|
||||
should be joined. These have, of course, to be adjacent numbers
|
||||
(\figref{irregularsubplotsfig},
|
||||
listing\,\ref{irregularsubplotslisting}).
|
||||
@ -457,7 +456,7 @@ columns, need to be used in a plot. If you want to create something
|
||||
more elaborate, or have more spacing between the subplots one can
|
||||
create a grid with larger numbers of columns and rows, and specify the
|
||||
used cells of the grid by passing a vector as the third argument to
|
||||
\varcode{subplot}.
|
||||
\code{subplot()}.
|
||||
|
||||
\lstinputlisting[caption={Script for creating subplots of different
|
||||
sizes \figref{irregularsubplotsfig}.},
|
||||
@ -498,12 +497,12 @@ more apt. Accordingly, four arguments are needed (line 12 in listing
|
||||
\ref{errorbarlisting}). The first two arguments are the same, the next
|
||||
two represent the positive and negative deflections.
|
||||
|
||||
By default the \code{errorbar} function does not draw a marker. In the
|
||||
By default the \code{errorbar()} function does not draw a marker. In the
|
||||
examples shown here we provide extra arguments to define that a circle
|
||||
is used for that purpose. The line connecting the average values can
|
||||
be removed by passing additional arguments. The properties of the
|
||||
errorbars themselves (linestyle, linewidth, capsize, etc.) can be
|
||||
changed by taking the return argument of \code{errorbar} and changing
|
||||
changed by taking the return argument of \code{errorbar()} and changing
|
||||
its properties. See the \matlab{} help for more information.
|
||||
|
||||
\begin{figure}[ht]
|
||||
@ -530,18 +529,18 @@ areas instead of errorbars: In case you have a lot of data points with
|
||||
respective errorbars such that they would merge in the figure it is
|
||||
cleaner and probably easier to read and handle if one uses an error
|
||||
area instead. To achieve an illustration as shown in
|
||||
figure\,\ref{errorbarplot} C, we use the \code{fill} command in
|
||||
figure\,\ref{errorbarplot} C, we use the \code{fill()} command in
|
||||
combination with a standard line plot. The original purpose of
|
||||
\code{fill} is to draw a filled polygon. We hence have to provide it
|
||||
\code{fill()} is to draw a filled polygon. We hence have to provide it
|
||||
with the vertex points of the polygon. For each x-value we now have
|
||||
two y-values (average minus error and average plus error). Further, we
|
||||
want the vertices to be connected in a defined order. One can achieve
|
||||
this by going back and forth on the x-axis; we append a reversed
|
||||
version of the x-values to the original x-values using \code{cat} and
|
||||
\code{fliplr} for concatenation and inversion, respectively (line 3 in
|
||||
version of the x-values to the original x-values using \code{cat()} and
|
||||
\code{fliplr()} for concatenation and inversion, respectively (line 3 in
|
||||
listing \ref{errorbarlisting2}; Depending on the layout of your data
|
||||
you may need to concatenate along a different dimension of the data and
|
||||
use \code{flipud} instead). The y-coordinates of the polygon vertices
|
||||
use \code{flipud()} instead). The y-coordinates of the polygon vertices
|
||||
are concatenated in a similar way (line 4). In the example shown here
|
||||
we accept the polygon object that is returned by fill (variable p) and
|
||||
use it to change a few properties of the polygon. The \emph{FaceAlpha}
|
||||
@ -561,9 +560,9 @@ connecting the average values (line 12).
|
||||
The \code[text()]{text()} or \code[annotation()]{annotation()} are
|
||||
used for highlighting certain parts of a plot or simply adding an
|
||||
annotation that does not fit or does not belong into the legend.
|
||||
While \varcode{text} simply prints out the given text string at the
|
||||
While \code{text()} simply prints out the given text string at the
|
||||
defined position (for example line in
|
||||
listing\,\ref{regularsubplotlisting}) the \varcode{annotation}
|
||||
listing\,\ref{regularsubplotlisting}) the \code{annotation()}
|
||||
function allows adding some more advanced highlights like arrows,
|
||||
lines, ellipses, or rectangles. Figure\,\ref{annotationsplot} shows
|
||||
some examples, the respective code can be found in
|
||||
@ -583,9 +582,9 @@ listing\,\ref{annotationsplotlisting}. For more options consult the
|
||||
|
||||
\begin{important}[Positions in data or figure coordinates.]
|
||||
A very confusing pitfall is the different coordinate systems used
|
||||
by \varcode{text} and \varcode{annotation}. While \varcode{text}
|
||||
by \varcode{text()} and \varcode{annotation()}. While \varcode{text()}
|
||||
expects the positions to be in data coordinates, i.e.\,in the limits
|
||||
of the x- and y-axis, \varcode{annotation} requires the positions to
|
||||
of the x- and y-axis, \varcode{annotation()} requires the positions to
|
||||
be given in normalized figure coordinates. Normalized means that the
|
||||
width and height of the figure are expressed by numbers in the range
|
||||
0 to 1. The bottom/left corner then has the coordinates $(0,0)$ and
|
||||
@ -624,9 +623,9 @@ Lissajous figure. The basic steps are:
|
||||
is created and opened for writing. This also implies that it has to
|
||||
be closed after the whole process (line 31).
|
||||
\item For each frame of the video, we plot the appropriate data (we
|
||||
use \code[scatter]{scatter} for this purpose, line 20) and ``grab''
|
||||
use \code{scatter()} for this purpose, line 20) and ``grab''
|
||||
the frame (line 28). Grabbing is similar to making a screenshot of
|
||||
the figure. The \code{drawnow}{drawnow} command (line 27) is used to
|
||||
the figure. The \code{drawnow()} command (line 27) is used to
|
||||
stop the execution of the for loop until the drawing process is
|
||||
finished.
|
||||
\item Write the frame to file (line 29).
|
||||
|
@ -73,10 +73,10 @@ number of observed events within a certain time window $n_i$
|
||||
(\figref{pointprocessscetchfig}).
|
||||
|
||||
\begin{exercise}{rasterplot.m}{}
|
||||
Implement a function \code{rasterplot()} that displays the times of
|
||||
action potentials within the first \code{tmax} seconds in a raster
|
||||
Implement a function \varcode{rasterplot()} that displays the times of
|
||||
action potentials within the first \varcode{tmax} seconds in a raster
|
||||
plot. The spike times (in seconds) recorded in the individual trials
|
||||
are stored as vectors of times within a \codeterm{cell-array}.
|
||||
are stored as vectors of times within a \codeterm{cell array}.
|
||||
\end{exercise}
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
@ -95,10 +95,10 @@ describing the statistics of stochastic real-valued variables:
|
||||
\end{figure}
|
||||
|
||||
\begin{exercise}{isis.m}{}
|
||||
Implement a function \code{isis()} that calculates the interspike
|
||||
Implement a function \varcode{isis()} that calculates the interspike
|
||||
intervals from several spike trains. The function should return a
|
||||
single vector of intervals. The spike times (in seconds) of each
|
||||
trial are stored as vectors within a \codeterm{cell-array}.
|
||||
trial are stored as vectors within a cell-array.
|
||||
\end{exercise}
|
||||
|
||||
%\subsection{First order interval statistics}
|
||||
@ -117,7 +117,7 @@ describing the statistics of stochastic real-valued variables:
|
||||
\end{itemize}
|
||||
|
||||
\begin{exercise}{isihist.m}{}
|
||||
Implement a function \code{isiHist()} that calculates the normalized
|
||||
Implement a function \varcode{isiHist()} that calculates the normalized
|
||||
interspike interval histogram. The function should take two input
|
||||
arguments; (i) a vector of interspike intervals and (ii) the width
|
||||
of the bins used for the histogram. It further returns the
|
||||
@ -126,7 +126,7 @@ describing the statistics of stochastic real-valued variables:
|
||||
|
||||
\begin{exercise}{plotisihist.m}{}
|
||||
Implement a function that takes the return values of
|
||||
\code{isiHist()} as input arguments and then plots the data. The
|
||||
\varcode{isiHist()} as input arguments and then plots the data. The
|
||||
plot should show the histogram with the x-axis scaled to
|
||||
milliseconds and should be annotated with the average ISI, the
|
||||
standard deviation and the coefficient of variation.
|
||||
@ -167,7 +167,7 @@ $\rho_k$ is usually plotted against the lag $k$
|
||||
with itself and is always 1.
|
||||
|
||||
\begin{exercise}{isiserialcorr.m}{}
|
||||
Implement a function \code{isiserialcorr()} that takes a vector of
|
||||
Implement a function \varcode{isiserialcorr()} that takes a vector of
|
||||
interspike intervals as input argument and calculates the serial
|
||||
correlation. The function should further plot the serial
|
||||
correlation. \pagebreak[4]
|
||||
@ -213,12 +213,12 @@ time interval , \determ{Feuerrate}) that is given in Hertz
|
||||
% \end{figure}
|
||||
|
||||
\begin{exercise}{counthist.m}{}
|
||||
Implement a function \code{counthist()} that calculates and plots
|
||||
Implement a function \varcode{counthist()} that calculates and plots
|
||||
the distribution of spike counts observed in a certain time
|
||||
window. The function should take two input arguments: (i) a
|
||||
\codeterm{cell-array} of vectors containing the spike times in
|
||||
seconds observed in a number of trials, and (ii) the duration of the
|
||||
time window that is used to evaluate the counts.\pagebreak[4]
|
||||
cell-array of vectors containing the spike times in seconds observed
|
||||
in a number of trials, and (ii) the duration of the time window that
|
||||
is used to evaluate the counts.\pagebreak[4]
|
||||
\end{exercise}
|
||||
|
||||
|
||||
@ -244,7 +244,7 @@ In an \enterm[Poisson process!inhomogeneous]{inhomogeneous Poisson
|
||||
\lambda(t)$.
|
||||
|
||||
\begin{exercise}{poissonspikes.m}{}
|
||||
Implement a function \code{poissonspikes()} that uses a homogeneous
|
||||
Implement a function \varcode{poissonspikes()} that uses a homogeneous
|
||||
Poisson process to generate events at a given rate for a certain
|
||||
duration and a number of trials. The rate should be given in Hertz
|
||||
and the duration of the trials is given in seconds. The function
|
||||
@ -293,7 +293,7 @@ The homogeneous Poisson process has the following properties:
|
||||
\end{itemize}
|
||||
|
||||
\begin{exercise}{hompoissonspikes.m}{}
|
||||
Implement a function \code{hompoissonspikes()} that uses a
|
||||
Implement a function \varcode{hompoissonspikes()} that uses a
|
||||
homogeneous Poisson process to generate spike events at a given rate
|
||||
for a certain duration and a number of trials. The rate should be
|
||||
given in Hertz and the duration of the trials is given in
|
||||
@ -422,7 +422,7 @@ potentials (\figref{binpsthfig} top). The resulting histogram is then
|
||||
normalized with the bin width $W$ to yield the firing rate shown in
|
||||
the bottom trace of figure \ref{binpsthfig}. The above sketched
|
||||
process is equivalent to estimating the probability density. It is
|
||||
possible to estimate the PSTH using the \code{hist()} method
|
||||
possible to estimate the PSTH using the \code{hist()} function
|
||||
\sindex[term]{Feuerrate!Binningmethode}
|
||||
|
||||
The estimated firing rate is valid for the total duration of each
|
||||
|
@ -112,7 +112,7 @@ x y z
|
||||
|
||||
\begin{important}[Naming conventions]
|
||||
There are a few rules regarding variable names. \matlab{} is
|
||||
case-sensitive, i.e. \code{x} and \code{X} are two different
|
||||
case-sensitive, i.e. \varcode{x} and \varcode{X} are two different
|
||||
names. Names must begin with an alphabetic character. German (or
|
||||
other) umlauts, special characters and spaces are forbidden in
|
||||
variable names.
|
||||
@ -689,9 +689,11 @@ then compare it to the elements on each page, and so on. An
|
||||
alternative way is to make use of the so called \emph{linear indexing}
|
||||
in which each element of the matrix is addressed by a single
|
||||
number. The linear index thus ranges from 1 to
|
||||
\code{numel(matrix)}. The linear index increases first along the 1st,
|
||||
2nd, 3rd etc. dimension (figure~\ref{matrixlinearindexingfig}). It is
|
||||
not as intuitive since one would need to know the shape of the matrix and perform a remapping, but can be really helpful
|
||||
\code[numel()]{numel(matrix)}. The linear index increases first along
|
||||
the 1st, 2nd, 3rd etc. dimension
|
||||
(figure~\ref{matrixlinearindexingfig}). It is not as intuitive since
|
||||
one would need to know the shape of the matrix and perform a
|
||||
remapping, but can be really helpful
|
||||
(listing~\ref{matrixLinearIndexing}).
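
The remapping between subscripts and the linear index can be sketched
with the built-in functions \code{sub2ind()} and \code{ind2sub()}. The
matrix shape and the variable names below are chosen for illustration
only and are not taken from listing~\ref{matrixLinearIndexing}.

\begin{lstlisting}[caption={Sketch: converting between subscripts and linear indices.}]
m = reshape(1:24, 4, 3, 2);         % 4 rows, 3 columns, 2 pages
m(2, 3, 2)                          % subscript indexing, yields 22
idx = sub2ind(size(m), 2, 3, 2);    % the corresponding linear index, 22
m(idx)                              % linear indexing, yields 22 again
[r, c, p] = ind2sub(size(m), idx);  % back to subscripts: r = 2, c = 3, p = 2
\end{lstlisting}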
|
||||
|
||||
|
||||
@ -882,10 +884,11 @@ table~\ref{logicaloperators}) which are introduced in the following
|
||||
sections.
|
||||
|
||||
\subsection{Relational operators}
|
||||
With \codeterm[Operator!relational]{relational operators} (table~\ref{relationaloperators})
|
||||
we can ask questions such as: ``Is the value of variable \code{a}
|
||||
larger than the value of \code{b}?'' or ``Is the value in \code{a}
|
||||
equal to the one stored in variable \code{b}?''.
|
||||
With \codeterm[Operator!relational]{relational operators}
|
||||
(table~\ref{relationaloperators}) we can ask questions such as: ``Is
|
||||
the value of variable \varcode{a} larger than the value of
|
||||
\varcode{b}?'' or ``Is the value in \varcode{a} equal to the one
|
||||
stored in variable \varcode{b}?''.
|
||||
|
||||
\begin{table}[h!]
|
||||
\titlecaption{\label{relationaloperators}
|
||||
@ -930,13 +933,13 @@ Testing the relations between numbers and scalar variables is straight
|
||||
forward. When comparing vectors, the relational operator will be
|
||||
applied element-wise and compare the respective elements of the
|
||||
left-hand-side and right-hand-side vectors. Note: vectors must have
|
||||
the same length and orientation. The result of \code{[2 0 0 5 0] == [1
|
||||
the same length and orientation. The result of \varcode{[2 0 0 5 0] == [1
|
||||
0 3 2 0]'} in which the second vector is transposed to give a
|
||||
column vector is a matrix!
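
Both cases can be tried out with the following sketch. It assumes a
\matlab{} release that supports implicit expansion (R2016b or later);
older releases report a dimension mismatch for the second comparison
instead of returning a matrix. The variable names are chosen for
illustration only.

\begin{lstlisting}[caption={Sketch: comparing vectors of equal and of different orientation.}]
a = [2 0 0 5 0];
b = [1 0 3 2 0];
a == b          % element-wise comparison, yields [0 1 0 0 1]
c = (a == b');  % row vector compared with a column vector
size(c)         % yields [5 5], each element of a is compared with each element of b
\end{lstlisting}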
|
||||
|
||||
\subsection{Logical operators}
|
||||
With the relational operators we could for example test whether a
|
||||
number is greater than a certain threshold (\code{x > 0.25}). But what
|
||||
number is greater than a certain threshold (\varcode{x > 0.25}). But what
|
||||
if we wanted to check whether the number falls into the range greater
|
||||
than 0.25 but less than 0.75? Numbers that fall into this range must
|
||||
satisfy the one and the other condition. With
|
||||
@ -1068,13 +1071,13 @@ values stored in a vector or matrix. It is very powerful and, once
|
||||
understood, very intuitive.
|
||||
|
||||
The basic concept is that applying a Boolean operation on a vector
|
||||
results in a \code{logical} vector of the same size (see
|
||||
results in a \codeterm{logical} vector of the same size (see
|
||||
listing~\ref{logicaldatatype}). This logical vector is then used to
|
||||
select only those values for which the logical vector is true. Line 14
|
||||
in listing~\ref{logicalindexing1} can be read: ``Select all those
|
||||
elements of \varcode{x} where the Boolean expression \varcode{x < 0}
|
||||
evaluates to true and store the result in the variable
|
||||
\emph{x\_smaller\_zero}''.
|
||||
\varcode{x\_smaller\_zero}''.
|
||||
|
||||
\begin{lstlisting}[caption={Logical indexing.}, label=logicalindexing1]
|
||||
>> x = randn(1, 6) % a vector with 6 random numbers
|
||||
@ -1154,13 +1157,14 @@ segment of data of a certain time span (the stimulus was on,
|
||||
data and metadata in a single variable.
|
||||
|
||||
\textbf{Cell arrays} Arrays of variables that contain different
|
||||
types. Unlike structures, the entries of a \codeterm{Cell array} are
|
||||
not named. Indexing in \codeterm{Cell arrays} requires a special
|
||||
operator, the \code{\{\}}. \matlab{} uses \codeterm{Cell arrays} for
|
||||
example when strings of different lengths should be stored in the
|
||||
same variable: \varcode{months = \{'January', 'February', 'March',
|
||||
'April', 'May', 'June'\};}. Note the curly braces that are used to
|
||||
create the array and are also used for indexing.
|
||||
types. Unlike structures, the entries of a \codeterm{cell array} are
|
||||
not named. Indexing in \codeterm[cell array]{cell arrays} requires a
|
||||
special operator, the \code{\{\}}. \matlab{} uses \codeterm[cell
|
||||
array]{cell arrays} for example when strings of different lengths
|
||||
should be stored in the same variable: \varcode{months = \{'January',
|
||||
'February', 'March', 'April', 'May', 'June'\};}. Note the curly
|
||||
braces that are used to create the array and are also used for
|
||||
indexing.
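
The difference between the two kinds of brackets can be sketched as
follows. The shortened list of months and the variable names
\varcode{first} and \varcode{part} are chosen for illustration only.

\begin{lstlisting}[caption={Sketch: indexing a cell array with curly braces and with parentheses.}]
months = {'January', 'February', 'March'};
first = months{1};    % curly braces return the content, here the char vector 'January'
part = months(1:2);   % parentheses return a cell array with the first two cells
class(first)          % yields 'char'
class(part)           % yields 'cell'
length(months{2})     % number of characters in 'February', yields 8
\end{lstlisting}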
|
||||
|
||||
\textbf{Tables} Tabular structure that allows to have columns of
|
||||
varying type combined with a header (much like a spreadsheet).
|
||||
@ -1170,8 +1174,8 @@ segment of data of a certain time span (the stimulus was on,
|
||||
irregular intervals together with the measurement time in a single
|
||||
variable. Without the \codeterm{Timetable} data type at least two
|
||||
variables (one storing the time, the other the measurement) would be
|
||||
required. \codeterm{Timetables} offer specific convenience functions
|
||||
to work with timestamps.
|
||||
required. \codeterm[Timetable]{Timetables} offer specific
|
||||
convenience functions to work with timestamps.
|
||||
|
||||
\textbf{Maps} In a \codeterm{map} a \codeterm{value} is associated
|
||||
with an arbitrary \codeterm{key}. The \codeterm{key} is not
|
||||
@ -1246,7 +1250,7 @@ All imperative programming languages offer a solution: the loop. It is
|
||||
used whenever the same commands have to be repeated.
|
||||
|
||||
|
||||
\subsubsection{The \code{for} --- loop}
|
||||
\subsubsection{The \varcode{for} --- loop}
|
||||
The most common type of loop is the \codeterm{for-loop}. It
|
||||
consists of a \codeterm[Loop!head]{head} and the
|
||||
\codeterm[Loop!body]{body}. The head defines how often the code in the
|
||||
@ -1258,7 +1262,7 @@ next value of this vector. In the body of the loop any code can be
|
||||
executed which may or may not use the running variable for a certain
|
||||
purpose. The \code{for} loop is closed with the keyword
|
||||
\code{end}. Listing~\ref{looplisting} shows a simple version of such a
|
||||
\code{for} loop.
|
||||
\codeterm{for-loop}.
|
||||
|
||||
\begin{lstlisting}[caption={Example of a \varcode{for}-loop.}, label=looplisting]
|
||||
>> for x = 1:3 % head
|
||||
@ -1273,15 +1277,15 @@ purpose. The \code{for} loop is closed with the keyword
|
||||
|
||||
|
||||
\begin{exercise}{factorialLoop.m}{factorialLoop.out}
|
||||
Can we solve the factorial with a for-loop? Implement a for loop that
|
||||
calculates the factorial of a number \varcode{n}.
|
||||
Can we solve the factorial with a \varcode{for}-loop? Implement a
|
||||
for loop that calculates the factorial of a number \varcode{n}.
|
||||
\end{exercise}
|
||||
|
||||
|
||||
\subsubsection{The \varcode{while} --- loop}
|
||||
|
||||
The \code{while}--loop is the second type of loop that is available in
|
||||
almost all programming languages. Unlike the \code{for} -- loop,
|
||||
The \codeterm{while-loop} is the second type of loop that is available in
|
||||
almost all programming languages. Unlike the \codeterm{for-loop},
|
||||
which iterates with the running variable over a vector, the while loop
|
||||
uses a Boolean expression to determine when to execute the code in
|
||||
its body. The head of the loop starts with the keyword \code{while}
|
||||
@ -1289,22 +1293,22 @@ that is followed by a Boolean expression. If this can be evaluated to
|
||||
true, the code in the body is executed. The loop is closed with an
|
||||
\code{end}.
|
||||
|
||||
\begin{lstlisting}[caption={Basic structure of a \code{while} loop.}, label=whileloop]
|
||||
\begin{lstlisting}[caption={Basic structure of a \varcode{while} loop.}, label=whileloop]
|
||||
while x == true % head with a Boolean expression
|
||||
% execute this code if the expression yields true
|
||||
end
|
||||
\end{lstlisting}
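
A concrete loop that terminates on its own might look like the
following sketch. The task (halving a number until it is no longer
larger than one) and the variable names \varcode{x} and \varcode{n}
are chosen for illustration only.

\begin{lstlisting}[caption={Sketch: a \varcode{while} loop that stops once its condition is no longer true.}]
x = 100;
n = 0;           % counts how often the body was executed
while x > 1      % head: repeat as long as x is larger than 1
    x = x / 2;   % body: halve the value
    n = n + 1;
end
disp(n)          % displays 7
\end{lstlisting}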
|
||||
|
||||
\begin{exercise}{factorialWhileLoop.m}{}
|
||||
Implement the factorial of a number \varcode{n} using a \code{while}
|
||||
-- loop.
|
||||
Implement the factorial of a number \varcode{n} using a \varcode{while}-loop.
|
||||
\end{exercise}
|
||||
|
||||
|
||||
\begin{exercise}{neverendingWhile.m}{}
|
||||
Implement a \code{while}--loop that is never-ending. Hint: the body
|
||||
is executed as long as the Boolean expression in the head is
|
||||
\code{true}. You can escape the loop by pressing \keycode{Ctrl+C}.
|
||||
Implement a \varcode{while}-loop that is never-ending. Hint: the
|
||||
body is executed as long as the Boolean expression in the head is
|
||||
\varcode{true}. You can escape the loop by pressing
|
||||
\keycode{Ctrl+C}.
|
||||
\end{exercise}
|
||||
|
||||
|
||||
@ -1312,15 +1316,15 @@ end
|
||||
|
||||
\begin{itemize}
|
||||
\item Both execute the code in the body iteratively.
|
||||
\item When using a \code{for} -- loop the body of the loop is executed
|
||||
\item When using a \code{for}-loop the body of the loop is executed
|
||||
at least once (except when the vector used in the head is empty).
|
||||
\item In a \code{while} -- loop, the body is not necessarily
|
||||
\item In a \code{while}-loop, the body is not necessarily
|
||||
executed. It is entered only if the Boolean expression in the head
|
||||
yields true.
|
||||
\item The \code{for} -- loop is best suited for cases in which the
|
||||
\item The \code{for}-loop is best suited for cases in which the
|
||||
elements of a vector have to be used for a computation or when the
|
||||
number of iterations is known.
|
||||
\item The \code{while} -- loop is best suited for cases when it is not
|
||||
\item The \code{while}-loop is best suited for cases when it is not
|
||||
known in advance how often a certain piece of code has to be
|
||||
executed.
|
||||
\item Any problem that can be solved with one type can also be solved
|
||||
@ -1336,8 +1340,8 @@ is only executed under a certain condition.
|
||||
\subsubsection{The \varcode{if} -- statement}
|
||||
|
||||
The most prominent representative of the conditional expressions is
|
||||
the \code{if} statement (sometimes also called \code{if - else}
|
||||
statement). It constitutes a kind of branching point. It allows to
|
||||
the \codeterm{if statement} (sometimes also called \codeterm{if - else
|
||||
statement}). It constitutes a kind of branching point. It allows to
|
||||
control which branch of the code is executed.
|
||||
|
||||
Again, the statement consists of the head and the body. The head
|
||||
@ -1346,11 +1350,11 @@ that controls whether or not the body is entered. Optionally, the body
|
||||
can be either ended by the \code{end} keyword or followed by
|
||||
additional statements \code{elseif}, which allows to add another
|
||||
Boolean expression and to catch another condition or the \code{else}
|
||||
to provide a default case. The last body of the \code{if - elseif -
|
||||
to provide a default case. The last body of the \varcode{if - elseif -
|
||||
else} statement has to be finished with the \code{end}
|
||||
(listing~\ref{ifelselisting}).
|
||||
|
||||
\begin{lstlisting}[label=ifelselisting, caption={Structure of an \code{if} statement.}]
|
||||
\begin{lstlisting}[label=ifelselisting, caption={Structure of an \varcode{if} statement.}]
|
||||
if x < y % head
|
||||
% body I, executed only if x < y
|
||||
elseif x > y
|
||||
@ -1361,7 +1365,7 @@ end
|
||||
\end{lstlisting}
|
||||
|
||||
\begin{exercise}{ifelse.m}{}
|
||||
Draw a random number and check with an appropriate \code{if}
|
||||
Draw a random number and check with an appropriate \varcode{if}
|
||||
statement whether it is
|
||||
\begin{enumerate}
|
||||
\item less than 0.5.
|
||||
@ -1373,9 +1377,9 @@ end
|
||||
|
||||
\subsubsection{The \varcode{switch} -- statement}
|
||||
|
||||
The \code{switch} statement is used whenever a set of conditions
|
||||
The \codeterm{switch statement} is used whenever a set of conditions
|
||||
requires separate treatment. The statement is initialized with the
|
||||
\code{switch} keyword that is followed by \emph{switch expression} (a
|
||||
\code{switch} keyword that is followed by a \emph{switch expression} (a
|
||||
number or string). It is followed by a set of \emph{case expressions}
|
||||
which start with the keyword \code{case} followed by the condition
|
||||
that defines against which the \emph{switch expression} is tested. It
|
||||
@ -1412,7 +1416,7 @@ end
|
||||
\end{itemize}
|
||||
|
||||
|
||||
\subsection{The keywords \code{break} and \code{continue}}
|
||||
\subsection{The keywords \varcode{break} and \varcode{continue}}
|
||||
|
||||
Whenever the execution of a loop should be ended or if you want to
|
||||
skip the execution of the body under certain circumstances, one can
|
||||
@ -1458,7 +1462,7 @@ end
|
||||
has passed between the calls of \code{tic} and \code{toc}.
|
||||
|
||||
\begin{enumerate}
|
||||
\item Use a \code{for} loop to select matching values.
|
||||
\item Use a \varcode{for} loop to select matching values.
|
||||
\item Use logical indexing.
|
||||
\end{enumerate}
|
||||
\end{exercise}
|
||||
@ -1486,12 +1490,12 @@ and executed line-by-line from top to bottom.
|
||||
|
||||
\matlab{} knows three types of programs:
|
||||
\begin{enumerate}
|
||||
\item \codeterm[Script]{Scripts}
|
||||
\item \codeterm[Function]{Functions}
|
||||
\item \codeterm[Object]{Objects} (not covered here)
|
||||
\item \entermde[script]{Skripte}{Scripts}
|
||||
\item \entermde[function]{Funktion}{Functions}
|
||||
\item \entermde[Object]{Objekte}{Objects} (not covered here)
|
||||
\end{enumerate}
|
||||
|
||||
Programs are stored in so called \codeterm{m-files}
|
||||
Programs are stored in so called \codeterm[m-file]{m-files}
|
||||
(e.g. \file{myProgram.m}). To use them they have to be \emph{called}
|
||||
from the command line or from within another program. Storing your code in
|
||||
programs increases the re-usability. So far we have used
|
||||
@ -1507,13 +1511,13 @@ and if it now wants to read the previously stored variable, it will
|
||||
contain a different value than expected. Bugs like this are hard to
|
||||
find since each of the programs alone is perfectly fine and works as
|
||||
intended. A solution for this problem are the
|
||||
\codeterm[Function]{functions}.
|
||||
\entermde[function]{Funktion}{functions}.
|
||||
|
||||
\subsection{Functions}
|
||||
|
||||
Functions in \matlab{} are similar to mathematical functions
|
||||
\[ y = f(x) \] Here, the mathematical function has the name $f$ and it
|
||||
has one \codeterm{argument} $x$ that is transformed into the
|
||||
has one \entermde{Argument}{argument} $x$ that is transformed into the
|
||||
function's output value $y$. In \matlab{} the syntax of a function
|
||||
declaration is very similar (listing~\ref{functiondefinitionlisting}).
|
||||
|
||||
@ -1524,12 +1528,12 @@ function [y] = functionName(arg_1, arg_2)
|
||||
\end{lstlisting}
|
||||
|
||||
The keyword \code{function} is followed by the return value(s) (it can
|
||||
be a list \code{[]} of values), the function name and the
|
||||
be a list \varcode{[]} of values), the function name and the
|
||||
argument(s). The function head is then followed by the function's
|
||||
body. A function is ended by an \code{end} (this is in fact optional
|
||||
but we will stick to this). Each function that should be directly used
|
||||
by the user (or called from other programs) should reside in an
|
||||
individual \code{m-file} that has the same name as the function. By
|
||||
individual \codeterm{m-file} that has the same name as the function. By
|
||||
using functions instead of scripts we gain several advantages:
|
||||
\begin{itemize}
|
||||
\item Encapsulation of program code that solves a certain task. It can
|
||||
@ -1566,10 +1570,9 @@ function myFirstFunction() % function head
|
||||
end
|
||||
\end{lstlisting}
|
||||
|
||||
\code{myFirstFunction} (listing~\ref{badsinewavelisting}) is a
|
||||
\varcode{myFirstFunction} (listing~\ref{badsinewavelisting}) is a
|
||||
prime-example of a bad function. There are several issues with its
|
||||
design:
|
||||
|
||||
\begin{itemize}
|
||||
\item The function's name does not tell anything about its purpose.
|
||||
\item The function is made for exactly one use-case (frequency of
|
||||
@ -1594,7 +1597,7 @@ defined:
|
||||
(e.g. the user of another program that calls a function)?
|
||||
\end{enumerate}
|
||||
|
||||
As indicated above the \code{myFirstFunction} does three things at
|
||||
As indicated above the \varcode{myFirstFunction} does three things at
|
||||
once; it seems natural that the task should be split up into three
|
||||
parts. (i) Calculation of the individual sine waves defined by the
|
||||
frequency and the amplitudes (ii) graphical display of the data and
|
||||
@ -1607,17 +1610,18 @@ define (i) how to name the function, (ii) which information it needs
|
||||
(arguments), and (iii) what it should return to the caller.
|
||||
|
||||
\begin{enumerate}
|
||||
\item \codeterm[Function!Name]{Name}: the name should be descriptive
|
||||
\item \entermde[function!name]{Funktion!-sname}{Name}: the name should be descriptive
|
||||
of the function's purpose, i.e. the calculation of a sine wave. An
|
||||
appropriate name might be \code{sinewave()}.
|
||||
\item \codeterm[Function!Arguments]{Arguments}: What information does
|
||||
the function need to do the calculation? There are obviously the
|
||||
frequency as well as the amplitude. Further we may want to be able
|
||||
to define the duration of the sine wave and the temporal
|
||||
resolution. We thus need four arguments which should also be named to
|
||||
describe their content: \code{amplitude, frequency, t\_max,} and
|
||||
\code{t\_step} might be good names.
|
||||
\item \codeterm[Function!Return values]{Return values}: For a correct
|
||||
appropriate name might be \varcode{sinewave()}.
|
||||
\item \entermde[function!arguments]{Funktion!-sargument}{Arguments}:
|
||||
What information does the function need to do the calculation? There
|
||||
are obviously the frequency as well as the amplitude. Further we may
|
||||
want to be able to define the duration of the sine wave and the
|
||||
temporal resolution. We thus need four arguments which should also be
|
||||
named to describe their content: \varcode{amplitude},
|
||||
\varcode{frequency}, \varcode{t\_max}, and \varcode{t\_step} might
|
||||
be good names.
|
||||
\item \entermde[function!return values]{Funktion!R\"uckgabewerte}{Return values}: For a correct
|
||||
display of the data we need two vectors. The time, and the sine wave
|
||||
itself. We just need two return values: \varcode{time}, \varcode{sine}
|
||||
\end{enumerate}
|
||||
@ -1657,11 +1661,11 @@ specification of the function:
|
||||
|
||||
\begin{enumerate}
|
||||
\item It should plot a single sine wave. But it is not limited to sine
|
||||
waves. Its name is thus: \code{plotFunction()}.
|
||||
waves. Its name is thus: \varcode{plotFunction()}.
|
||||
\item What information does it need to solve the task? The
|
||||
data to be plotted, that is, the values \code{y\_data} and the
|
||||
corresponding \code{x\_data}. As we want to plot series of sine
|
||||
waves we might want to have a \code{name} for each function to be
|
||||
data to be plotted, that is, the values \varcode{y\_data} and the
|
||||
corresponding \varcode{x\_data}. As we want to plot series of sine
|
||||
waves we might want to have a \varcode{name} for each function to be
|
||||
displayed in the figure legend.
|
||||
\item Are there any return values? No, this function is just made for
|
||||
plotting, we do not need to return anything.
|
||||
@ -1699,11 +1703,11 @@ Again, we need to specify what needs to be done:
|
||||
appropriate name for the script (that is the name of the m-file)
|
||||
might be \file{plotMultipleSinewaves.m}.
|
||||
\item What information do we need? We need to define the
|
||||
\code{frequency}, the range of \code{amplitudes}, the
|
||||
\code{duration} of the sine waves, and the temporal resolution given
|
||||
as the time between two points in time, i.e. the \code{stepsize}.
|
||||
\varcode{frequency}, the range of \varcode{amplitudes}, the
|
||||
\varcode{duration} of the sine waves, and the temporal resolution given
|
||||
as the time between two points in time, i.e. the \varcode{stepsize}.
|
||||
\item We then need to create an empty figure, and work through the
|
||||
range of \code{amplitudes}. We must not forget to switch \code{hold
|
||||
range of \varcode{amplitudes}. We must not forget to switch \varcode{hold
|
||||
on} if we want to see all the sine waves in one plot.
|
||||
\end{enumerate}
|
||||
|
||||
|
@ -33,7 +33,7 @@ fitting approaches. We will apply this method to find the combination
|
||||
of slope and intercept that best describes the system.
|
||||
|
||||
|
||||
\section{The error function --- mean square error}
|
||||
\section{The error function --- mean squared error}
|
||||
|
||||
Before the optimization can be done we need to specify what is
|
||||
considered an optimal fit. In our example we search the parameter
|
||||
@ -57,25 +57,23 @@ $\sum_{i=1}^N |y_i - y^{est}_i|$. The total error can only be small if
|
||||
all deviations are indeed small no matter if they are above or below
|
||||
the predicted line. Instead of the sum we could also ask for the
|
||||
\emph{average}
|
||||
|
||||
\begin{equation}
|
||||
\label{meanabserror}
|
||||
f_{dist}(\{(x_i, y_i)\}|\{y^{est}_i\}) = \frac{1}{N} \sum_{i=1}^N |y_i - y^{est}_i|
|
||||
\end{equation}
|
||||
should be small. Commonly, the \enterm{mean squared distance} or
|
||||
\enterm{mean squared error}
|
||||
\enterm[square error!mean]{mean square error} (\determ[quadratischer Fehler!mittlerer]{mittlerer quadratischer Fehler})
|
||||
\begin{equation}
|
||||
\label{meansquarederror}
|
||||
f_{mse}(\{(x_i, y_i)\}|\{y^{est}_i\}) = \frac{1}{N} \sum_{i=1}^N (y_i - y^{est}_i)^2
|
||||
\end{equation}
|
||||
|
||||
is used (\figref{leastsquareerrorfig}). Similar to the absolute
|
||||
distance, the square of the error ($(y_i - y_i^{est})^2$) is never
|
||||
negative, so error values do not cancel out. The square further punishes
|
||||
large deviations.
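
As a small worked example with made-up numbers, assume $N=3$
observations $y = (1, 2, 3)$ and predictions $y^{est} = (1.5, 2, 2)$.
The deviations $y_i - y^{est}_i$ are $(-0.5, 0, 1)$, and the mean
absolute error (equation~\ref{meanabserror}) and the mean squared
error (equation~\ref{meansquarederror}) evaluate to
\[ f_{dist} = \frac{1}{3}\left(0.5 + 0 + 1\right) = 0.5 \quad \mbox{and} \quad
   f_{mse} = \frac{1}{3}\left(0.25 + 0 + 1\right) \approx 0.42 \; . \]
The largest deviation contributes two thirds of the mean absolute
error but 80\,\% of the mean squared error.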
|
||||
|
||||
\begin{exercise}{meanSquareError.m}{}\label{mseexercise}%
|
||||
Implement a function \code{meanSquareError()}, that calculates the
|
||||
Implement a function \varcode{meanSquareError()}, that calculates the
|
||||
\emph{mean square distance} between a vector of observations ($y$)
|
||||
and respective predictions ($y^{est}$).
|
||||
\end{exercise}
|
||||
@ -84,18 +82,19 @@ large deviations.
|
||||
\section{\tr{Objective function}{Zielfunktion}}
|
||||
|
||||
$f_{cost}(\{(x_i, y_i)\}|\{y^{est}_i\})$ is a so called
|
||||
\enterm{objective function} or \enterm{cost function}. We aim to adapt
|
||||
the model parameters to minimize the error (mean square error) and
|
||||
thus the \emph{objective function}. In Chapter~\ref{maximumlikelihoodchapter}
|
||||
we will show that the minimization of the mean square error is
|
||||
equivalent to maximizing the likelihood that the observations
|
||||
originate from the model (assuming a normal distribution of the data
|
||||
around the model prediction).
|
||||
\enterm{objective function} or \enterm{cost function}
|
||||
(\determ{Kostenfunktion}). We aim to adapt the model parameters to
|
||||
minimize the error (mean square error) and thus the \emph{objective
|
||||
function}. In Chapter~\ref{maximumlikelihoodchapter} we will show
|
||||
that the minimization of the mean square error is equivalent to
|
||||
maximizing the likelihood that the observations originate from the
|
||||
model (assuming a normal distribution of the data around the model
|
||||
prediction).
|
||||
|
||||
\begin{figure}[t]
|
||||
\includegraphics[width=1\textwidth]{linear_least_squares}
|
||||
\titlecaption{Estimating the \emph{mean square error}.} {The
|
||||
deviation (\enterm{error}, orange) between the prediction (red
|
||||
deviation error, orange) between the prediction (red
|
||||
line) and the observations (blue dots) is calculated for each data
|
||||
point (left). Then the deviations are squared and the average is
|
||||
calculated (right).}
|
||||
@ -119,11 +118,13 @@ Replacing $y^{est}$ with the linear equation (the model) in
|
||||
|
||||
That is, the mean square error depends on the pairs $(x_i, y_i)$ and the
|
||||
parameters $m$ and $b$ of the linear equation. The optimization
|
||||
process will not try to optimize $m$ and $b$ to lead to the smallest
|
||||
error, the method of the \enterm{least square error}.
|
||||
process tries to optimize $m$ and $b$ such that the error is
|
||||
minimized, the method of the \enterm[square error!least]{least square
|
||||
error} (\determ[quadratischer Fehler!kleinster]{Methode der
|
||||
kleinsten Quadrate}).
|
||||
|
||||
\begin{exercise}{lsqError.m}{}
|
||||
Implement the objective function \code{lsqError()} that applies the
|
||||
Implement the objective function \varcode{lsqError()} that applies the
|
||||
linear equation as a model.
|
||||
\begin{itemize}
|
||||
\item The function takes three arguments. The first is a 2-element
|
||||
@ -131,7 +132,7 @@ error, the method of the \enterm{least square error}.
|
||||
\varcode{b}. The second is a vector of $x$-values, the third contains
|
||||
the measurements for each value of $x$, the respective $y$-values.
|
||||
\item The function returns the mean square error \eqnref{mseline}.
|
||||
\item The function should call the function \code{meanSquareError()}
|
||||
\item The function should call the function \varcode{meanSquareError()}
|
||||
defined in the previous exercise to calculate the error.
|
||||
\end{itemize}
|
||||
\end{exercise}
|
||||
@ -165,7 +166,7 @@ third dimension is used to indicate the error value
|
||||
\varcode{y}). Implement a script \file{errorSurface.m}, that
|
||||
calculates the mean square error between data and a linear model and
|
||||
illustrates the error surface using the \code{surf()} function
|
||||
(consult the help to find out how to use \code{surf}.).
|
||||
(consult the help to find out how to use \code{surf()}).
|
||||
\end{exercise}
|
||||
|
||||
By looking at the error surface we can directly see the position of
|
||||
@ -257,7 +258,7 @@ way to the minimum of the objective function. The ball will always
|
||||
follow the steepest slope. Thus we need to figure out the direction of
|
||||
the steepest slope at the position of the ball.
|
||||
|
||||
The \enterm{gradient} (Box~\ref{partialderivativebox}) of the
|
||||
The \entermde{Gradient}{gradient} (Box~\ref{partialderivativebox}) of the
|
||||
objective function is the vector
|
||||
|
||||
\[ \nabla f_{cost}(m,b) = \left( \frac{\partial f(m,b)}{\partial m},
|
||||
@ -296,7 +297,7 @@ choose the opposite direction.
|
||||
\end{figure}
|
||||
|
||||
\begin{exercise}{lsqGradient.m}{}\label{gradientexercise}%
|
||||
Implement a function \code{lsqGradient()}, that takes the set of
|
||||
Implement a function \varcode{lsqGradient()}, that takes the set of
|
||||
parameters $(m, b)$ of the linear equation as a two-element vector
|
||||
and the $x$- and $y$-data as input arguments. The function should
|
||||
return the gradient at that position.
|
||||
@ -316,8 +317,8 @@ choose the opposite direction.
|
||||
Finally, we are able to implement the optimization itself. By now it
|
||||
should be obvious why it is called the gradient descent method. All
|
||||
ingredients are already there. We need: 1. The error function
|
||||
(\code{meanSquareError}), 2. the objective function
|
||||
(\code{lsqError()}), and 3. the gradient (\code{lsqGradient()}). The
|
||||
(\varcode{meanSquareError}), 2. the objective function
|
||||
(\varcode{lsqError()}), and 3. the gradient (\varcode{lsqGradient()}). The
|
||||
algorithm of the gradient descent is:
|
||||
|
||||
\begin{enumerate}
|
||||
|
@ -9,6 +9,8 @@
|
||||
\include{#1/lecture/#1}%
|
||||
}
|
||||
|
||||
%\includeonly{regression/lecture/regression}
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\begin{document}
|
||||
|
@ -7,17 +7,16 @@ Descriptive statistics characterizes data sets by means of a few measures.
|
||||
In addition to histograms that estimate the full distribution of the data,
|
||||
the following measures are used for characterizing univariate data:
|
||||
\begin{description}
|
||||
\item[Location, central tendency] (``Lagema{\ss}e''):
|
||||
arithmetic mean, median, mode.
|
||||
\item[Spread, dispersion] (``Streuungsma{\ss}e''): variance,
|
||||
standard deviation, inter-quartile range,\linebreak coefficient of variation
|
||||
(``Variationskoeffizient'').
|
||||
\item[Shape]: skewness (``Schiefe''), kurtosis (``W\"olbung'').
|
||||
\item[Location, central tendency] (\determ{Lagema{\ss}e}):
|
||||
\entermde[mean!arithmetic]{Mittel!arithmetisches}{arithmetic mean}, \entermde{Median}{median}, \enterm{mode}.
|
||||
\item[Spread, dispersion] (\determ{Streuungsma{\ss}e}): \entermde{Varianz}{variance},
|
||||
\entermde{Standardabweichung}{standard deviation}, inter-quartile range,\linebreak \enterm{coefficient of variation} (\determ{Variationskoeffizient}).
|
||||
\item[Shape]: \enterm{skewness} (\determ{Schiefe}), \enterm{kurtosis} (\determ{W\"olbung}).
|
||||
\end{description}
|
||||
For bivariate and multivariate data sets we can also analyse their
|
||||
\begin{description}
|
||||
\item[Dependence, association] (``Zusammenhangsma{\ss}e''): Pearson's correlation coefficient,
|
||||
Spearman's rank correlation coefficient.
|
||||
\item[Dependence, association] (\determ{Zusammenhangsma{\ss}e}): \entermde[correlation!coefficient!Pearson's]{Korrelation!Pearson}{Pearson's correlation coefficient},
|
||||
\entermde[correlation!coefficient!Spearman's rank]{{Rangkorrelationskoeffizient!Spearman'scher}}{Spearman's rank correlation coefficient}.
|
||||
\end{description}
|
||||
|
||||
The following is in no way a complete introduction to descriptive
|
||||
@ -26,15 +25,16 @@ daily data-analysis problems.
|
||||
|
||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||
\section{Mean, variance, and standard deviation}
|
||||
The \enterm{arithmetic mean} is a measure of location. For $n$ data values
|
||||
$x_i$ the arithmetic mean is computed by
|
||||
The \entermde[mean!arithmetic]{Mittel!arithmetisches}{arithmetic mean}
|
||||
is a measure of location. For $n$ data values $x_i$ the arithmetic
|
||||
mean is computed by
|
||||
\[ \bar x = \langle x \rangle = \frac{1}{n}\sum_{i=1}^n x_i \; . \]
|
||||
This computation (summing up all elements of a vector and dividing by
|
||||
the length of the vector) is provided by the function \mcode{mean()}.
|
||||
The mean has the same unit as the data values.
|
||||
|
||||
The dispersion of the data values around the mean is quantified by
|
||||
their \enterm{variance}
|
||||
their \entermde{Varianz}{variance}
|
||||
\[ \sigma^2_x = \langle (x-\langle x \rangle)^2 \rangle = \frac{1}{n}\sum_{i=1}^n (x_i - \bar x)^2 \; . \]
|
||||
The variance is computed by the function \mcode{var()}.
|
||||
The unit of the variance is the unit of the data values squared.
|
||||
@ -42,14 +42,15 @@ Therefore, variances cannot be compared to the mean or the data values
|
||||
themselves. In particular, variances cannot be used for plotting error
|
||||
bars along with the mean.
|
||||
|
||||
The standard deviation
|
||||
In contrast to the variance, the
|
||||
\entermde{Standardabweichung}{standard deviation}
|
||||
\[ \sigma_x = \sqrt{\sigma^2_x} \; , \]
|
||||
as computed by the function \mcode{std()}, however, has the same unit
|
||||
as the data values and can (and should) be used to display the
|
||||
dispersion of the data together with their mean.
|
||||
as computed by the function \mcode{std()} has the same unit as the
|
||||
data values and can (and should) be used to display the dispersion of
|
||||
the data together with their mean.
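
The definitions above can be sketched in a few lines of \matlab{}
code. This is only a minimal illustration with made-up data values;
the variable names are chosen for this example. Note that, as far as
we are aware, \mcode{var()} and \mcode{std()} divide by $n-1$ by
default, whereas passing \varcode{1} as the second argument selects the
$1/n$ normalization used in the equations above.
\begin{verbatim}
x = [2.0, 4.0, 4.0, 5.0, 7.0];  % made-up example data
n = length(x);                  % number of data values

m = sum(x) / n;                 % arithmetic mean, same as mean(x)
v = sum((x - m).^2) / n;        % variance with 1/n normalization
s = sqrt(v);                    % standard deviation, unit of the data

% built-in functions with the same 1/n normalization:
vb = var(x, 1);                 % var(x) alone divides by n-1
sb = std(x, 1);                 % std(x) alone divides by n-1
\end{verbatim}
For large data sets the difference between the two normalizations is
negligible, for small ones it can be noticeable.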
|
||||
|
||||
The mean of a data set can be displayed by a bar-plot
|
||||
\matlabfun{bar()}. Additional errorbars \matlabfun{errobar()} can be
|
||||
\matlabfun{bar()}. Additional errorbars \matlabfun{errorbar()} can be
|
||||
used to illustrate the standard deviation of the data
|
||||
(\figref{displayunivariatedatafig} (2)).
|
||||
|
||||
@ -90,18 +91,18 @@ used to illustrate the standard deviation of the data
|
||||
identical with the mode.}
|
||||
\end{figure}
|
||||
|
||||
The \enterm{mode} is the most frequent value, i.e. the position of the maximum of the probability distribution.
|
||||
The \enterm{mode} (\determ{Modus}) is the most frequent value,
|
||||
i.e. the position of the maximum of the probability distribution.
|
||||
|
||||
The \enterm{median} separates a list of data values into two halves
|
||||
such that one half of the data is not greater and the other half is
|
||||
not smaller than the median (\figref{medianfig}).
|
||||
The \entermde{Median}{median} separates a list of data values into two
|
||||
halves such that one half of the data is not greater and the other
|
||||
half is not smaller than the median (\figref{medianfig}). The
|
||||
function \mcode{median()} computes the median.
|
||||
|
||||
\begin{exercise}{mymedian.m}{}
|
||||
Write a function \varcode{mymedian()} that computes the median of a vector.
|
||||
\end{exercise}
|
||||
|
||||
\matlab{} provides the function \code{median()} for computing the median.
|
||||
|
||||
\begin{exercise}{checkmymedian.m}{}
|
||||
Write a script that tests whether your median function really
|
||||
returns a median above which are the same number of data than
|
||||
@ -122,9 +123,9 @@ not smaller than the median (\figref{medianfig}).
|
||||
\end{figure}
|
||||
|
||||
The distribution of data can be further characterized by the position
|
||||
of its \enterm[quartile]{quartiles}. Neighboring quartiles are
|
||||
of its \entermde[quartile]{Quartil}{quartiles}. Neighboring quartiles are
|
||||
separated by 25\,\% of the data (\figref{quartilefig}).
|
||||
\enterm[percentile]{Percentiles} allow to characterize the
|
||||
\entermde[percentile]{Perzentil}{Percentiles} allow us to characterize the
|
||||
distribution of the data in more detail. The 3$^{\rm rd}$ quartile
|
||||
corresponds to the 75$^{\rm th}$ percentile, because 75\,\% of the
|
||||
data are smaller than the 3$^{\rm rd}$ quartile.
|
||||
@ -147,11 +148,12 @@ data are smaller than the 3$^{\rm rd}$ quartile.
|
||||
% from a normal distribution.}
|
||||
% \end{figure}
|
||||
|
||||
\enterm[box-whisker plots]{Box-whisker plots} are commonly used to
|
||||
visualize and compare the distribution of unimodal data. A box is
|
||||
drawn around the median that extends from the 1$^{\rm st}$ to the
|
||||
3$^{\rm rd}$ quartile. The whiskers mark the minimum and maximum value
|
||||
of the data set (\figref{displayunivariatedatafig} (3)).
|
||||
\entermde[box-whisker plots]{Box-Whisker-Plot}{Box-whisker plots}, or
|
||||
\entermde[box plot]{Box-Plot}{box plots}, are commonly used to visualize and
|
||||
compare the distribution of unimodal data. A box is drawn around the
|
||||
median that extends from the 1$^{\rm st}$ to the 3$^{\rm rd}$
|
||||
quartile. The whiskers mark the minimum and maximum value of the data
|
||||
set (\figref{displayunivariatedatafig} (3)).
|
||||
|
||||
\begin{exercise}{univariatedata.m}{}
|
||||
Generate 40 normally distributed random numbers with a mean of 2 and
|
||||
@ -170,13 +172,14 @@ of the data set (\figref{displayunivariatedatafig} (3)).
|
||||
% \end{exercise}
|
||||
|
||||
\section{Distributions}
|
||||
The distribution of values in a data set is estimated by histograms
|
||||
(\figref{displayunivariatedatafig} (4)).
|
||||
The \enterm{distribution} (\determ{Verteilung}) of values in a data
|
||||
set is estimated by histograms (\figref{displayunivariatedatafig}
|
||||
(4)).
|
||||
|
||||
\subsection{Histograms}
|
||||
|
||||
\enterm[histogram]{Histograms} count the frequency $n_i$ of
|
||||
$N=\sum_{i=1}^M n_i$ measurements in each of $M$ bins $i$
|
||||
\entermde[histogram]{Histogramm}{Histograms} count the frequency $n_i$
|
||||
of $N=\sum_{i=1}^M n_i$ measurements in each of $M$ bins $i$
|
||||
(\figref{diehistogramsfig} left). The bins tile the data range
|
||||
usually into intervals of the same size. The width of the bins is
|
||||
called the bin width. The frequencies $n_i$ plotted against the
|
||||
@ -194,13 +197,14 @@ categories $i$ is the \enterm{histogram}, or the \enterm{frequency
|
||||
\end{figure}
|
||||
|
||||
Histograms are often used to estimate the
|
||||
\enterm[probability!distribution]{probability distribution} of the
|
||||
data values.
|
||||
\enterm[probability!distribution]{probability distribution}
|
||||
(\determ[Wahrscheinlichkeits!-verteilung]{Wahrscheinlichkeitsverteilung}) of the data values.
|
||||
|
||||
\subsection{Probabilities}
|
||||
In the frequentist interpretation of probability, the probability of
|
||||
an event (e.g. getting a six when rolling a die) is the relative
|
||||
occurrence of this event in the limit of a large number of trials.
|
||||
In the frequentist interpretation of probability, the
|
||||
\enterm{probability} (\determ{Wahrscheinlichkeit}) of an event
|
||||
(e.g. getting a six when rolling a die) is the relative occurrence of
|
||||
this event in the limit of a large number of trials.
|
||||
|
||||
For a finite number of trials $N$ where the event $i$ occurred $n_i$
|
||||
times, the probability $P_i$ of this event is estimated by
|
||||
@ -212,15 +216,16 @@ the sum of the probabilities of all possible events is one:
|
||||
i.e. the probability of getting any event is one.
|
||||
|
||||
|
||||
\subsection{Probability distributions of categorial data}
|
||||
\subsection{Probability distributions of categorical data}
|
||||
|
||||
For categorial data values (e.g. the faces of a die (as integer
|
||||
numbers or as colors)) a bin can be defined for each category $i$.
|
||||
The histogram is normalized by the total number of measurements to
|
||||
make it independent of the size of the data set
|
||||
(\figref{diehistogramsfig}). After this normalization the height of
|
||||
each histogram bar is an estimate of the probability $P_i$ of the
|
||||
category $i$, i.e. of getting a data value in the $i$-th bin.
|
||||
For \entermde[data!categorical]{Daten!kategorische}{categorical} data
|
||||
values (e.g. the faces of a die (as integer numbers or as colors)) a
|
||||
bin can be defined for each category $i$. The histogram is normalized
|
||||
by the total number of measurements to make it independent of the size
|
||||
of the data set (\figref{diehistogramsfig}). After this normalization
|
||||
the height of each histogram bar is an estimate of the probability
|
||||
$P_i$ of the category $i$, i.e. of getting a data value in the $i$-th
|
||||
bin.
|
||||
|
||||
\begin{exercise}{rollthedie.m}{}
|
||||
Write a function that simulates rolling a die $n$ times.
|
||||
@ -236,12 +241,14 @@ category $i$, i.e. of getting a data value in the $i$-th bin.
|
||||
|
||||
\subsection{Probability density functions}
|
||||
|
||||
In cases where we deal with data sets of measurements of a real
|
||||
quantity (e.g. lengths of snakes, weights of elephants, times
|
||||
between succeeding spikes) there is no natural bin width for computing
|
||||
a histogram. In addition, the probability of measuring a data value that
|
||||
equals exactly a specific real number like, e.g., 0.123456789 is zero, because
|
||||
there are uncountably many real numbers.
|
||||
In cases where we deal with
|
||||
\entermde[data!continuous]{Daten!kontinuierliche}{continuous data},
|
||||
(measurements of real-valued quantities, e.g. lengths of snakes,
|
||||
weights of elephants, times between succeeding spikes) there is no
|
||||
natural bin width for computing a histogram. In addition, the
|
||||
probability of measuring a data value that equals exactly a specific
|
||||
real number like, e.g., 0.123456789 is zero, because there are
|
||||
uncountably many real numbers.
|
||||
|
||||
We can only ask for the probability to get a measurement value in some
|
||||
range. For example, we can ask for the probability $P(1.2<x<1.3)$ to
|
||||
@ -254,14 +261,14 @@ probability can also be expressed as $P(x_0<x<x_0 + \Delta x)$.
|
||||
In the limit to very small ranges $\Delta x$ the probability of
|
||||
getting a measurement between $x_0$ and $x_0+\Delta x$ scales down to
|
||||
zero with $\Delta x$:
|
||||
\[ P(x_0<x<x_0+\Delta x) \approx p(x_0) \cdot \Delta x \; . \]
|
||||
Here the quantity $p(x_0)$ is a so-called
|
||||
\enterm[probability!density]{probability density} that is larger than
|
||||
zero and that describes the distribution of the data values. The
|
||||
probability density is not a unitless probability with values between
|
||||
0 and 1, but a number that takes on any positive real number and has
|
||||
as a unit the inverse of the unit of the data values --- hence the
|
||||
name ``density''.
|
||||
\[ P(x_0<x<x_0+\Delta x) \approx p(x_0) \cdot \Delta x \; . \] Here
|
||||
the quantity $p(x_0)$ is a so-called
|
||||
\enterm[probability!density]{probability density}
|
||||
(\determ[Wahrscheinlichkeits!-dichte]{Wahrscheinlichkeitsdichte}) that is larger than zero and that
|
||||
describes the distribution of the data values. The probability density
|
||||
is not a unitless probability with values between 0 and 1, but a
|
||||
number that takes on any positive real number and has as a unit the
|
||||
inverse of the unit of the data values --- hence the name ``density''.
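
A short numerical illustration, with numbers that are made up for this
example: assume lengths are measured in meters and the probability
density at $x_0$ is $p(x_0) = 2.5\,\mathrm{m}^{-1}$. For a small range
of $\Delta x = 0.01\,\mathrm{m}$ we then get
\[ P(x_0<x<x_0+\Delta x) \approx 2.5\,\mathrm{m}^{-1} \cdot 0.01\,\mathrm{m} = 0.025 \; . \]
The density itself is larger than one, but the resulting probability
is unitless and stays between 0 and 1.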
|
||||
|
||||
\begin{figure}[t]
|
||||
\includegraphics[width=1\textwidth]{pdfprobabilities}
|
||||
@ -282,17 +289,18 @@ the probability density over the whole real axis must be one:
|
||||
\end{equation}
|
||||
|
||||
The function $p(x)$, that assigns to every $x$ a probability density,
|
||||
is called \enterm[probability!density function]{probability density function},
|
||||
\enterm[pdf|see{probability density function}]{pdf}, or just
|
||||
\enterm[density|see{probability density function}]{density}
|
||||
(\determ{Wahrscheinlichkeitsdichtefunktion}). The well known
|
||||
\enterm{normal distribution} (\determ{Normalverteilung}) is an example of a
|
||||
probability density function
|
||||
is called \enterm[probability!density function]{probability density
|
||||
function}, \enterm[pdf|see{probability density function}]{pdf}, or
|
||||
just \enterm[density|see{probability density function}]{density}
|
||||
(\determ[Wahrscheinlichkeits!-dichtefunktion]{Wahrscheinlichkeitsdichtefunktion},
|
||||
\determ[Wahrscheinlichkeits!-dichte]{Wahrscheinlichkeitsdichte}). The
|
||||
well known \entermde{Normalverteilung}{normal distribution} is an
|
||||
example of a probability density function
|
||||
\[ p_g(x) = \frac{1}{\sqrt{2\pi\sigma^2}}e^{-\frac{(x-\mu)^2}{2\sigma^2}} \]
|
||||
--- the \enterm{Gaussian distribution}
|
||||
(\determ{Gau{\ss}sche-Glockenkurve}) with mean $\mu$ and standard
|
||||
deviation $\sigma$.
|
||||
The factor in front of the exponential function ensures the normalization to
|
||||
The factor in front of the exponential function ensures normalization to
|
||||
$\int p_g(x) \, dx = 1$, \eqnref{pdfnorm}.
|
||||
|
||||
\begin{exercise}{gaussianpdf.m}{gaussianpdf.out}
|
||||
@ -322,13 +330,15 @@ values fall within each bin (\figref{pdfhistogramfig} left).
|
||||
|
||||
To turn such histograms to estimates of probability densities they
|
||||
need to be normalized such that according to \eqnref{pdfnorm} their
|
||||
integral equals one. While histograms of categorial data are
|
||||
integral equals one. While histograms of categorical data are
|
||||
normalized such that their sum equals one, here we need to integrate
|
||||
over the histogram. The integral is the area (not the height) of the
|
||||
histogram bars. Each bar has the height $n_i$ and the width $\Delta
|
||||
x$. The total area $A$ of the histogram is thus
|
||||
\[ A = \sum_{i=1}^M ( n_i \cdot \Delta x ) = \Delta x \sum_{i=1}^M n_i = N \, \Delta x \]
|
||||
and the normalized histogram has the heights
|
||||
and the
|
||||
\entermde[histogram!normalized]{Histogramm!normiertes}{normalized
|
||||
histogram} has the heights
|
||||
\[ p(x_i) = \frac{n_i}{A} = \frac{n_i}{\Delta x \sum_{i=1}^M n_i} =
|
||||
\frac{n_i}{N \Delta x} \; .\]
|
||||
A histogram needs to be divided by both the sum of the frequencies
|
||||
@ -375,14 +385,14 @@ shape histogram depends on the exact position of its bins
|
||||
(here Gaussian kernels with standard deviation of $\sigma=0.2$).}
|
||||
\end{figure}
|
||||
|
||||
To avoid this problem one can use so called \enterm{kernel densities}
|
||||
for estimating probability densities from data. Here every data point
|
||||
is replaced by a kernel (a function with integral one, like for
|
||||
example the Gaussian) that is moved exactly to the position
|
||||
indicated by the data value. Then all the kernels of all the data
|
||||
values are summed up, the sum is divided by the number of data values,
|
||||
and we get an estimate of the probability density
|
||||
(\figref{kerneldensityfig} right).
|
||||
To avoid this problem so called \entermde[kernel
|
||||
density]{Kerndichte}{kernel densities} can be used for estimating
|
||||
probability densities from data. Here every data point is replaced by
|
||||
a kernel (a function with integral one, like for example the Gaussian)
|
||||
that is moved exactly to the position indicated by the data
|
||||
value. Then all the kernels of all the data values are summed up, the
|
||||
sum is divided by the number of data values, and we get an estimate of
|
||||
the probability density (\figref{kerneldensityfig} right).
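
The procedure just described can be sketched as follows. This is a
minimal illustration under assumptions made for this example only: the
data values are made up, the kernel is a Gaussian with standard
deviation $\sigma=0.2$, and the variable names are hypothetical.
\begin{verbatim}
data = [1.1, 1.9, 2.0, 2.4, 3.3];  % made-up example data
sigma = 0.2;                       % width of the Gaussian kernel
dx = 0.01;                         % resolution of the x-axis
x = 0:dx:5;                        % positions for the estimate

p = zeros(size(x));
for k = 1:length(data)
    % Gaussian kernel with integral one, centered on the data value:
    kernel = exp(-0.5*((x - data(k))/sigma).^2) / sqrt(2*pi*sigma^2);
    p = p + kernel;                % sum up the kernels
end
p = p / length(data);              % divide by number of data values

area = sum(p) * dx;                % integral, should be close to one
plot(x, p);
\end{verbatim}
Because each kernel integrates to one and the sum is divided by the
number of data values, the estimate is normalized in the same way as a
probability density.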
|
||||
|
||||
As for the histogram, where we need to choose a bin width, we need to
|
||||
choose the width of the kernels appropriately.
|
||||
@ -457,7 +467,9 @@ bivariate or multivariate data sets where we have pairs or tuples of
|
||||
data values (e.g. size and weight of elephants) we want to analyze
|
||||
dependencies between the variables.
|
||||
|
||||
The \enterm[correlation!correlation coefficient]{correlation coefficient}
|
||||
The
|
||||
\entermde[correlation!coefficient]{Korrelation!-skoeffizient}{correlation
|
||||
coefficient}
|
||||
\begin{equation}
|
||||
\label{correlationcoefficient}
|
||||
r_{x,y} = \frac{Cov(x,y)}{\sigma_x \sigma_y} = \frac{\langle
|
||||
@ -467,8 +479,8 @@ The \enterm[correlation!correlation coefficient]{correlation coefficient}
|
||||
\end{equation}
|
||||
quantifies linear relationships between two variables
|
||||
\matlabfun{corr()}. The correlation coefficient is the
|
||||
\enterm{covariance} normalized by the standard deviations of the
|
||||
single variables. Perfectly correlated variables result in a
|
||||
\entermde{Kovarianz}{covariance} normalized by the standard deviations
|
||||
of the single variables. Perfectly correlated variables result in a
|
||||
correlation coefficient of $+1$, anti-correlated or negatively
|
||||
correlated data in a correlation coefficient of $-1$ and un-correlated
|
||||
data in a correlation coefficient close to zero
|
||||
|