%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\chapter{Programming in \matlab}\label{programming}
In this chapter we will cover the basics of programming in
\matlab{}. Starting with the concept of simple variables and data
types, we introduce basic data structures, such as vectors and
matrices, and show how one can work with them. Then we will address the
structures used to control the flow of the program and how to write
scripts and functions. Only a few of the concepts discussed in this
chapter are really \matlab{}-specific. Apart from a few language-specific
details, the same structures are found in most other programming
languages, so switching to another language (e.g. Python) is rather
simple.
\section{Variables and data types}
The ultimate goal of scientific computing is to analyze gathered
data, correlate it with e.g. stimulus conditions and infer rules and
dependencies. These may be used to constrain models that allow us to
understand and predict a system's behavior. In order to work with data
we need to store it somehow. For this purpose we use \emph{variables}
that allow us to store the data and give it a name for easy recognition
and to provide semantic meaning.
\subsection{Variables}
Technically speaking, a \enterm{variable} is a pointer to a certain address in the
computer's memory. This pointer is characterized by its name and the
\enterm{data type} (figure~\ref{variablefig}). In the computer's
memory the value of the variable is stored in binary form, that is, as
a sequence of zeros and ones (\enterm[bit]{bits}). When the variable
is read from memory, this binary pattern is interpreted according
to the data type. \Figref{variablefig} shows
that the very same binary pattern is either interpreted as an 8-bit
integer type (numeric value 38) or as the ampersand (\&) character. In
\matlab{} data types are of only minor importance but there are
occasions in which it becomes important to know the data type of a
variable.
\begin{figure}
\centering
\begin{subfigure}{.5\textwidth}
\includegraphics[width=0.8\textwidth]{variable}\label{variable:a}
\end{subfigure}%
\begin{subfigure}{.5\textwidth}
\includegraphics[width=.8\textwidth]{variableB}\label{variable:b}
\end{subfigure}
\titlecaption{Variables}{ point to a memory
address. They further are described by their name and
data type. The variable's value is stored as a pattern of binary
values (0 or 1). When reading the variable this pattern is
interpreted according to the variable's
data type.}\label{variablefig}
\end{figure}
\subsection{Creating variables}
In \matlab{} variables can be created at any time on the command line
or any place in a script or function. Listing~\ref{varListing1} shows
three different ways of creating a variable:
\begin{lstlisting}[label=varListing1, caption={Creating variables.}]
>> x = 38
x =
38
>> y = []
y =
[]
>> z = 'A'
z =
A
\end{lstlisting}
The first command can be read as: ``Create a variable with the name \varcode{x}
and assign the value 38''. The equality sign is the so-called
\codeterm{assignment operator}. The second command defines a variable \varcode{y}
and assigns an empty value. If not explicitly specified, \matlab{}
variables will have the \codeterm{double} data type (a numeric data type, see
below). With the third command, however, we create a variable \varcode{z}
and assign the character ``A'' to it. Accordingly, \varcode{z} does
not have the numeric \codeterm{double} data type but is of the type
\codeterm{character}. \textbf{Note:} \matlab{} uses single quotes for characters or strings of characters.
There are two ways to find out the actual data type of a variable: the
\code{class()} and the \code{whos} functions. While \code{class()}
simply returns the data type, \code{whos} offers more detailed
information but it is not suited to be used in programs (see
also the \code{who} function that returns a list of all defined
variables, listing~\ref{varListing2}).
\begin{lstlisting}[label=varListing2, caption={Requesting information about defined variables and their types.}]
>> class(x)
ans =
double
>> who
Your variables are:
x y z
>> whos
Name Size Bytes Class Attributes
x 1x1 8 double
y 0x0 0 double
z 1x1 2 char
\end{lstlisting}
\begin{important}[Naming conventions]
There are a few rules regarding variable names. \matlab{} is
case-sensitive, i.e. \code{x} and \code{X} are two different
names. Names must begin with an alphabetic character. German (or
other) umlauts, special characters and spaces are forbidden in
variable names.
\end{important}
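The naming rules can also be checked programmatically. The following is a minimal sketch that uses the built-in function \code{isvarname()}, which is not discussed elsewhere in this chapter; the candidate names are made up for illustration.
\begin{lstlisting}[caption={Checking candidate variable names (illustrative sketch).}]
isvarname('x')             % 1 (true): a valid name
isvarname('spike_count2')  % 1 (true): letters, digits and underscores are fine
isvarname('2ndTrial')      % 0 (false): does not begin with a letter
isvarname('my var')        % 0 (false): contains a space
\end{lstlisting}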
\subsection{Working with variables}
We can certainly work, i.e. do calculations, with variables. \matlab{}
knows all basic \codeterm[Operator!arithmetic]{arithmetic operators}
such as \code[Operator!arithmetic!1add@+]{+},
\code[Operator!arithmetic!2sub@-]{-},
\code[Operator!arithmetic!3mul@*]{*} and
\code[Operator!arithmetic!4div@/]{/}. The power is denoted by the
\code[Operator!arithmetic!5pow@\^{}]{\^{}}. Listing~\ref{varListing3}
shows their use.
\pagebreak[4]
\begin{lstlisting}[label=varListing3, caption={Working with variables.}]
>> x = 1;
>> x + 10
ans =
11
>> x % x has not changed!
x =
1
>> y = 2;
>> x + y
ans =
3
>> z = x + y
z =
3
>> z = z * 5;
>> z
z =
15
>> clear z % deleting a variable
\end{lstlisting}
Note: the commands \varcode{x + 10} and \varcode{x + y} use the variables without changing
their values. Whenever the value of a variable should change, the
\code[Operator!Assignment!=]{=} operator must be used (as in
\varcode{z = x + y} and \varcode{z = z * 5}). The last command, finally, shows how to delete a variable.
\subsection{Data types}
As mentioned above, the data type associated with a variable defines how the stored bit pattern is interpreted. The major data types are:
\begin{itemize}
\item \codeterm{integer}: Integer numbers. There are several subtypes
which, for most use-cases, can be ignored when working in \matlab{}.
\item \codeterm{single}, \codeterm{double}: Floating point numbers. In
contrast to the real numbers that they approximate, only a limited
(finite) number of numeric values can be represented with these data
types.
\item \codeterm{complex}: Complex numbers having a real and imaginary
part.
\item \codeterm{logical}: Boolean values that can be evaluated to
\code{true} or \code{false}.
\item \codeterm{char}: ASCII characters.
\end{itemize}
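How these data types show up in practice can be sketched with \code{class()}. The values below are arbitrary examples. One detail goes beyond the list above: for a complex number \code{class()} reports the underlying numeric type, so the built-in function \code{isreal()} is used here to detect the imaginary part.
\begin{lstlisting}[caption={Data types of a few example values (illustrative sketch).}]
class(42)          % 'double', the default for numbers
class(single(42))  % 'single'
class(int16(42))   % 'int16', one of the integer subtypes
class(true)        % 'logical'
class('A')         % 'char'
z = 3 + 4i;        % a complex number
class(z)           % 'double'
isreal(z)          % 0 (false), z has an imaginary part
\end{lstlisting}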
There is a variety of numeric data types that require different
amounts of memory and have different ranges of values that can be
represented (table~\ref{dtypestab}).
\begin{table}[t]
\centering
\titlecaption{Numeric data types and their ranges.}{}
\label{dtypestab}
\begin{tabular}{llcl}\hline
Data type & memory demand & range & example \erh \\ \hline
\code{single} & 32 bit & $\approx -3.4 \cdot 10^{38}$ to $\approx 3.4 \cdot 10^{38}$ & Floating point numbers.\erb \\
\code{double} & 64 bit & $\approx -10^{308}$ to $\approx 10^{308}$ & Floating point numbers.\erb\\
\code{int32} & 32 bit & $-2^{31}$ to $2^{31}-1$ & Integer values. \\
\code{int16} & 16 bit & $-2^{15}$ to $2^{15}-1$ & Digitized measurements. \\
\code{uint8} & 8 bit & $0$ to $255$ & Digitized intensities of colors in images. \\ \hline
\end{tabular}
\end{table}
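The limits of the numeric data types can be queried directly. The following sketch uses the built-in functions \code{intmin()}, \code{intmax()} and \code{realmax()}, which are not introduced in the text above. It also shows that integer types saturate at their limits instead of wrapping around.
\begin{lstlisting}[caption={Querying the ranges of numeric data types (illustrative sketch).}]
intmin('int16')     % -32768
intmax('int16')     %  32767
intmax('uint8')     %  255
realmax('single')   % approx. 3.4028e+38
realmax('double')   % approx. 1.7977e+308
uint8(300)          % 255, values above the range saturate
uint8(-5)           % 0, values below the range saturate
\end{lstlisting}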
By default \matlab{} uses the \codeterm{double} data type whenever
numerical values are stored. Nevertheless, there are use-cases
in which different data types are better suited. Box~\ref{daqbox}
illustrates one such case.
\begin{ibox}[t]{\label{daqbox}Digitizing measurements}
Scenario: The electric activity (e.g. the membrane potential) of a
nerve cell is recorded. The measurements are digitized and stored on
the hard disk of a computer for later analysis. This is done using a
Data Acquisition (DAQ) system that converts the analog measurements
into a digital format that the computer can process. Typically these systems
have a working range of $\pm 10$\,V. This range is usually resolved
with a precision of 16 bit. This means that the full potential range
is mapped onto $2^{16}$ digital values.\vspace{0.25cm}
\begin{minipage}{0.5\textwidth}
\includegraphics[width=0.9\columnwidth]{data_acquisition}
\end{minipage}
\begin{minipage}{0.5\textwidth}
Mapping of the potential range onto a \code{int16} data type:
\[ y = x \cdot 2^{16}/20\] with $x$ being the measured potential and $y$
the digitized value at a potential range of $\pm10$\,V and a
resolution of 16 bit. Resulting values are integer numbers in the
range $-2^{15}=-32768$ to $2^{15}-1 = 32767$.
The measured potential can be calculated from the digitized value
by inverting the equation:
\[ x = y \cdot 20/2^{16} \]
\end{minipage}\vspace{0.25cm}
In this context it is most efficient to store the measured values as
\code{int16} instead of \code{double} numbers. Storing floating
point numbers requires four times as much memory (8 instead of 2
\codeterm{Byte}, 64 instead of 16 bit) and offers no additional
information.
\end{ibox}
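The mapping described in box~\ref{daqbox} can be sketched in a few lines. The potentials in \varcode{x} are made-up example values and the variable names are chosen for illustration only.
\begin{lstlisting}[caption={Digitizing potentials to int16 and converting back (illustrative sketch).}]
x = [-10 -5 0 2.5 5];        % hypothetical potentials in V
y = int16(x * 2^16 / 20)     % digitized: -32768 -16384 0 8192 16384
xr = double(y) * 20 / 2^16   % back to volts: -10 -5 0 2.5 5
int16(10 * 2^16 / 20)        % +10 V maps to 32768 and saturates at 32767
\end{lstlisting}
The conversion with \code{double()} before inverting the mapping matters: calculations on \code{int16} values stay within \code{int16}, so the intermediate product \varcode{y * 20} would saturate at the limits of that data type.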
\section{Vectors and matrices}
Vectors and matrices are the most important data structures in
\matlab{}. Many other programming languages make no distinction
between these structures; there, they are one- or multidimensional
\enterm{arrays}. Such arrays are structures that can store multiple
values of the same data type in a single variable. Due to \matlab{}'s
origin in the handling of mathematical problems, they are named
differently but are internally the same. Vectors are 2-dimensional
matrices in which one dimension has the size 1 (a singleton
dimension).
\subsection{Vectors}
In contrast to variables that store just a single value
(\enterm{scalar}) a vector can store multiple values of the same data
type (figure~\ref{vectorfig} B). The variable \varcode{test} in
\figref{vectorfig} for example stores four integer values.
\begin{figure}[ht]
\includegraphics[width=0.8\columnwidth]{scalarArray}
\titlecaption{Scalars and vectors.}{\textbf{A)} A scalar variable
holds exactly one value. \textbf{B)} A vector can hold multiple
values. These must be of the same data type (e.g. integer
numbers). \matlab{} distinguishes between row- and
column-vectors.}\label{vectorfig}
\end{figure}
Listing~\ref{generatevectorslisting} shows how vectors
can be created. For \varcode{b} and \varcode{c} the \code[Operator!Matrix!:]{:}
notation is used to easily create vectors with many elements or with
step-sizes unequal to 1. The definition of \varcode{b} can be read as: ``Create a variable
\varcode{b} and assign the values from 0 to 9 in increasing steps of
1''. The definition of \varcode{c} reads: ``Create a variable \varcode{c} and assign the
values from 0 to 10 in steps of 2''.
\pagebreak
\begin{lstlisting}[label=generatevectorslisting, caption={Creating simple row-vectors.}]
>> a = [0 1 2 3 4 5 6 7 8 9] % Creating a row-vector
a =
0 1 2 3 4 5 6 7 8 9
>> b = (0:9) % more comfortable
b =
0 1 2 3 4 5 6 7 8 9
>> c = (0:2:10)
c =
0 2 4 6 8 10
\end{lstlisting}
The length of a vector, that is the number of elements, can be
requested using the \code{length()} or \code{numel()}
functions. \code{size()} provides the same information in a slightly
different, yet more powerful way (listing~\ref{vectorsizeslisting}). The above
used vector \varcode{a} has the following size:
\begin{lstlisting}[label=vectorsizeslisting, caption={Size of a vector.}]
>> length(a)
ans =
10
>> size(a)
ans =
1 10
\end{lstlisting}
The answer provided by the \code{size()} function demonstrates that
vectors are nothing else but 2-dimensional matrices in which one
dimension has the size 1 (singleton dimension).
\code[length()]{length(a)} just returns the size of the
largest dimension. Listing~\ref{columnvectorlisting} shows how to
create a column-vector and how the \code[Operator!Matrix!']{'}
operator is used to transpose the column-vector into a row-vector.
\begin{lstlisting}[label=columnvectorlisting, caption={Column-vectors.}]
>> b = [1; 2; 3; 4; 5; 6; 7; 8; 9; 10] % Creating a column-vector
b =
1
2
...
9
10
>> length(b)
ans =
10
>> size(b)
ans =
10 1
>> b = b' % Transpose
b =
1 2 3 4 5 6 7 8 9 10
>> size(b)
ans =
1 10
\end{lstlisting}
\subsubsection{Accessing elements of a vector}
The content of a vector is accessed using the element's index
(figure~\ref{vectorindexingfig}). Each element has an individual
\codeterm{index} that ranges (in \matlab{}) from 1 to the number of
elements irrespective of the type of vector.
\begin{figure}[ht]
\includegraphics[width=0.4\columnwidth]{arrayIndexing}
\titlecaption{Index.}{Each element of a vector can be addressed via
its index (small numbers) to access its content (large
numbers).}\label{vectorindexingfig}
\end{figure}
\begin{important}[Indexing]
Elements of a vector are accessed via their index. This process is
called \codeterm{indexing}.
In \matlab{} the first element has the index one.
The last element's index equals the length of the vector.
\end{important}
Listings~\ref{vectorelementslisting} and~\ref{vectorrangelisting} show
how the index is used to access elements of a vector. One can access
individual values by providing a single index or use the
\code[Operator!Matrix!:]{:} notation to access multiple values with a
single command.
\begin{lstlisting}[label=vectorelementslisting, caption={Access to individual elements of a vector.}]
>> a = (11:20)
a =
11 12 13 14 15 16 17 18 19 20
>> a(1) % the 1st element
ans = 11
>> a(5) % the 5th element
ans = 15
>> a(end) % the last element
ans = 20
\end{lstlisting}
\begin{lstlisting}[caption={Access to multiple elements.}, label=vectorrangelisting]
>> a([1 3 5]) % 1st, 3rd and 5th element
ans =
11 13 15
>> a(2:4) % all elements with the indices 2 to 4
ans =
12 13 14
>> a(1:2:end) % every second element
ans =
11 13 15 17 19
>> a(:) % all elements, always returned as a column-vector
ans =
11
12
...
20
\end{lstlisting}
\begin{exercise}{vectorsize.m}{vectorsize.out}
Create a row-vector \varcode{a} with 5 elements. The return value of
\code[size()]{size(a)} is again a vector with the length 2. How
could you find out the size of \varcode{a} in the 2nd dimension?
\end{exercise}
\subsubsection{Operations on vectors}
Similarly to the scalar variables discussed above we can work with
vectors and do calculations. Listing~\ref{vectorscalarlisting} shows
how vectors and scalars can be combined with the operators \code[Operator!arithmetic!1add@+]{+},
\code[Operator!arithmetic!2sub@-]{-},
\code[Operator!arithmetic!3mul@*]{*},
\code[Operator!arithmetic!4div@/]{/} and
\code[Operator!arithmetic!5powe@.\^{}]{.\^{}}.
\begin{lstlisting}[caption={Calculating with vectors and scalars.},label=vectorscalarlisting]
>> a = (0:2:8)
a =
0 2 4 6 8
>> a + 5 % adding a scalar
ans =
5 7 9 11 13
>> a - 5 % subtracting a scalar
ans =
-5 -3 -1 1 3
>> a * 2 % multiplication
ans =
0 4 8 12 16
>> a / 2 % division
ans =
0 1 2 3 4
>> a .^ 2 % exponentiation
ans =
0 4 16 36 64
\end{lstlisting}
When doing calculations with scalars and vectors the same mathematical
operation is done to each element of the vector. In case of, e.g. an
addition this is called an element-wise addition.
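Care has to be taken when you do calculations with two vectors. For
element-wise operations on two vectors, e.g. when each element of vector
\varcode{a} should be added to the respective element of vector
\varcode{b}, the two vectors must have the same length and the same
layout (row- or column-vectors). Addition and subtraction are always
element-wise (listing~\ref{vectoradditionlisting}). Note that the error
for mismatching layouts shown in the listing is raised by \matlab{}
releases before R2016b. Newer releases expand a row- and a column-vector
to a matrix instead (implicit expansion), which is rarely what is intended.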
\begin{lstlisting}[caption={Element-wise addition and subtraction of two vectors.},label=vectoradditionlisting]
>> a = [4 9 12];
>> b = [4 3 2];
>> a + b % addition
ans =
8 12 14
>> a - b % subtraction
ans =
0 6 10
>> c = [8 4];
>> a + c % both vectors must have the same length!
Error using +
Matrix dimensions must agree.
>> d = [8; 4; 2];
>> a + d % both vectors must have the same layout!
Error using +
Matrix dimensions must agree.
\end{lstlisting}
Element-wise multiplication, division and exponentiation require
different operators with a preceding '.'. \matlab{} defines the
following operators for element-wise operations on vectors:
\code[Operator!arithmetic!3mule@.*]{.*},
\code[Operator!arithmetic!4dive@./]{./} and
\code[Operator!arithmetic!5powe@.\^{}]{.\^{}}
(listing~\ref{vectorelemmultiplicationlisting}).
\begin{lstlisting}[caption={Element-wise multiplication, division and
exponentiation of two vectors.},label=vectorelemmultiplicationlisting]
>> a .* b % element-wise multiplication
ans =
16 27 24
>> a ./ b % element-wise division
ans =
1 3 6
>> a .^ b % element-wise exponentiation
ans =
256 729 144
>> a .* c % both vectors must have the same size!
Error using .*
Matrix dimensions must agree.
>> a .* d % Both vectors must have the same layout!
Error using .*
Matrix dimensions must agree.
\end{lstlisting}
The simple operators \code[Operator!arithmetic!3mul@*]{*},
\code[Operator!arithmetic!4div@/]{/} and
\code[Operator!arithmetic!5pow@\^{}]{\^{}} execute the respective
matrix-operations known from linear algebra
(box~\ref{matrixmultiplication}). A special case is the multiplication
of a row-vector $\vec a$ with a column-vector $\vec b$, which yields the
scalar-product (or dot-product) $\sum_i a_i b_i$.
\begin{lstlisting}[caption={Multiplication of vectors.},label=vectormultiplicationlisting]
>> a * b % multiplication of two vectors
Error using *
Inner matrix dimensions must agree.
>> a' * b' % multiplication of column-vectors
Error using *
Inner matrix dimensions must agree.
>> a * b' % multiplication of a row- and column-vector
ans =
67
>> a' * b % multiplication of a column- and a row-vector
ans =
16 12 8
36 27 18
48 36 24
\end{lstlisting}
\pagebreak[4]
To remove elements from a vector an empty value
(\code[Operator!Matrix!{[]}]{[]}) is assigned to the respective
elements:
\begin{lstlisting}[label=vectoreraselisting, caption={Deleting elements of a vector.}]
>> a = (0:2:8);
>> length(a)
ans = 5
>> a(1) = [] % delete the 1st element
a = 2 4 6 8
>> a([1 3]) = [] % delete the 1st and 3rd element
a = 4 8
>> length(a)
ans = 2
\end{lstlisting}
In addition to deleting vector elements one can also add new elements
or concatenate two vectors. When performing a concatenation the two
concatenated vectors must match in their layout (see the failing
concatenation \varcode{[a b']} in listing~\ref{vectorinsertlisting}). To extend a vector we
can simply assign values beyond the end of the vector (last command in
listing~\ref{vectorinsertlisting}). \matlab{} will automatically
adjust the variable. This way of extending a vector on-the-fly is,
however, expensive. In the background \matlab{} has to reserve new
memory of the appropriate size and then copy the contents into
it. If possible this should be avoided (the \matlab{} editor will warn
you).
\begin{lstlisting}[caption={Concatenation and extension of vectors.}, label=vectorinsertlisting]
>> a = [4 3 2 1];
>> b = [10 12 14 16];
>> c = [a b] % create a new vector by concatenation
c =
4 3 2 1 10 12 14 16
>> length(c)
ans = 8
>> length(a) + length(b)
ans = 8
>> c = [a b']; % vector layouts must match
Error using horzcat
Dimensions of matrices being concatenated are not consistent.
>> a(1:3) = [5 6 7] % assign new values to elements of the vector
a =
5 6 7 1
>> a(1:3) = [1 2 3 4]; % range of indices and list of new values must match
In an assignment A(I) = B, the number of elements in B and I must be the same.
>> a(3:6) = [1 2 3 4] % extending by assigning beyond vector bounds
a =
5 6 1 2 3 4
\end{lstlisting}
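One common way to avoid extending a vector on-the-fly is to reserve the memory once, before the vector is filled. The following is a minimal sketch under the assumption that the final number of elements is known in advance. It uses the creator-function \code{zeros()} and a \code{for} loop; loops are a control structure that is not part of this section. The names \varcode{n} and \varcode{squares} are chosen for illustration.
\begin{lstlisting}[caption={Preallocating a vector instead of extending it (illustrative sketch).}]
n = 1000;               % hypothetical number of values
squares = zeros(1, n);  % reserve memory for a row-vector once
for i = 1:n
    squares(i) = i^2;   % fill in place, the vector does not grow
end
size(squares)           % 1 1000
squares(end)            % 1000000
\end{lstlisting}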
\subsection{Matrices}
Vectors are a special case of a more general data structure,
the matrix. Vectors are matrices in which one dimension is a
singleton dimension (length of 1). While matrices can have an almost
arbitrary number of dimensions, the most common matrices are two- or
three-dimensional (figure~\ref{matrixfig} A, B).
\begin{figure}
\includegraphics[width=0.5\columnwidth]{matrices}
\titlecaption{Matrices.}{\textbf{A)} 2-dimensional matrix with the
name ``test''. \textbf{B)} Illustration of a 3-dimensional
matrix. Arrows indicate the rank across the dimensions.}\label{matrixfig}
\end{figure}
Matrices can be created in similar ways as vectors
(listing~\ref{matrixlisting}). The definition of a 2-D matrix is enclosed
in square brackets \code[Operator!Matrix!{[]}]{[]}; the semicolon
operator \code[Operator!Matrix!;]{;} separates the individual rows of
the matrix.
\begin{lstlisting}[label=matrixlisting, caption={Creating matrices.}]
>> a = [1 2 3; 4 5 6; 7 8 9]
a =
1 2 3
4 5 6
7 8 9
>> b = ones(3, 4, 2)
b(:,:,1) =
1 1 1 1
1 1 1 1
1 1 1 1
b(:,:,2) =
1 1 1 1
1 1 1 1
1 1 1 1
\end{lstlisting}
The bracket notation used to define \varcode{a} is not suited to create matrices of
higher dimensions. For these, \matlab{} provides a number of
creator-functions that help creating n-dimensional matrices
(e.g. \code{ones()}, which creates a 3-D matrix when called with
three arguments as in the definition of \varcode{b}). The function \code{cat()} allows concatenating n-dimensional
matrices.
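A short sketch of how \code{cat()} and a creator-function can be used. The matrices are arbitrary examples, and \code{zeros()} is one further creator-function besides \code{ones()}.
\begin{lstlisting}[caption={Concatenating and creating n-dimensional matrices (illustrative sketch).}]
a = [1 2; 3 4];
b = [5 6; 7 8];
c = cat(3, a, b);    % stack a and b along the 3rd dimension
size(c)              % 2 2 2
c(:, :, 2)           % the second page equals b
d = cat(1, a, b);    % along the 1st dimension, same as [a; b]
size(d)              % 4 2
z = zeros(2, 3, 4);  % 3-D matrix filled with zeros
size(z)              % 2 3 4
\end{lstlisting}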
To request the length of a vector we used the function
\code{length()}. This function is \textbf{not} suited to request
information about the size of a matrix. As mentioned above,
\code{length()} would return the length of the largest dimension. The
function \code{size()}, however, returns the length in each dimension
and should always be preferred over \code{length()}.
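The difference becomes apparent with two matrices that contain the same number of elements but have different layouts. The matrices in this sketch are arbitrary examples.
\begin{lstlisting}[caption={length() and size() applied to matrices (illustrative sketch).}]
x = ones(3, 7);
length(x)                 % 7, only the largest dimension
size(x)                   % 3 7
[nrows, ncols] = size(x)  % nrows = 3, ncols = 7
numel(x)                  % 21 elements in total
y = ones(7, 3);
length(y)                 % also 7, the layout cannot be recovered
\end{lstlisting}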
\begin{figure}
\includegraphics[width=0.9\columnwidth]{matrixIndexing}
\titlecaption{Indices in matrices.}{Each element of a matrix is
identified by its index. The index is a tuple of as many numbers
as the matrix has dimensions. The first coordinate in this tuple
counts the row, the second the column and the third the page,
etc. }\label{matrixindexingfig}
\end{figure}
Analogous to the data access in vectors we can address individual
elements of a matrix by their index. Similar to a coordinate system
each element is addressed using an n-tuple with $n$ the number of
dimensions (figure~\ref{matrixindexingfig},
listing~\ref{matrixIndexing}). This type of indexing is called
\codeterm{subscript indexing}. The first coordinate refers always to
the row, the second to the column, the third to the page, and so on.
\begin{lstlisting}[caption={Accessing elements in matrices, indexing.}, label=matrixIndexing]
>> x=rand(3, 4) % 2-D matrix filled with random numbers
x =
0.8147 0.9134 0.2785 0.9649
0.9058 0.6324 0.5469 0.1576
0.1270 0.0975 0.9575 0.9706
>> size(x)
ans =
3 4
>> x(1,1) % top left corner
ans =
0.8147
>> x(2,3) % element in the 2nd row, 3rd column
ans =
0.5469
>> x(1,:) % the first row
ans =
0.8147 0.9134 0.2785 0.9649
>> x(:,2) % second column
ans =
0.9134
0.6324
0.0975
\end{lstlisting}
Subscript indexing is very intuitive but does not always offer the most
straightforward or efficient solution to a problem. Consider for
example that you have a 3-D matrix and you want the minimal number in
that matrix. One could try to first find the minimum in each column,
then compare these minima across the columns of each page, and finally across pages. An
alternative way is to make use of the so-called \emph{linear indexing}
in which each element of the matrix is addressed by a single
number. The linear index thus ranges from 1 to
\code{numel(matrix)}. It increases first along the 1st dimension, then
along the 2nd, then the 3rd, and so on (figure~\ref{matrixlinearindexingfig}). It is
not as intuitive since one would need to know the shape of the matrix and perform a remapping, but can be really helpful
(listing~\ref{matrixLinearIndexing}).
\begin{figure}
\includegraphics[width=0.9\columnwidth]{matrixLinearIndexing}
\titlecaption{Linear indexing in matrices.}{The linear index in a
matrix increases from 1 to the number of elements in the
matrix. It increases first along the first dimension (down the rows
of a column), then along the second dimension (from column to column), and so on.}\label{matrixlinearindexingfig}
\end{figure}
\begin{lstlisting}[label=matrixLinearIndexing, caption={Linear indexing in matrices.}]
>> x = randi(100, [3, 4, 5]); % 3-D matrix filled with random numbers
>> size(x)
ans =
3 4 5
>> numel(x)
ans =
60
>> min(min(min(x))) % minimum across rows, then columns, then pages
ans =
4
>> min(x(1:numel(x))) % or like this
ans =
4
>> min(x(:)) % or even simpler
ans =
4
\end{lstlisting}
\matlab{} defines functions that convert subscript indices to linear indices and back (\code{sub2ind()} and \code{ind2sub()}).
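A minimal sketch of the conversion between both kinds of indices. The matrix is created with the built-in function \code{reshape()}, which is not discussed in this chapter, such that the value of each element equals its linear index.
\begin{lstlisting}[caption={Converting between subscript and linear indices (illustrative sketch).}]
x = reshape(1:12, 3, 4);          % 3x4 matrix, each value is its linear index
x(2, 3)                           % 8, subscript indexing
x(8)                              % 8, linear indexing
idx = sub2ind(size(x), 2, 3)      % 8
[row, col] = ind2sub(size(x), 8)  % row = 2, col = 3
sub2ind([3 4 5], 2, 3, 4)         % 44 in a 3-D matrix of size 3x4x5
\end{lstlisting}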
\begin{ibox}[tp]{\label{matrixmultiplication} The matrix--multiplication.}
The matrix--multiplication from linear algebra is \textbf{not} an
element--wise multiplication of each element in a matrix \varcode{A}
and the respective element of matrix \varcode{B}. It is something
completely different. Confusing element--wise and
matrix--multiplication is one of the most common mistakes in
\matlab{}. \linebreak
The matrix--multiplication of two 2-D matrices is only possible if
the number of columns in the first matrix agrees with the number of
rows in the other. More formally: $\mathbf{A}$ and $\mathbf{B}$ can be
multiplied $(\mathbf{A} \cdot \mathbf{B})$, if $\mathbf{A}$ has the
size $(m \times n)$ and $\mathbf{B}$ the size $(n \times k)$, that is,
if the \enterm{inner matrix dimensions} $n$
agree.
Then, the elements $c_{i,j}$ of the product $\mathbf{C} = \mathbf{A}
\cdot \mathbf{B}$ are given as the scalar product (dot-product) of
the $i$-th row in $\mathbf{A}$ with the $j$-th column in $\mathbf{B}$: \[
c_{i,j} = \sum_{l=1}^n a_{i,l} \; b_{l,j} \; . \]
The matrix-multiplication is in general not commutative, that is:
\[ \mathbf{A} \cdot \mathbf{B} \ne \mathbf{B} \cdot \mathbf{A} \; . \]
Consider the matrices:
\[\mathbf{A}_{(3 \times 2)} = \begin{pmatrix} 1 & 2 \\ 5 & 4 \\ -2 & 3 \end{pmatrix}
\quad \text{and} \quad \mathbf{B}_{(2 \times 2)} = \begin{pmatrix}
-1 & 2 \\ -2 & 5 \end{pmatrix} \; . \] The inner dimensions of
these matrices match ($(3 \times 2) \cdot (2 \times 2)$) and the
product $\mathbf{C} = \mathbf{A} \cdot \mathbf{B}$ can be
calculated. Following from the number of rows in $\mathbf{A}$ (3)
and the number of columns in $\mathbf{B}$ (2) the resulting matrix
$\mathbf{C}$ will have the size $(3 \times 2)$:
\[ \mathbf{A} \cdot \mathbf{B} = \begin{pmatrix} 1 \cdot -1 + 2 \cdot -2 & 1 \cdot 2 + 2\cdot 5 \\
5 \cdot -1 + 4 \cdot -2 & 5 \cdot 2 + 4 \cdot 5\\
-2 \cdot -1 + 3 \cdot -2 & -2 \cdot 2 + 3 \cdot 5 \end{pmatrix}
= \begin{pmatrix} -5 & 12 \\ -13 & 30 \\ -4 & 11\end{pmatrix} \; . \]
The product $\mathbf{B} \cdot \mathbf{A}$, however, is not
defined since the inner matrix dimensions do not agree ($(2 \times 2) \cdot
(3 \times 2)$).
\end{ibox}
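The example from box~\ref{matrixmultiplication} can be reproduced in \matlab{}. The sketch below also computes one element of the product as the dot-product of a row of $\mathbf{A}$ with a column of $\mathbf{B}$.
\begin{lstlisting}[caption={The matrix-multiplication from the box (illustrative sketch).}]
A = [1 2; 5 4; -2 3];     % size 3x2
B = [-1 2; -2 5];         % size 2x2
C = A * B                 % [-5 12; -13 30; -4 11]
size(C)                   % 3 2
A(2, :) * B(:, 1)         % -13, row 2 of A times column 1 of B gives C(2,1)
sum(A(2, :) .* B(:, 1)')  % -13, the same sum written element-wise
\end{lstlisting}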
Calculations on matrices follow the same rules as the calculations on
vectors. Element-wise computations are possible as long as the
matrices have the same size. It is again important to
distinguish between the element-wise
(\code[Operator!arithmetic!3mule@.*]{.*} operator, \varcode{A .* B} in
listing~\ref{matrixOperations}) and the operator for
matrix-multiplication (\code[Operator!arithmetic!3mul@*]{*}, the remaining products in
listing~\ref{matrixOperations},
box~\ref{matrixmultiplication}). To do a matrix-multiplication the
inner dimensions of the matrices must agree.
\pagebreak[4]
\begin{lstlisting}[label=matrixOperations, caption={Two kinds of multiplications of matrices.}]
>> A = randi(5, [2, 3]) % 2-D matrix
A =
1 5 3
3 2 2
>> B = randi(5, [2, 3]) % another 2-D matrix
B =
4 3 5
2 4 5
>> A .* B % element-wise multiplication
ans =
4 15 15
6 8 10
>> A * B % invalid matrix-multiplication
Error using *
Inner matrix dimensions must agree.
>> A * B' % valid matrix-multiplication
ans =
34 37
28 24
>> A' * B % matrix-multiplication is not commutative
ans =
10 15 20
24 23 35
16 17 25
\end{lstlisting}
\section{Boolean expressions}
Boolean expressions are instructions that can be evaluated to
\varcode{true} or \varcode{false}. In the context of programming they
are used to test the relations between entities, e.g. are two entities the
same, or is one greater or less than the other? Accordingly, programming languages define
operators for such instructions. Among the \codeterm{relational
operators} are \code[Operator!relational!>]{>} (greater than),
\code[Operator!relational!<]{<} (less than) and \code[Operator!relational!==]{==} (equal to);
\code[Operator!relational!"~]{~} negates an expression (NOT). Using so-called \codeterm[Operator!logical]{logical
operators} allows joining single Boolean expressions to more complex
constructs (\code[Operator!logical!and1@\&]{\&} for AND,
\code[Operator!logical!or1@{"|} {}]{|} for OR). These expressions
are important e.g. to control which parts of the code are evaluated
under certain conditions (conditional statements,
Section~\ref{controlstructsec}) but also for accessing only those
elements of a vector or matrix that match a certain condition (logical
indexing, Section~\ref{logicalindexingsec}).
Truth tables (table~\ref{logicalandor}) are used to visualize the results of
Boolean expressions. A and B are statements that can be evaluated to
True or False. When they are combined with a logical AND the whole
expression is True only if both statements are True. The logical OR,
on the other hand, requires that at least one of the statements is
True. The exclusive OR (XOR) is True only for cases in which one of
the statements but not both is True. There is no operator for XOR in
\matlab{}; it is realized via the function \code[xor()]{xor(A, B)}.
\begin{table}[tp]
\titlecaption{Truth tables for logical AND, OR and XOR.}{}\label{logicalandor}
\begin{tabular}{llll}
\multicolumn{2}{l}{\multirow{2}{*}{}} & \multicolumn{2}{c}{\textbf{B}} \\
& \sffamily{\textbf{AND}} & \multicolumn{1}{|c}{true} & false \\ \cline{2-4}
\multirow{2}{*}{\textbf{A}} & \multicolumn{1}{l|}{true} & \multicolumn{1}{c}{\textcolor{mygreen}{true}} & \textcolor{red}{false} \erb \\
& \multicolumn{1}{l|}{false} & \multicolumn{1}{l}{\textcolor{red}{false}} & \textcolor{red}{false}
\end{tabular}
\hfill
\begin{tabular}{llll}
\multicolumn{2}{l}{\multirow{2}{*}{}} & \multicolumn{2}{c}{\textbf{B}} \\
& \sffamily{\textbf{OR}} & \multicolumn{1}{|c}{true} & false \\ \cline{2-4}
\multirow{2}{*}{\textbf{A}} & \multicolumn{1}{l|}{true} & \multicolumn{1}{c}{\textcolor{mygreen}{true}} & \textcolor{mygreen}{true} \erb \\
& \multicolumn{1}{l|}{false} & \multicolumn{1}{l}{\textcolor{mygreen}{true}} & \textcolor{red}{false}
\end{tabular}
\hfill
\begin{tabular}{llll}
\multicolumn{2}{l}{\multirow{2}{*}{}} & \multicolumn{2}{c}{\textbf{B}} \\
& \sffamily{\textbf{XOR}} & \multicolumn{1}{|c}{true} & false \\ \cline{2-4}
\multirow{2}{*}{\textbf{A}} & \multicolumn{1}{l|}{true} & \multicolumn{1}{c}{\textcolor{red}{false}} & \textcolor{mygreen}{true} \erb \\
& \multicolumn{1}{l|}{false} & \multicolumn{1}{l}{\textcolor{mygreen}{true}} & \textcolor{red}{false}
\end{tabular}
\end{table}
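The truth tables can be reproduced by applying the operators to two logical vectors that together cover all four combinations of A and B. The following is a minimal sketch.
\begin{lstlisting}[caption={Reproducing the truth tables (illustrative sketch).}]
A = [true true false false];
B = [true false true false];
A & B      % 1 0 0 0, true only if both are true
A | B      % 1 1 1 0, true if at least one is true
xor(A, B)  % 0 1 1 0, true if exactly one is true
\end{lstlisting}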
Table~\ref{logicalrelationaloperators} shows the logical and
relational operators available in \matlab{}. The additional
\code[Operator!logical!and2@\&\&]{\&\&} and
\code[Operator!logical!or2@{"|}{"|} {}]{||} operators are the so-called
\enterm{short-circuit} operators for the logical AND and
OR. Short-circuit means that \matlab{} stops evaluating a Boolean
expression as soon as its result is determined. For example, assume that the two statements A and B
are joined using an AND. The whole expression can only be true if A is
true. This means that there is no need to evaluate B if A is
false. Likewise, B does not need to be evaluated in an OR if A is already true. Since the statements may be arbitrarily elaborate computations
this can save processing time.
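Besides saving time, short-circuit evaluation can guard a statement that would otherwise fail. The following sketch assumes an empty vector \varcode{x}, for which accessing \varcode{x(1)} raises an error, and uses the built-in function \code{isempty()}.
\begin{lstlisting}[caption={Short-circuit evaluation (illustrative sketch).}]
x = [];                  % empty vector, x(1) does not exist
isempty(x) || x(1) > 0   % 1 (true), x(1) is never evaluated
~isempty(x) && x(1) > 0  % 0 (false), x(1) is skipped again
\end{lstlisting}
When the same expressions are entered on the command line with the operators \varcode{|} and \varcode{\&}, both sides are evaluated and the access to \varcode{x(1)} fails with an indexing error.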
\begin{table}[t]
\titlecaption{\label{logicalrelationaloperators}
Logical (left) and relational (right) operators in \matlab.}{}
\begin{tabular}{cc}
\hline
\textbf{operator} & \textbf{description} \erh \\ \hline
\varcode{$\sim$} & logical NOT \erb \\
\varcode{$\&$} & logical AND\\
\varcode{$|$} & logical OR\\
\varcode{$\&\&$} & short-circuit logical AND\\
\varcode{$\|$} & short-circuit logical OR\\
\hline
\end{tabular}
\hfill
\begin{tabular}{cc}
\hline
\textbf{operator} & \textbf{description} \erh \\ \hline
\varcode{$==$} & equals \erb \\
\varcode{$\sim=$} & not equal\\
\varcode{$>$} & greater than \\
\varcode{$<$} & less than \\
\varcode{$>=$} & greater or equal \\
\varcode{$<=$} & less or equal \\
\hline
\end{tabular}
\end{table}
\begin{important}[Assignment and equality operators]
The assignment operator \code[Operator!Assignment!=]{=} and the
logical equality operator \code[Operator!logical!==]{==} are
fundamentally different: \code{=} stores a value in a variable,
whereas \code{==} compares two values and returns a logical
result. Since both are colloquially read as ``equals'' they are
easily confused.
\end{important}
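The difference between the two operators can be sketched in three
lines. The variable names are chosen for illustration only.
\begin{lstlisting}[caption={Assignment versus comparison (illustrative sketch).}]
a = 5;          % assignment: store the value 5 in the variable a
b = (a == 5);   % comparison: is a equal to 5? b is logical 1
c = (a == 3);   % comparison: c is logical 0, a itself is unchanged
\end{lstlisting}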
Previously we have introduced the data types for integer or floating
point numbers and discussed that there are instances in which it is
more efficient to use an integer data type rather than storing floating
point numbers. The result of a Boolean expression can only assume two
values (true or false). This implies that, in principle, a single bit
suffices to store this information as a 0 (false) or 1 (true). \matlab{} knows a
special data type (\codeterm{logical}) to store the result of a
Boolean expression. Numeric and character variables can be evaluated to true or false
by converting them to the logical data type. When doing so \matlab{}
interprets all values different from zero as true. In
listing~\ref{booleanexpressions} we show several examples for such
operations. \matlab{} also knows the keywords \code{true} and
\code{false} which are synonyms for the \codeterm{logical} values 1
and 0.
\begin{lstlisting}[caption={Boolean expressions.}, label=booleanexpressions]
>> true
ans = 1
>> false
ans = 0
>> logical(1)
ans = 1
>> 1 == true
ans = 1
>> 1 == false
ans = 0
>> logical('test')
ans = 1 1 1 1
>> logical([1 2 3 4 0 0 10])
ans = 1 1 1 1 0 0 1
>> 1 > 2
ans = 0
>> 1 < 2
ans = 1
>> x = [2 0 0 5 0] & [1 0 3 2 0]
x = 1 0 0 1 0
>> ~([2 0 0 5 0] & [1 0 3 2 0])
ans = 0 1 1 0 1
>> [2 0 0 5 0] | [1 0 3 2 0]
ans = 1 0 1 1 0
\end{lstlisting}
\section{Logical indexing}\label{logicalindexingsec}
With subscript or linear indexing one can select elements of a vector
or matrix by using their index. This is fine when we know the
indices. There are, however, many situations in which a selection is
based on the value of the stored elements and the indices are not
known in advance. Such selections are one of the major situations in
which Boolean expressions are employed. The selection based on the
result of a Boolean expression is called \enterm{logical
indexing}. With this approach we can easily filter based on the
values stored in a vector or matrix. It is very powerful and, once
understood, very intuitive.
The basic concept is that applying a Boolean operation on a vector
results in a \code{logical} vector of the same size (see
listing~\ref{booleanexpressions}). This logical vector is then used to
select only those values for which the logical vector is true. The
last command in listing~\ref{logicalindexing1} can be read: ``Select
all those elements of \varcode{x} where the Boolean expression
\varcode{x < 0} evaluates to true and store the result in the variable
\varcode{elements\_smaller\_zero}''.
\begin{lstlisting}[caption={Logical indexing.}, label=logicalindexing1]
>> x = randn(1, 6) % a vector with 6 random numbers
x =
-1.4023 -1.4224 0.4882 -0.1774 -0.1961 1.4193
>> % logical indexing in two steps
>> x_smaller_zero = x < 0 % create the logical vector
x_smaller_zero =
1 1 0 1 1 0
>> elements_smaller_zero = x(x_smaller_zero) % use it to select
elements_smaller_zero =
-1.4023 -1.4224 -0.1774 -0.1961
>> % logical indexing with a single command
>> elements_smaller_zero = x(x < 0)
elements_smaller_zero =
-1.4023 -1.4224 -0.1774 -0.1961
\end{lstlisting}
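A logical vector can be used for more than reading elements. The
following sketch, with made-up example values, counts the matching
elements, looks up their indices with \code{find}, and assigns new
values through a logical index.
\begin{lstlisting}[caption={Counting and replacing elements with a logical index (illustrative sketch).}]
x = [-1.5 0.3 -0.2 2.0 0.0 -4.1];   % made-up example values
n_negative = sum(x < 0);   % true counts as 1, the sum is 3
indices = find(x < 0);     % positions of the matches: 1 3 6
y = x;                     % work on a copy, x stays untouched
y(y < 0) = 0;              % replace all negative values by zero
% y is now: 0 0.3 0 2.0 0 0
\end{lstlisting}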
\begin{exercise}{logicalVector.m}{logicalVector.out}
Create a vector \varcode{x} containing the values 0--10.
\begin{enumerate}
\item Execute: \varcode{y = x < 5}
\item Display the content of \varcode{y} in the command window.
\item What is the data type of \varcode{y}?
\item Return only those elements of \varcode{x} that are less than 5.
\end{enumerate}
\pagebreak[4]
\end{exercise}
\begin{figure}[t]
\includegraphics[width= 0.9\columnwidth]{logicalIndexingTime}
\titlecaption{Example for logical indexing.} {The highlighted
segment of the data was selected using logical indexing on
the time vector: (\varcode{x(t > 5 \& t <
6)}).}\label{logicalindexingfig}
\end{figure}
So far we have used logical indexing to select elements of a vector
that match a certain condition. When analyzing data we often
want to select the elements of one vector based on a condition that
is tested on a second vector. One example for such a use-case is the
selection of a segment of data within a certain time span (e.g. while
the stimulus was on, \figref{logicalindexingfig}).
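A minimal sketch of this idea, assuming two vectors of equal length
in which each element of the first belongs to the element at the same
position in the second. The names \varcode{stimulus} and
\varcode{response} and all values are made up.
\begin{lstlisting}[caption={Selecting from one vector based on a second vector (illustrative sketch).}]
stimulus = [0 0 1 1 1 0 0];                 % 1: stimulus on, 0: off
response = [2.1 1.9 5.3 6.0 5.7 2.2 2.0];   % one value per sample
% the condition is tested on stimulus, the selection is done on response
on_response = response(stimulus == 1);      % 5.3 6.0 5.7
off_response = response(stimulus == 0);     % 2.1 1.9 2.2 2.0
\end{lstlisting}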
\begin{exercise}{logicalIndexingTime.m}{}
Assume that measurements have been made for a certain time. Usually
measured values and the time are stored in two vectors.
\begin{itemize}
\item Create a vector that represents the recording time \varcode{t
= 0:0.001:10;}.
\item Create a second vector \varcode{x} filled with random numbers
that has the same length as \varcode{t}. The values stored in
\varcode{x} represent the measured data at the times in
\varcode{t}.
\item Use logical indexing to select those values that have been
recorded in the time span from 5--6\,s.
\end{itemize}
\end{exercise}
\begin{ibox}[ht]{\label{advancedtypesbox}Advanced data types}
Throughout this script and the exercises we will limit ourselves to
the basic data types introduced above (int, double, char, scalars,
vectors, matrices and strings). There are, however, \matlab{}-specific
advanced data structures that make life easier (mostly). We
will introduce them briefly here and refer to the \matlab{} help for
further information. \textbf{Note: Some of these data types are
more recent additions to the \matlab{} language. One should consider
this if backward compatibility is desired/needed.}
\textbf{Structures} Arrays of named fields that each can contain
arbitrary data types. \codeterm{Structures} can have sub-structures
and can thus form trees. Structures are often used to combine
data and metadata in a single variable.
\textbf{Cell arrays} Arrays of variables that contain different
types. Unlike structures, the entries of a \codeterm{Cell array} are
not named. Indexing in \codeterm{Cell arrays} requires a special
operator, the curly braces \code{\{\}}. \matlab{} uses \codeterm{Cell arrays} for
example when strings of different lengths should be stored in the
same variable: \varcode{months = \{'January', 'February', 'March',
'April', 'May', 'June'\};}. Note the curly braces that are used to
create the array and are also used for indexing.
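The difference between indexing a cell array with round and with
curly braces can be sketched as follows (variable names chosen for
illustration):
\begin{lstlisting}[caption={Indexing a cell array (illustrative sketch).}]
months = {'January', 'February', 'March'};
a = months(2);   % round braces return a 1x1 cell array
b = months{2};   % curly braces return the content: 'February'
disp(class(a))   % cell
disp(class(b))   % char
\end{lstlisting}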
\textbf{Tables} Tabular structure that allows columns of
varying type to be combined with a header (much like a spreadsheet).
\textbf{Timetables} Array of values that are associated with a
timestamp. For example, one can store measurements that are made at
irregular intervals together with the measurement time in a single
variable. Without the \codeterm{Timetable} data type at least two
variables (one storing the time, the other the measurement) would be
required. \codeterm{Timetables} offer specific convenience functions
to work with timestamps.
\textbf{Maps} In a \codeterm{map} a \codeterm{value} is associated
with an arbitrary \codeterm{key}. The \codeterm{key} is not
restricted to be an integer but can be almost anything. Maps are an
alternative to structures with the additional advantage that the key
can be used during indexing: \varcode{my\_map('key')}.
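A minimal sketch using \code{containers.Map} with character keys; the
variable name and the stored values are chosen for illustration only.
\begin{lstlisting}[caption={Storing and reading values in a map (illustrative sketch).}]
days_in_month = containers.Map();   % empty map with character keys
days_in_month('January') = 31;      % add key-value pairs
days_in_month('February') = 28;
n = days_in_month('January');       % read a value by its key: 31
has_march = isKey(days_in_month, 'March');   % logical 0, no such key
\end{lstlisting}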
\textbf{Categorical arrays} are used to store values that come from
a limited set of values, e.g. months. Such categories can then be
used for filter operations. \codeterm{Categorical arrays} are often
used in conjunction with \codeterm{Tables}.
\begin{lstlisting}[caption={Using Categorical arrays}, label=categoricallisting]
>> months = categorical({'Jan', 'Feb', 'Jan', 'Dec'});
>> events = [10, 2, 18, 20 ];
>> events(months == 'Jan')
ans =
10 18
\end{lstlisting}
\end{ibox}
\section{Control flow}\label{controlstructsec}
Generally, a program is executed line by line from top to
bottom. Sometimes this is not the desired behavior: it may be
necessary to skip certain parts or to execute others
repeatedly. High-level programming languages like \matlab{} offer
statements that allow us to manipulate the control flow. There are two
major classes of such statements:
\begin{enumerate}
\item loops
\item conditional expressions
\end{enumerate}
\subsection{Loops}
As the name already suggests loops are used to execute the same parts
of the code repeatedly. In one of the earlier exercises the factorial of
five has been calculated as depicted in listing~\ref{facultylisting}.
\begin{lstlisting}[caption={Calculation of the factorial of 5 in five steps}, label=facultylisting]
>> x = 1;
>> x = x * 2;
>> x = x * 3;
>> x = x * 4;
>> x = x * 5;
>> x
x =
120
\end{lstlisting}
This kind of program is fine but it is rather repetitive. The only
thing that changes is the increasing factor. The repetition of such
very similar lines of code is bad programming style. This is not only
a matter of taste but there are severe drawbacks to this style:
\begin{enumerate}
\item Error-proneness: ``Copy-and-paste'' often leads to the case that
the essential part of a repetition is not adapted (the factor in the
example above). \shortquote{Copy and paste is a design error.}{David
Parnas}
\item Flexibility: The aforementioned program does exactly one thing,
it cannot be used for any other purpose (such as the factorial
of 6) without a change.
\item Maintenance: If there is an error, it has to be fixed in all
repetitions. It is easy to forget a single change.
\item Readability: Repetitive code is terrible to read and to
understand. (i) One tends to skip repetitions (it's the same
anyway) and misses the essential change. (ii) The duplication of
code leads to long and hard-to-parse programs.
\end{enumerate}
All imperative programming languages offer a solution: the loop. It is
used whenever the same commands have to be repeated.
\subsubsection{The \code{for} --- loop}
The most common type of loop is the \codeterm{for-loop}. It
consists of a \codeterm[Loop!head]{head} and the
\codeterm[Loop!body]{body}. The head defines how often the code of the
body is executed. In \matlab{} the head begins with the keyword
\code{for} which is followed by the \codeterm{running variable} and
the vector it iterates over. In
\matlab{} a for-loop always operates on vectors. With each
\codeterm{iteration} of the loop, the running variable assumes the
next value of this vector. In the body of the loop any code can be
executed which may or may not use the running variable for a certain
purpose. The \code{for} loop is closed with the keyword
\code{end}. Listing~\ref{looplisting} shows a simple version of such a
\code{for} loop.
\begin{lstlisting}[caption={Example of a \varcode{for}-loop.}, label=looplisting]
>> for x = 1:3 % head
disp(x) % body
end
% the running variable assumes with each iteration the next value
% of the vector 1:3:
1
2
3
\end{lstlisting}
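The running variable assumes the values stored in the vector, not
their positions. The following sketch uses a made-up row vector that
does not contain consecutive numbers; the variable names are chosen
for illustration.
\begin{lstlisting}[caption={Iterating over the values of an arbitrary row vector (illustrative sketch).}]
values = [2 5 11];   % made-up example vector
total = 0;
for v = values       % v assumes 2, then 5, then 11
    total = total + v;
end
disp(total)          % 18
\end{lstlisting}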
\begin{exercise}{factorialLoop.m}{factorialLoop.out}
Can we solve the factorial with a for-loop? Implement a for loop that
calculates the factorial of a number \varcode{n}.
\end{exercise}
\subsubsection{The \varcode{while} --- loop}
The \code{while}--loop is the second type of loop that is available in
almost all programming languages. Unlike the \code{for} -- loop,
which iterates with the running variable over a vector, the while loop
uses a Boolean expression to determine whether to execute the code in
its body. The head of the loop starts with the keyword \code{while}
that is followed by a Boolean expression. As long as this expression
evaluates to true, the code in the body is executed repeatedly. The
loop is closed with an \code{end}.
\begin{lstlisting}[caption={Basic structure of a \code{while} loop.}, label=whileloop]
while x == true % head with a Boolean expression
% execute this code as long as the expression yields true
end
\end{lstlisting}
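A concrete sketch of a \code{while} loop for which the number of
iterations is not written down in advance: a value is halved until it
is no longer greater than one. The variable names and the start value
are made up.
\begin{lstlisting}[caption={A terminating while loop (illustrative sketch).}]
x = 100;
count = 0;
while x > 1             % the expression is checked before each iteration
    x = x / 2;          % the body changes x, otherwise the loop never ends
    count = count + 1;
end
disp(count)             % 7
\end{lstlisting}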
\begin{exercise}{factorialWhileLoop.m}{}
Implement the factorial of a number \varcode{n} using a \code{while}
-- loop.
\end{exercise}
\begin{exercise}{neverendingWhile.m}{}
Implement a \code{while}--loop that is never-ending. Hint: the body
is executed as long as the Boolean expression in the head is
true. You can escape the loop by pressing \keycode{Ctrl+C}.
\end{exercise}
\subsubsection{Comparison \varcode{for} -- and \varcode{while} -- loop}
\begin{itemize}
\item Both execute the code in the body iteratively.
\item When using a \code{for} -- loop the body of the loop is executed
at least once (except when the vector used in the head is empty).
\item In a \code{while} -- loop, the body is not necessarily
executed. It is entered only if the Boolean expression in the head
yields true.
\item The \code{for} -- loop is best suited for cases in which the
elements of a vector have to be used for a computation or when the
number of iterations is known.
\item The \code{while} -- loop is best suited for cases when it is not
known in advance how often a certain piece of code has to be
executed.
\item Any problem that can be solved with one type can also be solved
with the other type of loop.
\end{itemize}
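That both types of loop are interchangeable can be sketched with the
sum of the numbers 1 to 10. The variable names are chosen for
illustration. Note that the \code{while} version has to manage the
counter by hand.
\begin{lstlisting}[caption={The same computation with a for and a while loop (illustrative sketch).}]
% for loop: the head takes care of the counter
total_for = 0;
for i = 1:10
    total_for = total_for + i;
end
% while loop: the counter is initialized and incremented by hand
total_while = 0;
i = 1;
while i <= 10
    total_while = total_while + i;
    i = i + 1;
end
disp([total_for, total_while])   % 55 55
\end{lstlisting}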
\subsection{Conditional expressions}
Conditional expressions are used to ensure that the enclosed code
is only executed under a certain condition.
\subsubsection{The \varcode{if} -- statement}
The most prominent representative of the conditional expressions is
the \code{if} statement (sometimes also called \code{if - else}
statement). It constitutes a kind of branching point and allows us to
control which branch of the code is executed.
Again, the statement consists of the head and the body. The head
begins with the keyword \code{if} followed by a Boolean expression
that controls whether or not the body is entered. The body
is either ended by the \code{end} keyword or followed by
additional \code{elseif} statements, which add another
Boolean expression to catch another condition, or by the \code{else}
statement, which provides a default case. The last body of the \code{if - elseif -
else} statement has to be finished with \code{end}
(listing~\ref{ifelselisting}).
\begin{lstlisting}[label=ifelselisting, caption={Structure of an \code{if} statement.}]
if x < y % head
% body I, executed only if x < y
elseif x > y
% body II, executed only if the first condition did not match and x > y
else
% body III, executed only if the previous conditions did not match
end
\end{lstlisting}
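A concrete instance of this structure, as a minimal sketch with a
made-up value and made-up variable names, classifies the sign of a
number:
\begin{lstlisting}[caption={Classifying the sign of a number with an if statement (illustrative sketch).}]
x = -3;   % made-up example value
if x < 0
    kind = 'negative';
elseif x > 0
    kind = 'positive';
else
    kind = 'zero';   % reached only if neither condition matched
end
disp(kind)           % negative
\end{lstlisting}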
\begin{exercise}{ifelse.m}{}
Draw a random number and check with an appropriate \code{if}
statement whether it is
\begin{enumerate}
\item less than 0.5.
\item less than 0.5, or greater than or equal to 0.5.
\item (i) less than 0.5, (ii) greater than or equal to 0.5 but less than
0.75, or (iii) greater than or equal to 0.75.
\end{enumerate}
\end{exercise}
\subsubsection{The \varcode{switch} -- statement}
The \code{switch} statement is used whenever a set of conditions
requires separate treatment. The statement is initialized with the
\code{switch} keyword that is followed by the \emph{switch expression} (a
number or string). It is followed by a set of \emph{case expressions}
which start with the keyword \code{case} followed by the value
against which the \emph{switch expression} is tested. It
is important to note that the case expression always checks for
equality! Optionally, the case expressions may be followed by the keyword
\code{otherwise} which catches all cases that were not explicitly
stated above (listing~\ref{switchlisting}).
\begin{lstlisting}[label=switchlisting, caption={Structure of a \varcode{switch} statement.}]
mynumber = input('Enter a number:');
switch mynumber
case -1
disp('negative one');
case 1
disp('positive one');
otherwise
disp('something else');
end
\end{lstlisting}
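The switch expression may also be a string, and a single case can
list several alternatives in a cell array. The following sketch uses
a made-up value and made-up variable names.
\begin{lstlisting}[caption={A switch statement on a string (illustrative sketch).}]
day = 'Sunday';   % made-up example value
switch lower(day)                 % compare in lower case
    case {'saturday', 'sunday'}   % matches if any of the entries is equal
        kind = 'weekend';
    case 'friday'
        kind = 'almost weekend';
    otherwise
        kind = 'weekday';
end
disp(kind)   % weekend
\end{lstlisting}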
\subsubsection{Comparison \varcode{if} and \varcode{switch} -- statements}
\begin{itemize}
\item Using the \code{if} statement one can test for arbitrary cases
and treat them separately.
\item The \code{switch} statement does something similar but always
checks for the equality of \emph{switch} and \emph{case}
expressions.
\item The \code{switch} is a little bit more compact and nicer to read
if many different cases have to be handled.
\item The \code{switch} is used less often and can always be replaced
by an \code{if} statement.
\end{itemize}
\subsection{The keywords \code{break} and \code{continue}}
Whenever the execution of a loop should be ended or if you want to
skip the remainder of the body under certain circumstances, you can
use the keywords \code{break} and \code{continue}
(listings~\ref{breaklisting} and \ref{continuelisting}).
\begin{lstlisting}[caption={Stop the execution of a loop using \varcode{break}.}, label=breaklisting]
>> x = 1;
while true
if (x > 3)
break;
end
disp(x);
x = x + 1;
end
% output:
1
2
3
\end{lstlisting}
\begin{lstlisting}[caption={Skipping iterations using \varcode{continue}.}, label=continuelisting]
for x = 1:5
if (x > 2 && x < 5)
continue;
end
disp(x);
end
% output:
1
2
5
\end{lstlisting}
\begin{exercise}{logicalIndexingBenchmark.m}{logicalIndexingBenchmark.out}
Above we claimed that logical indexing is faster and much more
convenient than the manual selection of elements of a vector. By now
we have all the tools at hand to test this. \\
For this test create a large vector with 100000 (or more) random
numbers. Filter from this vector all numbers that are less than 0.5
and copy them to a second vector. Surround your code with the pair
\code{tic} and \code{toc} to have \matlab{} measure the time that
has passed between the calls of \code{tic} and \code{toc}.
\begin{enumerate}
\item Use a \code{for} loop to select matching values.
\item Use logical indexing.
\end{enumerate}
\end{exercise}
\begin{exercise}{simplerandomwalk.m}{}
Implement a 1-D random walk: Starting from the initial position $0$
the agent takes a step in a random direction.
\begin{itemize}
\item The program should do 10 random walks with 1000 steps each.
\item With each step decide randomly whether the position is changed
by $+1$ or $-1$.
\item Store all positions.
\item Create a figure in which you plot the position as a function
of the steps.
\end{itemize}
\end{exercise}
\section{Scripts and functions}
\subsection{What is a program?}
A program is little more than a collection of statements stored in a
file on the computer. When it is \emph{called}, it is brought to life
and executed line-by-line from top to bottom.
\matlab{} knows three types of programs:
\begin{enumerate}
\item \codeterm[Script]{Scripts}
\item \codeterm[Function]{Functions}
\item \codeterm[Object]{Objects} (not covered here)
\end{enumerate}
Programs are stored in so-called \codeterm{m-files}
(e.g. \file{myProgram.m}). To use them they have to be \emph{called}
from the command line or from within another program. Storing your code in
programs increases the re-usability. So far we have used
\emph{scripts} to store the solutions of the exercises. Any variable
that was created appeared in the \codeterm{workspace} and existed even
after the program was finished. This is very convenient but also bears
some risks. Consider the case that \file{script\_a.m} creates a
certain variable and assigns a value to it for later use. Now it calls
a second program (\file{script\_b.m}) that, by accident, uses the same
variable name and assigns a different value to it. When
\file{script\_b.m} is done, the control returns to \file{script\_a.m}
and if it now wants to read the previously stored variable, it will
contain a different value than expected. Bugs like this are hard to
find since each of the programs alone is perfectly fine and works as
intended. A solution to this problem is to use
\codeterm[Function]{functions}.
\subsection{Functions}
Functions in \matlab{} are similar to mathematical functions
\[ y = f(x) \] Here, the mathematical function has the name $f$ and it
has one \codeterm{argument} $x$ that is transformed into the
function's output value $y$. In \matlab{} the syntax of a function
declaration is very similar (listing~\ref{functiondefinitionlisting}).
\begin{lstlisting}[caption={Declaration of a function in \matlab{}}, label=functiondefinitionlisting]
function [y] = functionName(arg_1, arg_2)
% ^ ^ ^
% return value argument_1, argument_2
\end{lstlisting}
The keyword \code{function} is followed by the return value(s) (it can
be a list \code{[]} of values), the function name and the
argument(s). The function head is then followed by the function's
body. A function is ended by an \code{end} (this is in fact optional
but we will stick to this). Each function that should be directly used
by the user (or called from other programs) should reside in an
individual \code{m-file} that has the same name as the function. By
using functions instead of scripts we gain several advantages:
\begin{itemize}
\item Encapsulation of program code that solves a certain task. It can
be easily re-used in other programs.
\item There is a clear definition of the function's interface. What
does the function need (the arguments) and what does it return (the
return values).
\item Separated scope:
\begin{itemize}
\item Variables that are defined within the function do not appear
in the workspace and cannot cause any harm there.
\item Variables that are defined in the workspace are not visible to
the function.
\end{itemize}
\item Functions increase re-usability.
\item Increase the legibility of programs since they are more clearly
arranged.
\end{itemize}
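The separated scope and the use of several return values can be
sketched as follows. The function name \code{minAndMax} and all
variable names are made up for this illustration. The sketch assumes
a \matlab{} version that allows local functions at the end of a script
(R2016b or later); in older versions the function would have to be
stored in its own m-file.
\begin{lstlisting}[caption={A function with two return values and its own scope (illustrative sketch).}]
sorted = 'untouched';                         % a variable in the workspace
[lowest, highest] = minAndMax([3 1 4 1 5]);   % call with two return values
disp([lowest, highest])                       % 1 5
disp(sorted)                                  % untouched

function [minimum, maximum] = minAndMax(data)
    % Returns the smallest and the largest element of the vector data.
    sorted = sort(data);   % local variable, not visible in the workspace
    minimum = sorted(1);
    maximum = sorted(end);
end
\end{lstlisting}
Although the function uses a variable called \varcode{sorted}, the
workspace variable of the same name keeps its value.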
The following listing (\ref{badsinewavelisting}) shows a function that
calculates and displays a bunch of sine waves with different amplitudes.
\begin{lstlisting}[caption={Bad example of a function that displays a series of sine waves.},label=badsinewavelisting]
function myFirstFunction() % function head
t = (0:0.01:2);
frequency = 1.0;
amplitudes = [0.25 0.5 0.75 1.0 1.25];
for i = 1:length(amplitudes)
y = sin(frequency * t * 2 * pi) * amplitudes(i);
plot(t, y)
hold on;
end
end
\end{lstlisting}
\code{myFirstFunction} (listing~\ref{badsinewavelisting}) is a
prime example of a bad function. There are several issues with its
design:
\begin{itemize}
\item The function's name does not tell anything about its purpose.
\item The function is made for exactly one use-case (frequency of
1\,Hz and five amplitudes).
\item The function's behavior is \enterm{hard-coded} within its body
and cannot be influenced without changing the function itself.
\item It solves three tasks at the same time: calculate sine
\emph{and} change the amplitude \emph{and} plot the result.
\item There is no way to access the calculated data.
\item No documentation. One has to read and understand the code to
learn what it does.
\end{itemize}
Before we can try to improve the function the task should be clearly
defined:
\begin{enumerate}
\item Which problem should be solved?
\item Can the problem be subdivided into smaller tasks?
\item Find good names for each task.
\item Define the interface. Which information is necessary to solve
each task and which results should be returned to the caller
(e.g. the user or another program that calls the function)?
\end{enumerate}
As indicated above, \code{myFirstFunction} does three things at
once. It seems natural that the task should be split up into three
parts: (i) calculation of the individual sine waves defined by the
frequency and the amplitudes, (ii) graphical display of the data, and
(iii) coordination of calculation and display.
\paragraph{I. Calculation of a single sine wave}
Before we start coding it is best to again think about the task and
define (i) how to name the function, (ii) which information it needs
(arguments), and (iii) what it should return to the caller.
\begin{enumerate}
\item \codeterm[Function!Name]{Name}: the name should be descriptive
of the function's purpose, i.e. the calculation of a sine wave. An
appropriate name might be \code{sinewave()}.
\item \codeterm[Function!Arguments]{Arguments}: What information does
the function need to do the calculation? There are obviously the
frequency as well as the amplitude. Further we may want to be able
to define the duration of the sine wave and the temporal
resolution. We thus need four arguments which should also be named to
describe their content: \code{frequency, amplitude, t\_max,} and
\code{t\_step} might be good names.
\item \codeterm[Function!Return values]{Return values}: For a correct
display of the data we need two vectors: the time and the sine wave
itself. We thus need two return values: \varcode{time} and \varcode{sine}.
\end{enumerate}
Having defined this we can start coding
(listing~\ref{sinefunctionlisting}).
\begin{lstlisting}[caption={Function that calculates a sine wave.}, label=sinefunctionlisting]
function [time, sine] = sinewave(frequency, amplitude, t_max, t_step)
% Calculate a sinewave of a given frequency, amplitude,
% duration and temporal resolution.
%
% [time, sine] = sinewave(frequency, amplitude, t_max, t_step)
%
% Arguments:
% frequency: the frequency of the sine
% amplitude: the amplitude of the sine
% t_max : the duration of the sine in seconds
% t_step : the temporal resolution in seconds
% Returns:
% time: vector of the time axis
% sine: vector of the calculated sinewave
time = (0:t_step:t_max);
sine = sin(frequency .* time .* 2 .* pi) .* amplitude;
end
\end{lstlisting}
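A possible call of this function, as a sketch that assumes the file
\file{sinewave.m} from listing~\ref{sinefunctionlisting} is on the
\matlab{} search path. The argument values are chosen for
illustration.
\begin{lstlisting}[caption={Calling the sinewave function (illustrative sketch).}]
% assumes that sinewave.m (see above) is on the search path
% 2 Hz sine wave with an amplitude of 0.5, 1 s long, 10 ms resolution
[time, sine] = sinewave(2.0, 0.5, 1.0, 0.01);
plot(time, sine)
xlabel('time [s]')
ylabel('sine')
\end{lstlisting}
Both return values are requested in the square brackets on the
left-hand side. The header comment of the function is shown when
typing \varcode{help sinewave} on the command line.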
\paragraph{II. Plotting a single sine wave}
The display of the sine waves can also be delegated to a function. We
can now decide whether we want the function to plot all sine waves at
once, or if we want to design a function that plots a single sine wave
and that we then call repeatedly for each frequency/amplitude
combination. The most flexible approach is the latter and we will
thus implement it this way. We might come up with the following
specification of the function:
\begin{enumerate}
\item It should plot a single sine wave. But it is not limited to sine
waves. Its name is thus: \code{plotFunction()}.
\item What information does it need to solve the task? The
to-be-plotted data, that is the values \code{y\_data} and the
corresponding \code{x\_data}. As we want to plot series of sine
waves we might want to have a \code{name} for each function to be
displayed in the figure legend.
\item Are there any return values? No, this function is just made for
plotting, we do not need to return anything.
\end{enumerate}
With this specification we can start to implement the function
(listing~\ref{sineplotfunctionlisting}).
\begin{lstlisting}[caption={Function for the graphical display of data.}, label=sineplotfunctionlisting]
function plotFunction(x_data, y_data, name)
% Plots x-data against y-data and sets the display name.
%
% plotFunction(x_data, y_data, name)
%
% Arguments:
% x_data: vector of the x-data
% y_data: vector of the y-data
% name : the displayname
plot(x_data, y_data, 'displayname', name)
end
\end{lstlisting}
\paragraph{III. One script to rule them all}
The last task is to write a script to control the calculations and the
plotting of the sine waves. Classically, such controlling of sub-tasks
is handled in a script. It could be done with a function but if
there is a reason for the existence of scripts then it is this.
Again, we need to specify what needs to be done:
\begin{enumerate}
\item The task is to display multiple sine waves. We want to have a
fixed frequency but there should be various amplitudes displayed. An
appropriate name for the script (that is the name of the m-file)
might be \file{plotMultipleSinewaves.m}.
\item What information do we need? We need to define the
\code{frequency}, the range of \code{amplitudes}, the
\code{duration} of the sine waves, and the temporal resolution given
as the time between two points in time, i.e. the \code{stepsize}.
\item We then need to create an empty figure, and work through the
range of \code{amplitudes}. We must not forget to switch \code{hold
on} if we want to see all the sine waves in one plot.
\end{enumerate}
The implementation is shown in listing~\ref{sinesskriptlisting}.
\begin{lstlisting}[caption={Control script for the plotting of sine waves.},label=sinesskriptlisting]
amplitudes = 0.25:0.25:1.25;
frequency = 2.0;
duration = 10.0; % seconds
stepsize = 0.01; % seconds
figure()
hold on
for i = 1:length(amplitudes)
[x_data, y_data] = sinewave(frequency, amplitudes(i), ...
duration, stepsize);
plotFunction(x_data, y_data, sprintf('freq: %5.2f, ampl: %5.2f',...
frequency, amplitudes(i)))
end
hold off
legend('show')
\end{lstlisting}
\begin{exercise}{plotMultipleSinewaves.m}{}
Extend the program to also plot a range of frequencies.
\pagebreak[4]
\end{exercise}
\begin{ibox}[t]{\label{whenscriptsbox}When to use scripts and functions}
It is easily possible to solve any programming problem only with
functions. Avoiding functions is also possible but gets very messy
and extremely hard to debug when the project grows. The effort for
avoiding naming conflicts or cleaning up the workspace increases
with the size of the project. Generally, functions should be
preferred over scripts in almost all cases. There are, however,
situations when a script offers advantages.
\begin{minipage}{0.5\textwidth}
\includegraphics[width=0.9\columnwidth]{simple_program}
\end{minipage}
\begin{minipage}{0.5\textwidth}
\textbf{Controlling a task.} Solving tasks that involve calling
sub-routines, as we did above, is one of these situations (see
figure). The script calls functions and takes care of passing the
correct arguments and storing the return values. \linebreak
\textbf{Interactive development.} During the development phase a
script grows as one \emph{interactively} works on the command
line. Commands that have been tested are then transferred to the
script.
\end{minipage}\vspace{0.25cm}
Interactive programming is one of the main strengths of
\matlab{}. Interactive refers to the interaction between the
commands executed on the command line and the variables stored in
the workspace. The immediate feedback on whether a certain operation works
on the data stored in a variable or whether the returned results are
correct speeds up the development process.
\textbf{Special solutions.} Program code that is only valid for one very
specific problem may reside in a script. As soon as there is code
duplication or it grows too large, it is high time to consider
extracting features into separate functions.
\end{ibox}