diff --git a/plotting/lecture/plotting.tex b/plotting/lecture/plotting.tex index 07e3ae4..56af0eb 100644 --- a/plotting/lecture/plotting.tex +++ b/plotting/lecture/plotting.tex @@ -419,14 +419,14 @@ output format (box\,\ref{graphicsformatbox}). \end{tabular} \end{minipage} - It is often meaningful to store of data plots generated by \matlab{} - using a vector graphics format. When in doubt they can usually be + It is advisable to store data plots generated by \matlab{} + using a vector graphics format. When in doubt they can usually be easily converted to a bitmap format. The way from a bitmap to a vector graphic is not possible without a loss in quality. Storing a - plot that contains a very large set of graphical elements (e.g.\,a + plot that contains very large sets of graphical elements (e.g.\,a raster-plot showing thousands of action potentials) may, on the other hand, lead to very large files that can be hard to - handle. Saving such a plot using a bitmap format may be more + handle. Saving such plots using a bitmap format may be more efficient. \end{ibox} @@ -584,24 +584,24 @@ its properties. See the \matlab{} help for more information. \begin{figure}[ht] \includegraphics[width=0.9\linewidth]{errorbars} - \titlecaption{Adding error bars to a line plot}{\textbf{A} + \titlecaption{Indicating the estimation error in plots.}{\textbf{A} symmetrical error around the mean (e.g.\ using the standard deviation). \textbf{B} Errorbars of an asymmetrical distribution of the data (note: the average value is now the median and the errors are the lower and upper quartiles). \textbf{C} A shaded area is used to illustrate the spread of the data. See - listing\,\ref{errorbarlisting}}\label{errorbarplot} + listing\,\ref{errorbarlisting} for A and B and listing\,\ref{errorbarlisting2} for C.}\label{errorbarplot} \end{figure} -\lstinputlisting[caption={Illustrating estimation errors.
Script that - creates \figref{errorbarplot}.}, +\lstinputlisting[caption={Illustrating estimation errors using error bars. Script that + creates \figref{errorbarplot} A, B.}, label=errorbarlisting, firstline=13, lastline=29, basicstyle=\ttfamily\scriptsize]{errorbarplot.m} \subsubsection{Fill} For a few years now it has become fancy to illustrate the error not using errorbars but by drawing a shaded area around the mean. Beside -their fancyness there is also a real argument in favor of using error +the fanciness there is also a real argument in favor of using error areas instead of errorbars: In case you have a lot of data points with respective errorbars such that they would merge in the figure it is cleaner and probably easier to read and handle if one uses an error @@ -613,8 +613,8 @@ with the vertex points of the polygon. For each x-value we now have two y-values (average minus error and average plus error). Further, we want the vertices to be connected in a defined order. One can achieve this by going back and forth on the x-axis; we append a reversed -version of the x-values to the original x-values using the \code{cat} -and inversion is done using the \code{fliplr} command (line 3 in +version of the x-values to the original x-values using \code{cat} and +\code{fliplr} for concatenation and inversion, respectively (line 3 in listing \ref{errorbarlisting2}; Depending on the layout of your data you may need concatenate along a different dimension of the data and use \code{flipud} instead). The y-coordinates of the polygon vertices @@ -625,27 +625,26 @@ property defines the transparency (or rather the opaqueness) of the area. The provided alpha value is a number between 0 and 1 with zero leading to invisibility and a value of one to complete opaqueness. Finally, we use the normal plot command to draw a line -connecting the average values. +connecting the average values (line 12). -\lstinputlisting[caption={Illustrating estimation errors.
Script that - creates \figref{errorbarplot}.}, label=errorbarlisting2, +\lstinputlisting[caption={Illustrating estimation errors using a shaded area. Script that + creates \figref{errorbarplot} C.}, label=errorbarlisting2, firstline=30, basicstyle=\ttfamily\scriptsize]{errorbarplot.m} \subsection{Annotations, text} -Sometimes want to highlight certain parts of a plot or simply add an -annotation that does not fit or belong to the legend. In these cases -we can use the \code[text()]{text()} or -\code[annotation()]{annotation()} function to add this information to -the plot. While \varcode{text} simply prints out the given text string -at the defined position (for example line in +The \code[text()]{text()} and \code[annotation()]{annotation()} functions are +used for highlighting certain parts of a plot or simply adding an +annotation that does not fit or does not belong in the legend. +While \varcode{text} simply prints out the given text string at the +defined position (for example line in listing\,\ref{regularsubplotlisting}) the \varcode{annotation} function allows to add some more advanced highlights like arrows, lines, ellipses, or rectangles. Figure\,\ref{annotationsplot} shows some examples, the respective code can be found in listing\,\ref{annotationsplotlisting}. For more options consult the -documentation. +\matlab{} help. \begin{figure}[ht] \includegraphics[width=0.5\linewidth]{annotations} @@ -716,18 +715,16 @@ Lissajous figure. The basic steps are: - - \section{Summary} A good plot of scientific data displays the data completely and seriously without too many distractions. Misleading or suggestive plots as may result from perspective presentations, inappropriate -scaling of axes of symbols should be avoided. +scaling of axes and symbols should be avoided. \noindent When combining several line plots within the same figure one should consider adapting color \textbf{and} line style (solid, dashed, dotted. etc.)
to make the distinguishable even in black-and-white -prints. Combinations of red and green are no good choice since they +prints. Combinations of red and green are not a good choice since they cannot be distinguished by people with red-green blindness. \vspace{2ex} @@ -737,6 +734,8 @@ Key ingredients for a good data plot: \item Complete labeling. \item Plotted lines and curves must be distinguishable. \item No suggestive or misleading presentation. -\item The right balance of line width, font size and size of the figure. +\item The right balance of line width, font size and size of the + figure. This may depend on the purpose; for presentations, slightly + thicker lines help. \item Error bars wherever they are appropriate. \end{itemize} diff --git a/programmingstyle/code/calculateSines.m b/programmingstyle/code/calculateSines.m index 8a2139a..b25388e 100644 --- a/programmingstyle/code/calculateSines.m +++ b/programmingstyle/code/calculateSines.m @@ -1,5 +1,7 @@ function sines = calculateSines(x, amplitudes, frequencies) - % Function calculates sinewaves with all combinations of + % sines = calculateSines(x, amplitudes, frequencies) + % + % Function calculates sinewaves with all combinations of % given amplitudes and frequencies. % Arguments: x, a vector of radiants for which the sine should be % computed. @@ -8,11 +10,11 @@ function sines = calculateSines(x, amplitudes, frequencies) % % Returns: a 3-D Matrix of sinewaves, 2nd dimension represents % the amplitudes, 3rd the frequencies.
- + sines = zeros(length(x), length(amplitudes), length(frequencies)); for i = 1:length(amplitudes) - sines(:,i,:) = sinesWithFrequencies(x, amplitudes(i), frequencies); + sines(:,i,:) = sinesWithFrequencies(x, amplitudes(i), frequencies); end end diff --git a/programmingstyle/lecture/programmingstyle.tex b/programmingstyle/lecture/programmingstyle.tex index 9a048e7..735082a 100644 --- a/programmingstyle/lecture/programmingstyle.tex +++ b/programmingstyle/lecture/programmingstyle.tex @@ -6,20 +6,19 @@ %\selectlanguage{ngerman} Cultivating a good code style not a matter of good taste but is -a key ingredient for understandability, maintainability and in the end +a key ingredient for understandability, maintainability and, in the end, facilitates reproducibility of scientific results. Programs should be written and structured in a way that supports outsiders as well the author himself --- a few weeks or months after it was written --- to understand the programs' rationale. Clean code pays off for the original author as well as others that are supposed to use the code. - Clean code addresses several issues: \begin{enumerate} \item The programs' structure. \item Naming of scripts and functions. \item Naming of variables and constants. -\item Application of indentation empty lines to define blocks. +\item Application of indentation and empty lines to define blocks. \item Use of comments and inline documentation. \item Delegation of repeated code to functions and dedicated subroutines. @@ -29,13 +28,12 @@ Clean code addresses several issues: While introducing scripts and functions we suggested a typical program layout (box\,\ref{whenscriptsbox}). The idea is to create a single -entry point by having one script that controls the rest of program by -managing data and results and calling functions that work on the data -and produce the results. 
Applying this structure makes it easy to -understand the flow of the program but two questions remain: (i) How -to organize the files on the file system and (ii) how to name them -that the controlling script is easily identified among the other -\codeterm{m-files}. +entry point by having one script that controls the rest of the program +by calling functions that work on the data and managing the +results. Applying this structure makes it easy to understand the flow +of the program but two questions remain: (i) How to organize the files +on the file system and (ii) how to name them so that the controlling +script is easily identified among the other \codeterm{m-files}. Upon installation ``MATLAB'' creates a folder called \emph{MATLAB} in the user space (Windows: My files, Linux: Documents, MacOS: @@ -110,24 +108,11 @@ Box~\ref{matlabpathbox}). \end{lstlisting} \end{ibox} -\section{Naming scripts and functions} -\matlab{} will search the search path (Box \ref{matlabpathbox}) -exclusively by name. It is case-sensitive this implies that the files -\file{test\_function.m} and \file{Test\_function.m} are two different -things. It is self-evident that choosing such names is nonsensical -because the name contains no cue about the difference between the two -and it further tells close to nothing about the purpose. Finding good -names is not trivial sometimes it is harder than the programming -itself. Expressive names, however, pay off! Expressive means that the -name provides information about the purpose. - -\begin{important}[Naming scripts and functions] - Function and script names should be expressive in the sense that the - name provides information about the function's purpose - (\file{estimate\_firingrate.m} tells much more than - \file{exercise1.m}). Choosing a good name replaces large parts of - the documentation.
-\end{important} +\section{Naming things} +The dictum of good code style is: ``Program code must be readable.'' +Expressive names are extraordinarily important in this respect. Even +if it is tricky to find expressive names that are not overly long, +naming should be taken seriously. \matlab{} has a few rules about names: Names must not start with a number, they must not contain blanks or other special characters like @@ -144,26 +129,41 @@ patterns: There are other common patterns such as the \emph{camelCase} in which the first character of compound words is capitalized. Other -conventions use the underscore to separate the individual words -\emph{snake\_case}. A function that counts the number of action +conventions use the underscore to separate the individual words +(\emph{snake\_case}). A function that counts the number of action potentials could be named \file{spikeCount.m} or \file{spike\_count.m}. +The same naming rules apply for scripts and functions as well as +variables and constants. -\section{Naming variables and constants} +\subsection{Naming scripts and functions} +\matlab{} will search the search path (Box \ref{matlabpathbox}) +exclusively by name. This search is case-sensitive, which implies that +the files \file{test\_function.m} and \file{Test\_function.m} are two +different things. It is self-evident that choosing such names is +nonsensical because the tiny difference in the name contains no cue +about the difference between the two versions and the function names +themselves tell close to nothing about the purpose. Finding good names +is not trivial. Sometimes it is harder than the programming +itself. Choosing \emph{expressive names} that provide information about a +function's purpose, however, pays off! + +\begin{important}[Naming scripts and functions] + Names of functions and scripts should be expressive in the sense + that the name provides information about the function's purpose
+ (\file{estimate\_firingrate.m} tells much more than + \file{exercise1.m}). Choosing a good name replaces large parts of + the documentation. +\end{important} -\matlab{} applies the same rules for naming variables and constants as -for the naming of scripts and functions. The dictum of good -code style is: ``Program code must be readable.'' Expressive -names are extraordinarily important in this respect. Even if it is -tricky to find expressive names that are not overly long, naming -should be taken seriously. +\subsection{Naming variables and constants} While the names of scripts and functions describe the purpose, names of variables describe the stored content. A variable storing the -average number of actions potentials could be called +average number of action potentials could be called\\ \varcode{average\_spike\_count}. If this variable is meant to store -multiple spike counts the plural form would be appropriate +multiple spike counts the plural form would be appropriate\\ (\varcode{average\_spike\_counts}). The control variables used in the head of a \code{for} loop are often @@ -190,7 +190,7 @@ to comprehend. Even though the \matlab{} language (as many others) does not enforce indentation, indentation is very powerful for defining coherent blocks. The \matlab{} editor supports this by an auto-indentation mechanism. A selected section of the code and be -automatically indented by pressing the \keycode{Ctrl-I} combination. +automatically indented by pressing \keycode{Ctrl-I}. Interspersing empty lines is very helpful to separate regions in the code that belong together. Too many empty lines, however lead to @@ -198,8 +198,11 @@ hard-to-read code because it might require more space than a granted by the screen and thus takes overview.
The following two listings show basically the same implementation of a -random walk once in a rather chaotic version (listing -\ref{chaoticcode}) then in cleaner way (listing \ref{cleancode}) +random walk\footnote{A random walk is a simple simulation of Brownian + motion. In each simulation step an agent takes a step in a + randomly chosen direction.} once in a rather chaotic version +(listing \ref{chaoticcode}) then in a cleaner way (listing +\ref{cleancode}). \begin{lstlisting}[label=chaoticcode, caption={Chaotic implementation of the random-walk.}] num_runs = 10; max_steps = 1000; @@ -245,25 +248,24 @@ end \section{Using comments} It is common to provide extra information about the meaning of program -code by adding comments to it. In \matlab{} comments are indicated by -the percent character \code{\%}. Anything that is written in the -respective line following the percent is ignored and considered a -comment. When used sparsely comments can immensely important for -understanding. Comments are short sentences that describe the meaning -of the (following) lines in the program code. During the initial -implementation of a function they can be used to guide the development -but have the tendency to blow up the code and decrease readability. By -choosing expressive variable and function names, most lines should be -self-explanatory. - -For example stating the obvious does not really help:\\ -\varcode{ x = x + 2; \% add two to x}\\ +code by adding comments. In \matlab{} comments are indicated by the +percent character \code{\%}. Anything that follows the percent +character in a line is ignored and considered a comment. When used +sparingly comments can be immensely helpful. Comments +are short sentences that describe the meaning of the (following) lines +in the program code. During the initial implementation of a function +they can be used to guide the development but have the tendency to +blow up the code and decrease readability.
By choosing expressive +variable and function names, most lines should be self-explanatory. + +For example stating the obvious does not really help and should be +avoided:\\ \varcode{ x = x + 2; \% add two to x}\\ \begin{important}[Using comments] \begin{itemize} \item Comments describe the rationale of the respective code block. - \item Comments are good and helpful --- they have to be true, however! - \item A wrong comment is worse than a non-existent comment! + \item Comments are good and helpful --- they must be true, however! + \item A wrong comment is worse than a non-existent one! \item Comments must be maintained just as the code. Otherwise they may become wrong and worse than meaningless! \end{itemize} @@ -303,7 +305,7 @@ well documented function. Comments and empty lines are used to organize code into logical blocks and to briefly explain what they do. Whenever one feels tempted to do this, one could also consider to delegate the respective task to a -function. In most cases this is preferable. +function. In most cases this is preferable. Not delegating the tasks leads to very long \codeterm{m-files} which can be confusing. Sometimes such a code is called ``spaghetti @@ -325,9 +327,9 @@ Generally, functions live in their own \codeterm{m-files} that have the same name as the function itself. Delegating tasks to functions thus leads to a large set of \codeterm{m-files} which increases complexity and may lead to confusion. If the delegated functionality -is used in multiple instances, it is advisable to do so. On the other -hand, when the delegated functionality is only used within the context -of another function \matlab{} allows to define +is used in multiple instances, it is still advisable to do so. On the +other hand, when the delegated functionality is only used within the +context of another function \matlab{} allows to define \codeterm[function!local]{local functions} and \codeterm[function!nested]{nested functions} within the same file. 
Listing \ref{localfunctions} shows an example of a local @@ -336,17 +338,19 @@ function definition. \pagebreak[3] \lstinputlisting[label=localfunctions, caption={Example for local functions.}]{calculateSines.m} -Local function live in the same \codeterm{m-file} as the main function -and are only available in this context. Each local function has its -own \codeterm{scope}, that is, the local function can not access (read -or write) variables of the calling function. +\emph{Local functions} live in the same \codeterm{m-file} as the main +function and are only available in this context. Each local function +has its own \codeterm{scope}, that is, the local function can not +access (read or write) variables of the calling function. Interaction +with the local function requires passing all required arguments and +taking care of the return values of the function. -This is different in so called \codeterm[function!nested]{nested - functions}. These are defined within the body of the parent function -(between the keywords \code{function} and \code{end}) and have full -access to all variables defined in the parent function. Working (in -particular changing) the parent's variables is handy on the one side, -but is also risky. One should take care when defining nested functions. +\emph{Nested functions} are different in this respect. They are +defined within the body of the parent function (between the keywords +\code{function} and \code{end}) and have full access to all variables +defined in the parent function. Working with (in particular changing) the +parent's variables is handy on the one hand, but is also risky. One +should take care when defining nested functions. \section{Specifics when using scripts} @@ -355,7 +359,7 @@ A similar problem as with nested function arises when using scripts become available in the global \codeterm{Workspace}.
There is the risk of name conflicts, that is, a called sub-script redefines or uses the same variable name and may \emph{silently} change its content. The -user will not be notified by this change and the calling script may +user will not be notified about this change and the calling script may expect a completely different content. Bugs that are based on such mistakes are hard to find since the program itself looks perfectly fine. @@ -379,12 +383,15 @@ provides important information to track and fix the bug. \item Scripts should work independently of existing variables in the global workspace. - \item It is advisable to start a script with deleting variables - (\code{clear}) from the workspace and most of the times it is also - good to close all open figures (\code{close all}). + \item Often it is advisable to start a script by deleting + variables (\code{clear}) from the workspace and most of the time + it is also good to close all open figures (\code{close all}). Be + careful if the respective script has been called by another one. \item Clean up the workspace at the end of a script. Delete (\code{clear}) all variables that are no longer needed. + + \item Consider writing functions instead of scripts. \end{itemize} \end{important} diff --git a/regression/lecture/regression.tex b/regression/lecture/regression.tex index d94374c..dae32aa 100644 --- a/regression/lecture/regression.tex +++ b/regression/lecture/regression.tex @@ -383,7 +383,7 @@ introduce how this can be done without using the gradient descent Problems that involve nonlinear computations on parameters, e.g. the rate $\lambda$ in the exponential function $f(x;\lambda) = -\exp(\lambda x)$, do not have an analytical solution. To find minima +e^{\lambda x}$, do not have an analytical solution. To find minima in such functions numerical methods such as the gradient descent have to be applied.
@@ -396,7 +396,7 @@ objective functions while more specialized functions are specifically designed for optimizations in the least square error sense \matlabfun{lsqcurvefit()}. -\newpage +%\newpage \begin{important}[Beware of secondary minima!] Finding the absolute minimum is not always as easy as in the case of the linear equation. Often, the error surface has secondary or local