[debugging chap] some initial work

2016-10-08 16:50:54 +02:00 · 2016-10-08 16:50:54 +02:00 · 99801c5344
commit 99801c5344
parent 70ec6c9cdb
1 changed files with 109 additions and 58 deletions
--- a/debugging/lecture/debugging.tex
+++ b/debugging/lecture/debugging.tex
@ -1,6 +1,113 @@
 \chapter{Debugging}
-\shortquote{60\% of coding time is finding errors}{Famous last words}
+When we write a program from scratch we almost always make
 mistakes. Accordingly, a substantial amount of time is spent
 finding and fixing errors. This process is called
 \codeterm{debugging}. Don't be frustrated when a self-written program
 does not work as intended and produces errors. It is quite exceptional
 for a program to work on the first try; in fact, this
 should make you suspicious.
 In this chapter we will talk about typical mistakes, how to read and
 understand error messages, how to actually debug your program code and
 some hints that help to minimize errors.
 \section{Types of errors and error messages}
 There are a number of different classes of programming errors.
 \paragraph{\codeterm{Syntax error}:}
 The most common and easiest to fix type of error. A syntax error
 violates the rules (spelling and grammar) of the programming
 language. For example, every opening parenthesis must be matched by a
 closing one, and every \keyword{for} loop has to be closed by an
 \keyword{end}. Usually, the respective error messages are clear and
 the editor will point out and highlight most \codeterm{syntax error}s.
 \begin{lstlisting}[label=syntaxerror, caption={Unbalanced parenthesis error.}]
 >> mean(random_numbers
                      |
 Error: Expression or statement is incorrect--possibly unbalanced (, {, or [.
 Did you mean:
 >>   mean(random_numbers)
 \end{lstlisting}
 \paragraph{\codeterm{Indexing error}:}
 \paragraph{\codeterm{Assignment error}:}
 \paragraph{Name error:}
 \paragraph{Arithmetic error:}
 \paragraph{Logical error:}
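 A logical error, in the usual sense of the term, produces no error
 message at all: the program runs but does not compute what was
 intended. The following is a minimal sketch in which the matrix
 \code{data} and its layout (trials in rows, samples in columns) are
 made up for illustration.
 \begin{lstlisting}[caption={A logical error: the code runs, but averages over the wrong dimension.}]
 % hypothetical data: 5 trials (rows) with 100 samples each (columns)
 data = rand(5, 100);
 % runs without complaint, but averages across trials for each sample
 wrong_means = mean(data);       % size is 1 x 100
 % intended: one mean per trial, i.e. average along the second dimension
 trial_means = mean(data, 2);    % size is 5 x 1
 \end{lstlisting}
 Comparing the size of a result with the size you expect is often the
 quickest way to spot this kind of mistake.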
 \section{Avoiding errors}
 It would be great if we could just sit down, write a program, run it,
 and be done. Most likely this will not happen. Rather, we will make
 mistakes and have to debug the code. There are a few guidelines that
 help to reduce the number of errors.
 \subsection{Keep it small and simple}
 \shortquote{Debugging time increases as a square of the program's
  size.}{Chris Wenham}
 Break down your programming problems into small parts (functions) that
 do exactly one thing. This has already been discussed in the context
 of writing scripts and functions. In part, this is simply to avoid
 feeling overwhelmed by 1000 lines of code. Furthermore, with each task
 that you incorporate into the same script, the probability of naming
 conflicts (same or similar names for variables) increases. Remembering
 the meaning of a variable that was defined at the beginning of
 a long script is simply hard.
 \shortquote{Everyone knows that debugging is twice as hard as writing
  a program in the first place. So if you're as clever as you can be
  when you write it, how will you ever debug it?}{Brian Kernighan}
 Many tasks within an analysis can be squashed into a single line of
 code. This saves some space in the file, reduces the effort of coming up
 with variable names, and simply looks so much more competent than a
 collection of very simple lines. Consider
 listing~\ref{easyvscomplicated}. Both parts of the listing solve the
 same problem, but the second one breaks the task down into a sequence of
 easy-to-understand commands. Finding logical and also syntactic errors is
 much easier in the second case. The first version is perfectly fine,
 but it requires a deep understanding of the applied
 functions and of the task at hand.
 \begin{lstlisting}[label=easyvscomplicated, caption={Converting a series of spike times into the firing rate as a function of time. Many tasks can be solved with a single line of code. But is this readable?}]
 % the one-liner
 rate = conv(full(sparse(1, round(spike_times/dt), 1, 1, length(time))), kernel, 'same');
 % easier to read
 rate = zeros(size(time));
 spike_indices = round(spike_times/dt);
 rate(spike_indices) = 1;
 rate = conv(rate, kernel, 'same');
 \end{lstlisting}
 The preferred way depends on several considerations. (i)
 How deep is your personal understanding of the programming language?
 (ii) What about the programming skills of your target audience or
 of other people who may depend on your code? (iii) Is one solution
 faster, or does it use fewer resources, than the other? (iv) How much
 time do you have to invest in developing the most elegant solution
 relative to its importance in the project? The decision is up to you.
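 Whichever version you prefer, it is worth checking on a small test
 case that both give the same result. The following sketch does this
 for listing~\ref{easyvscomplicated}. The values of \code{dt},
 \code{time}, \code{spike\_times}, and \code{kernel} are made up for
 illustration and are not taken from a real data set.
 \begin{lstlisting}[caption={Comparing the one-liner and the step-by-step version on made-up test data.}]
 dt = 0.001;                            % assumed time step in seconds
 time = 0:dt:1-dt;                      % assumed time axis with 1000 samples
 spike_times = [0.1, 0.25, 0.5, 0.9];   % made-up spike times in seconds
 kernel = ones(1, 5) / 5;               % made-up boxcar kernel
 % the one-liner
 rate_a = conv(full(sparse(1, round(spike_times/dt), 1, 1, length(time))), kernel, 'same');
 % the step-by-step version
 rate_b = zeros(size(time));
 spike_indices = round(spike_times/dt);
 rate_b(spike_indices) = 1;
 rate_b = conv(rate_b, kernel, 'same');
 % both versions should agree up to floating-point rounding
 assert(max(abs(rate_a - rate_b)) < 1e-12);
 \end{lstlisting}
 Note that the two versions are not equivalent in every case: if two
 spikes fall into the same time bin, \code{sparse} sums the duplicate
 entries, whereas the indexed assignment simply sets the bin to one.
 Such differences are much easier to notice in the step-by-step version.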
 \subsection{Read error messages carefully and call programs from the command line}
 \section{Error messages}
 \begin{ibox}[tp]{\label{stacktracebox}Stacktrace or Stack Traceback}
 \end{ibox}
 It helps tremendously when scripts and functions that belong together
 are found in the same folder on the hard drive. It is therefore a good idea
@ -23,14 +130,6 @@ zu den Projektordnern einen \file{functions}-Ordner in dem Funktionen
 are kept that are needed in more than one project or
 analysis.
 Looking at this layout, it becomes apparent that certain names for
 functions and scripts are very likely to be used more than once. It is
 not surprising if a \file{load\_data.m} function appears in every
 analysis. As a rule, this will not lead to conflicts, because \matlab{}
 first searches the current folder for matching files (more information
 on the \matlab{} search path in Box~\ref{matlabpathbox}).
 \begin{figure}[tp]
  \includegraphics[width=0.75\textwidth]{no_bug}
  \titlecaption{\label{fileorganizationfig} Possible organization of
@ -41,36 +140,7 @@ aktuellen Ordner nach passenden Dateien sucht (mehr Information zum
 \end{figure}
-\begin{ibox}[tp]{\label{matlabpathbox}The \matlab{} search path}
+\Section{Naming of functions and scripts}
  The search path defines where \matlab{} looks for scripts and
  functions. When a function is called, \matlab{} first searches the
  current working directory for a match. If this search fails,
  \matlab{} works its way through the \codeterm{search path}
  (see figure). The \codeterm{search path} is a list of
  folders in which \matlab{} is to look for functions and scripts.
  The search for the called function proceeds from top
  to bottom. This means that, when names are
  identical, it can matter at which position in the
  search path the first match is found. Important: \matlab{} does
  not search recursively! If the desired function is located in a subfolder
  of the current working directory, and this subfolder is not
  explicitly included in the search path, the function will not be
  found.
  The search path can be displayed and changed both on the command line,
  with the commands \code{addpath()} and \code{userpath()}, and via the
  GUI shown in the figure. The GUI
  allows you to remove folders from the search path, to add new folders
  (optionally including all subfolders), or to change the order of the
  paths.
  The command \code{cd} is used to change the current working
  directory. \code{which} shows in which path a
  particular function was found. The current working directory
  is printed by calling \code{pwd} on the command line.
 \end{ibox}
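 The commands named in the box can be tried directly on the command
 line. The following is a minimal sketch. The subfolder name
 \file{functions} is only an example; if no such folder exists in the
 current working directory, \code{addpath} issues a warning.
 \begin{lstlisting}[caption={Inspecting the working directory and the search path.}]
 pwd                                   % print the current working directory
 which mean                            % show the file in which the function mean was found
 addpath(fullfile(pwd, 'functions'));  % add a (hypothetical) subfolder to the search path
 path                                  % list all folders on the search path
 \end{lstlisting}
 Because \matlab{} does not search recursively, the explicit call to
 \code{addpath} is what makes functions in the subfolder visible.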
 \section{Naming of functions and scripts}
 \matlab{} looks up functions and scripts exclusively by
 name. Upper and lower case matter. The
@ -115,26 +185,7 @@ Benennung der in \matlab{} vordefinierten Funktionen gewissen Mustern:
 \begin{lstlisting}[label=chaoticcode, caption={Cluttered implementation of the random walk.}]
 num_runs = 10;
 max_steps = 1000;
 positions = zeros(max_steps, num_runs);
 for run = 1:num_runs
 for step = 2:max_steps
 x = randn(1);
 if x<0
 positions(step, run)= positions(step-1, run)+1;
 elseif x>0
 positions(step,run)=positions(step-1,run)-1;
 end
 end
 end
 \end{lstlisting}
 \pagebreak[4]