[debugging] some progress

2017-10-16 18:23:01 +02:00 · 2017-10-16 18:23:01 +02:00 · c41ddcce90
commit c41ddcce90
parent 822cc2411b
1 changed files with 125 additions and 23 deletions
--- a/debugging/lecture/debugging.tex
+++ b/debugging/lecture/debugging.tex
@ -1,7 +1,10 @@
 \chapter{Debugging}

-When we write a program from scratch we almost always make
-mistakes. Accordingly a quite substantial amount of time is invested
+\centerline{\includegraphics[width=0.7\textwidth]{xkcd_debugger}\rotatebox{90}{\footnotesize\url{www.xkcd.com}}}\vspace{4ex}
+
+
+When writing a program from scratch we almost always make
+mistakes. Accordingly, a quite substantial amount of time is invested
 into finding and fixing errors. This process is called
 \codeterm{debugging}. Don't be frustrated that a self-written program
 does not work as intended and produces errors. It is quite exceptional
@ -14,9 +17,40 @@ some hints that help to minimize errors.

 \section{Types of errors and error messages}

-There are a number of different classes of programming errors.
+There are a number of different classes of programming errors and it
+is good to know the common ones. Some programming errors lead to
+corrupted syntax or to invalid operations. In these cases \matlab{}
+will \codeterm{throw} an error. Throwing an error ends the execution
+of a program and an error message is shown in the command window.
+With such a message \matlab{} tries to explain what went wrong and
+to provide a hint on the possible
+cause.

-\paragraph{\codeterm{Syntax error}:}
+Bugs that lead to the termination of the execution may be annoying but
+are generally easier to find and fix than logical errors, which stay
+hidden while the results of, e.g., an analysis seem to be correct.
+
+\begin{important}[Try --- catch]
+There are ways to \codeterm{catch} errors during \codeterm{runtime}
+(i.e. when the program is executed) and handle them in the program.
+
+\begin{lstlisting}[label=trycatch, caption={Try catch clause}]
+  try
+     y = function_that_throws_an_error(x);
+  catch
+     y = 0;
+  end
+\end{lstlisting}
+
+This way of handling errors may seem rather convenient but is
+risky. Catching the error of a function in the
+\codeterm{catch} clause will keep your command line clean but may
+obscure logical errors! Take care when using the \codeterm{try-catch
+  clause}.
+\end{important}
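+
+A more cautious variant, sketched here under the assumption that a
+fallback value of zero is acceptable, stores the caught error in a
+variable (named \varcode{err} for illustration) and reports it
+instead of discarding it silently.
+
+\begin{lstlisting}[label=trycatchreport, caption={Try catch clause that reports the caught error.}]
+  try
+     y = function_that_throws_an_error(x);
+  catch err
+     % tell the user what went wrong before falling back to a default
+     warning('Caught an error (%s), using y = 0 instead.', err.message);
+     y = 0;
+  end
+\end{lstlisting}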
+
+
+\subsection{\codeterm{Syntax error}}
 The most common and easiest to fix type of error. A syntax error
 violates the rules (spelling and grammar) of the programming
 language. For example every opening parenthesis must be matched by a
@ -28,26 +62,94 @@ the editor will point out and highlight most \codeterm{syntax error}s.
 >> mean(random_numbers
                      |
 Error: Expression or statement is incorrect--possibly unbalanced (, {, or [.
- 
+
 Did you mean:
 >>   mean(random_numbers)
 \end{lstlisting}

+\subsection{\codeterm{Indexing error}}
+Second on the list of common errors are the indexing errors. Usually
+\matlab{} gives rather precise information about the cause, once you
+know what the messages mean. Consider the following code.
+
+\begin{lstlisting}[label=indexerror, caption={Indexing errors.}]
+>> my_array = (1:100);
+>> % first try: index 0
+>> my_array(0)
+Subscript indices must either be real positive integers or logicals.
+
+>> % second try: negative index
+>> my_array(-1)
+Subscript indices must either be real positive integers or logicals.
+
+>> % third try: a floating point number
+>> my_array(5.7)
+Subscript indices must either be real positive integers or logicals.
+
+>> % fourth try: a character
+>> my_array('z')
+Index exceeds matrix dimensions.
+
+>> % fifth try: another character
+>> my_array('A')
+ans =
+     65 % 'A' is converted to its character code 65
+\end{lstlisting}
+
+The first two indexing attempts in listing \ref{indexerror}
+are rather clear. We are trying to access elements with indices that
+are invalid. Remember, indices in \matlab{} start with 1. Negative
+numbers and zero are not permitted.  In the third attempt we index
+using a floating point number. This fails because indices have to be
+'integer' values.  Using a character as an index (fourth attempt)
+leads to a different error message that says that the index exceeds
+the matrix dimensions. This indicates that we are trying to read data
+beyond the end of our variable \codevar{my\_array} which has 100
+elements.
+One could have expected that the character is an invalid index, but
+apparently it is valid but simply too large. The fifth attempt
+finally succeeds. But why? \matlab{} implicitly converts the
+\codeterm{char} to a number (its character code, 65 for 'A' and 122
+for 'z') and uses this number to address the element in \varcode{my\_array}.
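+
+The character codes behind this conversion can be inspected by
+converting the characters explicitly. The following is a minimal
+sketch; the listing label is chosen here for illustration only.
+
+\begin{lstlisting}[label=charcodes, caption={Characters are converted to their character codes.}]
+>> % 'A' has the code 65, a valid index into the 100 elements
+>> double('A')
+ans =
+     65
+
+>> % 'z' has the code 122, which is beyond the 100 elements
+>> double('z')
+ans =
+    122
+\end{lstlisting}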
+
+\subsection{\codeterm{Assignment error}}
+This error occurs when we want to write data into a vector.
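+A typical case, given here as an assumption about the situation that
+is meant, is an assignment in which the number of provided values
+does not match the number of addressed elements.
+
+\begin{lstlisting}[label=assignmenterror, caption={Sketch of an assignment with mismatching sizes.}]
+>> my_array = zeros(1, 100);
+>> % ten elements are addressed but only five values are provided
+>> my_array(1:10) = 1:5;   % throws an error
+>> % with matching sizes the assignment works
+>> my_array(1:5) = 1:5;
+\end{lstlisting}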

-\Paragraph{\codeterm{Indexing error}:}
-\paragraph{\codeterm{Assignment error}:}
 \paragraph{Name error:}
 \paragraph{Arithmetic error:}
-\paragraph{Logical error:}
+
+\section{Logical error}
+Sometimes a program runs smoothly and terminates without any
+error. This, however, does not necessarily mean that the program is
+correct. We may have made a \codeterm{logical error}. Logical errors
+are hard to find. \matlab{} has no chance to detect them and cannot
+help us fix the bugs originating from them. We are on our own but
+there are a few strategies that should help us.
+
+\begin{enumerate}
+\item Be sceptical: especially when a program executes without any
+  complaint on the first try.
+\item Clean code: Structure your code so that you can easily read
+  it. Comment, but only where necessary. Correctly indent your
+  code. Use descriptive variable and function names.
+\item Keep it simple (below).
+\item Read error messages and try to understand what \matlab{} wants
+  to tell you.
+\item Use scripts and functions and call them from the command
+  line. \matlab{} can then provide you with more information. It will
+  then point to the line where the error happens.
+\item If you still find yourself in trouble: Apply debugging
+  strategies to find and fix bugs (below).
+\end{enumerate}


-\section{Avoiding errors}
+\subsection{Avoiding errors}
 It would be great if we could just sit down, write a program, run it
 and be done. Most likely this will not happen. Rather, we will make
 mistakes and have to debug the code. There are a few guidelines that
 help to reduce the number of errors.

-\subsection{Keep it small and simple}
+\subsection{The KISS principle: 'Keep it small and simple' or 'simple and stupid'}

 \shortquote{Debugging time increases as a square of the program's
  size.}{Chris Wenham}
@ -67,15 +169,15 @@ the script is just hard.
  when you write it, how will you ever debug it?}{Brian Kernighan}

 Many tasks within an analysis can be squashed into a single line of
-code. This saves some space in the file, reduces the effort of coming up
-with variable names and simply looks so much more competent than a
+code. This saves some space in the file, reduces the effort of coming
+up with variable names and simply looks so much more competent than a
 collection of very simple lines. Consider the following listing
 (listing~\ref{easyvscomplicated}). Both parts of the listing solve the
 same problem but the second one breaks the task down to a sequence of
-easy-to-understand commands. Finding logical and also syntactic errors is
-much easier in the second case. The first version is perfectly fine
-but it requires a deep understanding of the applied
-functions and also the task at hand. 
+easy-to-understand commands. Finding logical and also syntactic errors
+is much easier in the second case. The first version is perfectly fine
+but it requires a deep understanding of the applied functions and also
+the task at hand.

 \begin{lstlisting}[label=easyvscomplicated, caption={Converting a series of spike times into the firing rate as a function of time. Many tasks can be solved with a single line of code. But is this readable?}]
 % the one-liner
@ -88,13 +190,13 @@ rate(spike_indices) = 1;
 rate = conv(rate, kernel, 'same');
 \end{lstlisting}

-The preferred way depends on several considerations. (i)
-How deep is your personal understanding of the programming language?
-(ii) What about the programming skills of your target audience or
-other people that may depend on your code? (iii) Is one solution
-faster or uses less resources than the other? (iv) How much do you
-have to invest into the development of the most elegant solution
-relative to its importance in the project? The decision is up to you.
+The preferred way depends on several considerations. (i) How deep is
+your personal understanding of the programming language?  (ii) What
+about the programming skills of your target audience or other people
+that may depend on your code? (iii) Is one solution faster or does it
+use fewer resources than the other? (iv) How much do you have to invest
+into the development of the most elegant solution relative to its
+importance in the project? The decision is up to you.

 \subsection{Read error messages carefully and call programs from the command line.}