From c41ddcce90923e1f946accf5e785e5be6d7089ae Mon Sep 17 00:00:00 2001 From: Jan Grewe Date: Mon, 16 Oct 2017 18:23:01 +0200 Subject: [PATCH] [debugging] some progress --- debugging/lecture/debugging.tex | 150 +++++++++++++++++++++++++++----- 1 file changed, 126 insertions(+), 24 deletions(-) diff --git a/debugging/lecture/debugging.tex b/debugging/lecture/debugging.tex index 6b61096..361f5f6 100644 --- a/debugging/lecture/debugging.tex +++ b/debugging/lecture/debugging.tex @@ -1,7 +1,10 @@ \chapter{Debugging} -When we write a program from scratch we almost always make -mistakes. Accordingly a quite substantial amount of time is invested +\centerline{\includegraphics[width=0.7\textwidth]{xkcd_debugger}\rotatebox{90}{\footnotesize\url{www.xkcd.com}}}\vspace{4ex} + + +When writing a program from scratch we almost always make +mistakes. Accordingly, a quite substantial amount of time is invested into finding and fixing errors. This process is called \codeterm{debugging}. Don't be frustrated that a self-written program does not work as intended and produces errors. It is quite exceptional @@ -14,9 +17,40 @@ some hints that help to minimize errors. \section{Types of errors and error messages} -There are a number of different classes of programming errors. +There are a number of different classes of programming errors and it +is good to know the common ones. Some programming errors +lead to corrupted syntax or invalid +operations, and \matlab{} will \codeterm{throw} an error. Throwing an +error ends the execution of a program and an error +message is shown in the command window. With such messages \matlab{} +tries to explain what went wrong and to provide a hint on the possible +cause. + +Bugs that lead to the termination of the execution may be annoying but +are generally easier to find and fix than logical errors that stay +hidden while the results of, e.g., an analysis seem correct. 
+ +\begin{important}[Try --- catch] +There are ways to \codeterm{catch} errors during \codeterm{runtime} +(i.e. when the program is executed) and handle them in the program. + +\begin{lstlisting}[label=trycatch, caption={Try catch clause}] + try + y = function_that_throws_an_error(x); + catch + y = 0; + end +\end{lstlisting} + +This way of handling errors may seem rather convenient but is +risky. Having a function throw an error and catching it in the +\codeterm{catch} clause will keep your command line clean but may +obscure logical errors! Take care when using the \codeterm{try-catch + clause}. +\end{important} + -\paragraph{\codeterm{Syntax error}:} +\subsection{\codeterm{Syntax error}} The most common and easiest to fix type of error. A syntax error violates the rules (spelling and grammar) of the programming language. For example every opening parenthesis must be matched by a @@ -28,26 +62,94 @@ the editor will point out and highlight most \codeterm{syntax error}s. >> mean(random_numbers | Error: Expression or statement is incorrect--possibly unbalanced (, {, or [. - + Did you mean: >> mean(random_numbers) \end{lstlisting} +\subsection{\codeterm{Indexing error}} +Second on the list of common errors are the indexing errors. Usually +\matlab{} gives rather precise information about the cause, once you +know what the messages mean. Consider the following code. + +\begin{lstlisting}[label=indexerror, caption={Indexing errors.}] +>> my_array = (1:100); +>> % first try: index 0 +>> my_array(0) +Subscript indices must either be real positive integers or logicals. + +>> % second try: negative index +>> my_array(-1) +Subscript indices must either be real positive integers or logicals. + +>> % third try: a floating point number +>> my_array(5.7) +Subscript indices must either be real positive integers or logicals. + +>> % fourth try: a character +>> my_array('z') +Index exceeds matrix dimensions. + +>> % fifth try: another character +>> my_array('A') +ans = + 65 % wtf ?!? 
+\end{lstlisting} + +The first two indexing attempts in listing~\ref{indexerror} +are rather clear. We are trying to access elements with indices that +are invalid. Remember, indices in \matlab{} start with 1. Negative +numbers and zero are not permitted. In the third attempt we index +using a floating point number. This fails because indices have to be +integer values. Using a character as an index (fourth attempt) +leads to a different error message that says that the index exceeds +the matrix dimensions. This indicates that we are trying to read data +beyond the end of our variable \varcode{my\_array} which has 100 +elements. +One could have expected that the character is an invalid index, but +apparently it is valid but simply too large. The fifth attempt +finally succeeds. But why? \matlab{} implicitly converts the +\codeterm{char} to its character code (65 for 'A', 122 for 'z') and uses this number to address the +element in \varcode{my\_array}. + +\subsection{\codeterm{Assignment error}} +This error occurs when we want to write data into a vector. -\Paragraph{\codeterm{Indexing error}:} -\paragraph{\codeterm{Assignment error}:} \paragraph{Name error:} \paragraph{Arithmetic error:} -\paragraph{Logical error:} - -\section{Avoiding errors} +\section{Logical error} +Sometimes a program runs smoothly and terminates without any +error. This, however, does not necessarily mean that the program is +correct. We may have made a \codeterm{logical error}. Logical errors +are hard to find: \matlab{} has no chance to detect them and +cannot help us fix bugs originating from them. We are on our own but +there are a few strategies that should help us. + +\begin{enumerate} +\item Be sceptical: especially when a program executes without any + complaint on the first try. +\item Clean code: Structure your code so that you can easily read + it. Comment, but only where necessary. Correctly indent your + code. Use descriptive variable and function names. +\item Keep it simple (below). 
+\item Read error messages, try to understand what \matlab{} wants to + tell you. +\item Use scripts and functions and call them from the command + line. \matlab{} can then provide you with more information. It will + then point to the line where the error happens. +\item If you still find yourself in trouble: Apply debugging + strategies to find and fix bugs (below). +\end{enumerate} + + +\subsection{Avoiding errors} It would be great if we could just sit down write a program, run it and be done. Most likely this will not happen. Rather, we will make mistakes and have to bebug the code. There are a few guidelines that help to reduce the number of errors. -\subsection{Keep it small and simple} +\subsection{The KISS principle: `Keep it small and simple' or `simple and stupid'} \shortquote{Debugging time increases as a square of the program's size.}{Chris Wenham} @@ -67,15 +169,15 @@ the script is just hard. when you write it, how will you ever debug it?}{Brian Kernighan} Many tasks within an analysis can be squashed into a single line of -code. This saves some space in the file, reduces the effort of coming up -with variable names and simply looks so much more competent than a +code. This saves some space in the file, reduces the effort of coming +up with variable names and simply looks so much more competent than a collection of very simple lines. Consider the following listing (listing~\ref{easyvscomplicated}). Both parts of the listing solve the same problem but the second one breaks the task down to a sequence of -easy-to-understand commands. Finding logical and also syntactic errors is -much easier in the second case. The first version is perfectly fine -but it requires a deep understanding of the applied -functions and also the task at hand. +easy-to-understand commands. Finding logical and also syntactic errors +is much easier in the second case. 
The first version is perfectly fine +but it requires a deep understanding of the applied functions and also +the task at hand. \begin{lstlisting}[label=easyvscomplicated, caption={Converting a series of spike times into the firing rate as a function of time. Many tasks can be solved with a single line of code. But is this readable?}] % the one-liner @@ -88,13 +190,13 @@ rate(spike_indices) = 1; rate = conv(rate, kernel, 'same'); \end{lstlisting} -The preferred way depends on several considerations. (i) -How deep is your personal understanding of the programming language? -(ii) What about the programming skills of your target audience or -other people that may depend on your code? (iii) Is one solution -faster or uses less resources than the other? (iv) How much do you -have to invest into the development of the most elegant solution -relative to its importance in the project? The decision is up to you. +The preferred way depends on several considerations. (i) How deep is +your personal understanding of the programming language? (ii) What +about the programming skills of your target audience or other people +that may depend on your code? (iii) Is one solution faster, or does it use +fewer resources than the other? (iv) How much do you have to invest +into the development of the most elegant solution relative to its +importance in the project? The decision is up to you. \subsection{Read error messages carefully and call programs from the command line.}