[debugging] some progress

Jan Grewe 2017-10-16 18:23:01 +02:00
parent 822cc2411b
commit c41ddcce90


@ -1,7 +1,10 @@
\chapter{Debugging}
When we write a program from scratch we almost always make
mistakes. Accordingly a quite substantial amount of time is invested
\centerline{\includegraphics[width=0.7\textwidth]{xkcd_debugger}\rotatebox{90}{\footnotesize\url{www.xkcd.com}}}\vspace{4ex}
When writing a program from scratch we almost always make
mistakes. Accordingly, a quite substantial amount of time is invested
into finding and fixing errors. This process is called
\codeterm{debugging}. Don't be frustrated that a self-written program
does not work as intended and produces errors. It is quite exceptional
@ -14,9 +17,40 @@ some hints that help to minimize errors.
\section{Types of errors and error messages}
There are a number of different classes of programming errors.
There are a number of different classes of programming errors and it
is good to know the common ones. Some programming errors lead to
corrupted syntax or invalid operations, and \matlab{} will
\codeterm{throw} an error. Throwing an error ends the execution of a
program, and an error message is shown in the command window. With
such messages \matlab{} tries to explain what went wrong and to
provide a hint on the possible cause.
Bugs that lead to the termination of the execution may be annoying but
are generally easier to find and fix than logical errors, which stay
hidden while the results of, e.g., an analysis seem to be correct.
\begin{important}[Try --- catch]
There are ways to \codeterm{catch} errors during \codeterm{runtime}
(i.e. when the program is executed) and handle them in the program.
\begin{lstlisting}[label=trycatch, caption={Try catch clause}]
try
    y = function_that_throws_an_error(x);
catch
    y = 0;
end
\end{lstlisting}
This way of solving errors may seem rather convenient but is
risky. Having a function throwing an error and catching it in the
\codeterm{catch} clause will keep your command line clean but may
obscure logical errors! Take care when using the \codeterm{try-catch
clause}.
\end{important}
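A somewhat safer variant is sketched in the following listing. It is
only an illustration and assumes the same hypothetical function
\texttt{function\_that\_throws\_an\_error} as above; the name
\texttt{exception} for the caught error object is an arbitrary
choice. Instead of silently replacing the result, the
\codeterm{catch} clause reports the original error message as a
warning, so that the problem stays visible.
\begin{lstlisting}[label=trycatchwarning, caption={Try catch clause that reports the caught error (sketch).}]
try
    y = function_that_throws_an_error(x);
catch exception
    % report the problem instead of hiding it silently
    warning('Falling back to y = 0: %s', exception.message);
    y = 0;
end
\end{lstlisting}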
\paragraph{\codeterm{Syntax error}:}
\subsection{\codeterm{Syntax error}}
The most common and easiest to fix type of error. A syntax error
violates the rules (spelling and grammar) of the programming
language. For example every opening parenthesis must be matched by a
@ -28,26 +62,94 @@ the editor will point out and highlight most \codeterm{syntax error}s.
>> mean(random_numbers
|
Error: Expression or statement is incorrect--possibly unbalanced (, {, or [.
Did you mean:
>> mean(random_numbers)
\end{lstlisting}
\subsection{\codeterm{Indexing error}}
Second on the list of common errors are the indexing errors. Usually
\matlab{} gives rather precise information about the cause, once you
know what the messages mean. Consider the following code.
\begin{lstlisting}[label=indexerror, caption={Indexing errors.}]
>> my_array = (1:100);
>> % first try: index 0
>> my_array(0)
Subscript indices must either be real positive integers or logicals.
>> % second try: negative index
>> my_array(-1)
Subscript indices must either be real positive integers or logicals.
>> % third try: a floating point number
>> my_array(5.7)
Subscript indices must either be real positive integers or logicals.
>> % fourth try: a character
>> my_array('z')
Index exceeds matrix dimensions.
>> % fifth try: another character
>> my_array('A')
ans =
65 % wtf ?!?
\end{lstlisting}
The first two indexing attempts in listing~\ref{indexerror}
are rather clear. We are trying to access elements with indices that
are invalid. Remember, indices in \matlab{} start with 1. Negative
numbers and zero are not permitted. In the third attempt we index
using a floating point number. This fails because indices have to be
`integer' values. Using a character as an index (fourth attempt)
leads to a different error message that says that the index exceeds
the matrix dimensions. This indicates that we are trying to read data
beyond the end of our variable \codevar{my\_array} which has 100
elements.
One could have expected that the character is an invalid index, but
apparently it is valid but simply too large. The fifth attempt
finally succeeds. But why? \matlab{} implicitly converts the
\codeterm{char} to its numeric character code and uses this number
to address the element in \varcode{my\_array}.
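The implicit conversion can be made visible by converting the
characters explicitly with \texttt{double}. The following short
listing is only meant as an illustration of the two cases above: the
character \texttt{'A'} corresponds to the number 65, which is a valid
index into an array with 100 elements, whereas \texttt{'z'}
corresponds to 122, which is larger than the number of elements.
\begin{lstlisting}[label=charconversion, caption={Characters are converted to numbers when used as indices.}]
>> double('A')
ans =
65
>> double('z')
ans =
122
\end{lstlisting}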
\subsection{\codeterm{Assignment error}}
This error occurs when we want to write data into a vector.
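The following listing sketches one typical case. It assumes that the
error meant here is a mismatch between the number of indexed elements
and the number of assigned values; the variable name \texttt{x} is
chosen only for illustration. The exact wording of the error message
depends on the \matlab{} version and is therefore not reproduced.
\begin{lstlisting}[label=assignmenterror, caption={Assigning the wrong number of values to a vector (sketch).}]
x = zeros(1, 5);
% three values are written to three positions: this works
x(1:3) = [1, 2, 3];
% four values do not fit into three positions: this throws an error
x(1:3) = [1, 2, 3, 4];
\end{lstlisting}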
\Paragraph{\codeterm{Indexing error}:}
\paragraph{\codeterm{Assignment error}:}
\paragraph{Name error:}
\paragraph{Arithmetic error:}
\paragraph{Logical error:}
\section{Avoiding errors}
\section{Logical error}
Sometimes a program runs smoothly and terminates without any
error. This, however, does not necessarily mean that the program is
correct. We may have made a \codeterm{logical error}. Logical errors
are hard to find. \matlab{} has no chance of detecting them and
cannot help us fix bugs originating from them. We are on our own, but
there are a few strategies that should help us.
\begin{enumerate}
\item Be sceptical: especially when a program executes without any
complaint on the first try.
\item Clean code: Structure your code so that you can easily read
  it. Comment, but only where necessary. Correctly indent your
  code. Use descriptive variable and function names.
\item Keep it simple (below).
\item Read error messages, try to understand what \matlab{} wants to
  tell you.
\item Use scripts and functions and call them from the command
line. \matlab{} can then provide you with more information. It will
then point to the line where the error happens.
\item If you still find yourself in trouble: Apply debugging
strategies to find and fix bugs (below).
\end{enumerate}
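A made-up example of a logical error is sketched in the following
listing. The code runs without any complaint, but the average is
divided by the wrong number of elements. Being sceptical here means
testing the code with an input for which the correct result is
known. The variable names and the test values are chosen only for
illustration.
\begin{lstlisting}[label=logicalerror, caption={A logical error that does not throw an error message (sketch).}]
x = [1, 2, 3, 4];
% logical error: the sum is divided by the wrong number of elements
average = sum(x) / (length(x) - 1);
% sanity check with a known result: the mean of 1, 2, 3, 4 is 2.5
expected = 2.5;
if abs(average - expected) > 1e-10
    disp('average deviates from the expected value!')
end
\end{lstlisting}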
\subsection{Avoiding errors}
It would be great if we could just sit down, write a program, run it,
and be done. Most likely this will not happen. Rather, we will make
mistakes and have to debug the code. There are a few guidelines that
help to reduce the number of errors.
\subsection{Keep it small and simple}
\subsection{The KISS principle: `Keep it small and simple' or `simple and stupid'}
\shortquote{Debugging time increases as a square of the program's
size.}{Chris Wenham}
@ -67,15 +169,15 @@ the script is just hard.
when you write it, how will you ever debug it?}{Brian Kernighan}
Many tasks within an analysis can be squashed into a single line of
code. This saves some space in the file, reduces the effort of coming up
with variable names and simply looks so much more competent than a
code. This saves some space in the file, reduces the effort of coming
up with variable names and simply looks so much more competent than a
collection of very simple lines. Consider the following listing
(listing~\ref{easyvscomplicated}). Both parts of the listing solve the
same problem but the second one breaks the task down to a sequence of
easy-to-understand commands. Finding logical and also syntactic errors is
much easier in the second case. The first version is perfectly fine
but it requires a deep understanding of the applied
functions and also the task at hand.
easy-to-understand commands. Finding logical and also syntactic errors
is much easier in the second case. The first version is perfectly fine
but it requires a deep understanding of the applied functions and also
the task at hand.
\begin{lstlisting}[label=easyvscomplicated, caption={Converting a series of spike times into the firing rate as a function of time. Many tasks can be solved with a single line of code. But is this readable?}]
% the one-liner
@ -88,13 +190,13 @@ rate(spike_indices) = 1;
rate = conv(rate, kernel, 'same');
\end{lstlisting}
The preferred way depends on several considerations. (i)
How deep is your personal understanding of the programming language?
(ii) What about the programming skills of your target audience or
other people that may depend on your code? (iii) Is one solution
faster or uses less resources than the other? (iv) How much do you
have to invest into the development of the most elegant solution
relative to its importance in the project? The decision is up to you.
The preferred way depends on several considerations. (i) How deep is
your personal understanding of the programming language? (ii) What
about the programming skills of your target audience or other people
that may depend on your code? (iii) Is one solution faster, or does
it use fewer resources, than the other? (iv) How much do you have to
invest into the development of the most elegant solution relative to
its importance in the project? The decision is up to you.
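Whether one solution is faster than the other, consideration (iii),
can be measured rather than guessed. The following listing is a
minimal sketch using \texttt{tic} and \texttt{toc}. It compares a
loop with a vectorized call on made-up random data and is not taken
from the firing rate example above. The measured times depend on the
machine and the \matlab{} version.
\begin{lstlisting}[label=timingsketch, caption={Measuring the execution time of two alternative solutions (sketch).}]
x = rand(1, 1000000);
% solution 1: sum up the elements in a loop
tic;
total_loop = 0;
for i = 1:length(x)
    total_loop = total_loop + x(i);
end
time_loop = toc;
% solution 2: use the built-in, vectorized function
tic;
total_vectorized = sum(x);
time_vectorized = toc;
fprintf('loop: %.4f s, vectorized: %.4f s\n', time_loop, time_vectorized);
\end{lstlisting}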
\subsection{Read error messages carefully and call programs from the command line.}