\chapter{Graphical representation of scientific data}

The ability to present scientific data adequately is one of the core competences needed to do science. We need to present data in a meaningful way that supports understanding of the data and the results without introducing biases.

\begin{figure}[hb!]
  \includegraphics[width=0.9\columnwidth]{convincing}
  \titlecaption{The consequences of bad plots may be severe.}{\url{www.xkcd.com}}\label{xkcdplotting}
\end{figure}

\section{The \matlab{} plotting system}

Plotting data in \matlab{} is straightforward for simple line plots. Calling \code[plot()]{plot(x, y)} creates a simple line plot. The resulting figure, however, lacks any annotations such as axis labels or a legend. There are two options to edit the plot: (i) the graphical user interface (GUI) or (ii) the command line. Both approaches have their merits. The GUI is ideal for experimenting; the command line is best suited for automation and for achieving a consistent layout across the figures and graphs of a paper or thesis.

\begin{figure}
  \begin{minipage}[t]{0.6\textwidth}
    \includegraphics[height=0.29\textheight]{plot_editor}
  \end{minipage}
  \begin{minipage}[t]{0.3\textwidth}
    \includegraphics[height=0.29\textheight]{property_editor}
  \end{minipage}
  \titlecaption{The graphical plot-editor.}{From the menu ``Tools $\rightarrow$ Edit Plot'' one can select the editor. Using the mouse, one can select different parts of the current plot (axes, lines, the figure background, etc.) and the interface changes to allow modifying their properties.
Some properties are not offered directly but are hidden behind the \emph{More Properties} button, which opens the \emph{Property Editor}.}\label{ploteditorfig}
\end{figure}

\vspace{1ex}
While it is very convenient to edit a figure using the GUI (\figref{ploteditorfig}), it is hard to re-create exactly the same plot later on or to transfer the changes made to one figure to another.

\matlab{} figures consist of several graphical objects:
\begin{enumerate}
\item \enterm[figure]{Figure}: This object represents the whole drawing area. It holds properties like the background color, the size of the figure/paper, and the placement of the axes on the paper.
\item \enterm[axes]{Axes}: The coordinate system for plotting the data. Defines properties like the scaling of the axes, the labeling, line widths, etc.
\item \enterm[lines]{Lines}: The drawn data lines. Holds properties like line width and color, the name associated with the line, marker size and many more.
\item \enterm[annotations]{Annotations}: Annotations like text boxes or arrows that can be used to highlight points or segments.
\item \enterm[legends]{Legends}: Legends of the data plot. One can define the style of the legend, its placement in the plot, etc.
\end{enumerate}
Each of these objects offers a number of settings. Some of them can be manipulated directly in the plot editor, others are available via the property editor.

\subsection{Avoiding manual editing of figures}
All properties that can be manipulated with the graphical interfaces can also be edited on the command line, or the respective commands can be included in a script or function. Creating the plot from inside a script or function has the advantage that one can apply the same settings to several figures and re-create the figure automatically when the data have changed or when the same kind of plot has to be created for a number of datasets.

\begin{important}[Why manual editing should be avoided.]
At first glance, manually editing a figure using tools such as CorelDRAW, Illustrator, etc.\,appears much more convenient and less complex than coding everything into the analysis scripts. This, however, is not entirely true. What if the figure has to be re-drawn or updated? Then the editing work starts all over again. Moreover, there is a considerable risk associated with manual editing: axes may be shifted, fonts may not be embedded into the final document, or annotations may have been copied and pasted between figures and are no longer valid. All of these mistakes can be found in publications and then require an erratum, which is not desirable. Even if it appears more cumbersome in the beginning, one should always try to create publication-ready figures directly from the data analysis tool, using scripts or functions to lay out the plot properly.
\end{important}

\subsection{Simple plotting}
Creating a simple line plot is rather easy. Assuming there exists a variable \varcode{y} in the \entermde{Arbeitsbereich}{workspace} that contains the measurement data, it is enough to call \code[plot()]{plot(y)}. At the first call of this function a new \enterm{figure} is opened and the data are drawn as a line plot. If you call this function repeatedly, the current plot is replaced unless the \code[hold]{hold on} command was issued before. If it was, the current plot is held and a second line is added to it. Calling \code[hold]{hold off} releases the plot and any subsequent plotting replaces the previous plot.

In our previous call to \varcode{plot} we provided just a single variable containing the y-values of the plot. The x-values are then substituted automatically: they are the indices of the elements in \varcode{y}, i.e.\ they run from 1 to the number of elements with a constant stepsize of 1. This automatic choice is rarely what we want, so we need to provide the missing information ourselves.
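The implicit x-values and the effect of \varcode{hold} can be checked with a few lines. The following is only a minimal sketch; the variable names \varcode{h} and \varcode{x\_used} are chosen for illustration and do not appear in the other listings of this chapter.

\begin{lstlisting}[caption={Sketch: implicit x-values of \varcode{plot(y)} and adding a second line.}]
y = sin(2 * pi * (0:0.01:1));   % 101 samples of one sine period
h = plot(y);                    % only the y-values are given
x_used = get(h, 'XData');       % implicit x-values: 1, 2, ..., numel(y)
hold on                         % keep the first line
plot(0.5 * y);                  % add a second line to the same axes
hold off                        % later plot calls replace the content again
\end{lstlisting}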
For this we need a second variable that contains the respective \varcode{x} values. The lengths of \varcode{x} and \varcode{y} must be the same, otherwise the later call of the \varcode{plot} function will raise an error. The respective call expands to \code[plot()]{plot(x, y)}. The x-axis is now scaled from the minimum of \varcode{x} to the maximum of \varcode{x}, and by default the data are drawn as a solid blue line with a line width of 0.5pt. A second line that is added to the figure is drawn in the next color of the color order, using otherwise the same settings. The order of the colors is defined by the \varcode{ColorOrder} property of the axes, which can be adjusted to personal taste or need. Table\,\ref{plotlinestyles} shows some predefined values that can be chosen for the line style, the marker, or the color. For additional options consult the help.

\begin{table}[htp]
  \titlecaption{Predefined line styles (left), colors (center) and marker symbols (right).}{}\label{plotlinestyles}
  \begin{tabular}[t]{lc} \hline
    \textbf{line styles} & \textbf{abbreviation} \erh \\\hline
    solid & '\verb|-|' \erb \\
    dashed & '\verb|--|' \\
    dotted & '\verb|:|' \\
    dash-dotted & '\verb|-.|' \\\hline
  \end{tabular}
  \hfill
  \begin{tabular}[t]{lc} \hline
    \textbf{color} & \textbf{abbreviation} \erh \\ \hline
    red & 'r' \erb \\
    green & 'g' \\
    blue & 'b' \\
    cyan & 'c' \\
    magenta & 'm' \\
    yellow & 'y' \\
    black & 'k' \\ \hline
  \end{tabular}
  \hfill
  \begin{tabular}[t]{lc} \hline
    \textbf{marker symbols} & \textbf{abbreviation} \erh \\ \hline
    circle & 'o' \erb \\
    star & '*' \\
    plus & '+' \\
    cross & 'x' \\
    diamond & 'd' \\
    pentagram & 'p' \\
    hexagram & 'h' \\
    square & 's' \\
    triangle & '\^{}' \\
    inverted triangle & 'v' \\
    triangle left & '$<$'\\
    triangle right & '$>$'\\\hline
  \end{tabular}
\end{table}

The following listing shows a simple line plot with axis labeling and a title.
\lstinputlisting[caption={A simple plot showing a sinewave.}, label=simpleplotlisting]{simple_plot.m}

\subsection{Changing properties of a line plot}
The
properties of line plots can be changed by passing more arguments to the \varcode{plot} function. The command shown in listing\,\ref{settinglineprops} creates a line plot with a dotted line style, a line width of 1.5pt, a red line color, and star marker symbols. Finally, the name of the curve is set to \emph{plot 1}, which is displayed in the legend if one is shown.

\begin{lstlisting}[label=settinglineprops, caption={Setting line properties when calling \varcode{plot}.}]
x = 0:0.1:2*pi;
y = sin(x);
plot(x, y, 'color', 'r', 'linestyle', ':', 'marker', '*', ...
     'linewidth', 1.5, 'displayname', 'plot 1')
\end{lstlisting}

\begin{important}[Choosing the right color.]
  Choosing the perfect color goes a little bit beyond personal taste. When creating a colored plot you may want to consider the following points:
  \begin{itemize}
  \item A substantial fraction (about 9\%) of the male population cannot distinguish between red and green.
  \item Can you distinguish the colors in a black-and-white or grayscale print?
  \item Color figures in publications sometimes cost extra money.
  \end{itemize}
\end{important}

\subsection{Changing the axes properties}
The first thing a plot needs are axis labels with correct units. These can be set by calling the functions \code[xlabel()]{xlabel('Time [ms]')} and \code[ylabel()]{ylabel('Voltage [mV]')}. By default the axes are scaled to show the full extent of the data. The extremes are selected as the closest integer for small values or the next full multiple of tens, hundreds, thousands, etc.\ depending on the maximum value. If these defaults do not match your needs, the limits of the axes can be set explicitly with the functions \code[xlim()]{xlim()} and \code[ylim()]{ylim()}. These functions expect a single argument, a 2-element vector containing the minimum and maximum value. Table\,\ref{plotaxisprops} lists some of the commonly adjusted properties of an axis.
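Labeling the axes and setting their limits could look like the following minimal sketch. The data and the chosen limits are made up for illustration only.

\begin{lstlisting}[caption={Sketch: axis labels and explicit axis limits.}]
time = 0:0.1:100;                        % hypothetical time axis in ms
voltage = 10 * sin(2 * pi * time / 50);  % hypothetical signal in mV
plot(time, voltage);
xlabel('Time [ms]');
ylabel('Voltage [mV]');
xlim([0 100]);      % 2-element vector: [minimum maximum]
ylim([-15 15]);
limits = xlim();    % without arguments xlim returns the current limits
\end{lstlisting}

Called without an argument, \varcode{xlim} and \varcode{ylim} return the limits that are currently in use, which is handy for checking what the automatic scaling has chosen.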
To set the properties listed in table\,\ref{plotaxisprops}, we need the axes object, which can either be stored in a variable when the axes are created (\varcode{ax = axes();}) or be retrieved using the \code{gca()} function (gca stands for ``get current axes''). Note that \varcode{plot} itself returns the drawn line objects, not the axes. Changing the properties of the axes object updates the plot (listing\,\ref{niceplotlisting}).

\begin{table}[tp]
  \titlecaption{Incomplete list of axis properties.}{For a complete list consult the help system or open the property editor when an axis is selected (\figref{ploteditorfig}). If there is a default value of a property it is listed first.}\label{plotaxisprops}
  \begin{tabular*}{1\textwidth}{lp{5.8cm}p{5.5cm}} \hline
    \textbf{property} & \textbf{description} & \textbf{options} \erh \\ \hline
    \code{Box} & Defines whether the axes are drawn on all sides. & $\{'on'|'off'\}$ \erb\\
    \code{Color} & Background color of the drawing area, not the whole figure. & RGB triplet or predefined color name. \\
    \code{FontName} & Name of the font used for labeling. & Installed fonts. \\
    \code{FontSize} & Size of the font used for labels. & Any scalar value.\\
    \code{FontUnits} & Unit in which the font size is given. & $\{'points' | 'centimeters' | 'inches', ...\}$\\
    \code{FontWeight} & Bold or normal font. & $\{'normal' | 'bold'\}$\\
    \code{TickDir} & Direction of the axis ticks. & $\{'in' | 'out'\}$\\
    \code{TickLength} & Length of the ticks. & 2-element vector (tick length in 2-D and 3-D views).\\
    \code{X-, Y-, ZDir} & Direction of axis scaling. Zero bottom/left, or not? & $\{'normal' | 'reverse'\}$\\
    \code{X-, Y-, ZGrid} & Defines whether grid lines for the respective axes should be plotted. & $\{'off'|'on'\}$ \\
    \code{X-, Y-, ZScale} & Linear or logarithmic scaling? & $\{'linear' | 'log'\}$\\
    \code{X-, Y-, ZTick} & Position of the tick marks. & Vector of positions.\\
    \code{X-, Y-, ZTickLabel} & Labels that should be used to label the ticks.
& Vector of numbers or a cell-array of strings.\\ \hline
  \end{tabular*}
\end{table}

\subsection{Changing figure properties}

\begin{table}[tp]
  \titlecaption{Incomplete list of available figure properties.}{For a complete reference consult the \matlab{} help or select the property editor while having the figure background selected (\figref{ploteditorfig}).}\label{plotfigureprops}
  \begin{tabular*}{1\textwidth}{lp{6.6cm}p{5.7cm}} \hline
    \textbf{property} & \textbf{description} & \textbf{options} \erh \\ \hline
    \code{Color} & Background color of the figure, not the drawing area. & RGB triplet or predefined color name. \erb \\
    \code{PaperPosition} & Position of the axes on the paper. & 4-element vector containing the offsets from the left and bottom edges followed by the width and height. \\
    \code{PaperSize} & Size of the paper. & 2-element vector defining width and height.\\
    \code{PaperUnits} & Unit in which size and position are given. & $\{'inches' | 'centimeters' | 'normalized' | 'points'\}$\\
    \code{Visible} & Defines whether the plot should actually be drawn on screen. Useful when plots should not be displayed but directly saved to file. & $\{'on' | 'off'\}$\\ \hline
  \end{tabular*}
\end{table}

Like the axes, the figure also has several properties that can be adjusted to the current needs, most notably the paper (figure) size and the placement of the axes on the paper. Table\,\ref{plotfigureprops} lists commonly used properties. For a complete reference check the help. To change the figure's appearance, we need to change the properties of the figure object, which can be retrieved during creation of the figure (\code[figure()]{fig = figure();}) or by using the \code{gcf()} (``get current figure'') command.

The script shown in listing\,\ref{niceplotlisting} exemplifies several features of the plotting system and automatically generates and saves figure\,\ref{spikedetectionfig}. Every execution of this script creates exactly the same plot.
If we decide to plot a different recording, the format stays exactly the same and only the data change. Of special interest are lines 35 through 37, which set the size of the figure and position the axes on the paper. Lines 24 through 27 control the font used for labeling inside the axes. The axes hold the default \varcode{FontSize}; multipliers applied to this default control the size of the title (line 26) and of the axes labels (line 27). Line 40 finally saves the figure in the 'pdf' format to file. When calling the function \code{saveas()} the first argument is the current figure handle, the second the file name, and the last one defines the output format (box\,\ref{graphicsformatbox}).

\begin{figure}[t]
  \includegraphics{spike_detection}
  \titlecaption{Automatically created plot.}{This plot has been created using the code in listing\,\ref{niceplotlisting}.}\label{spikedetectionfig}
\end{figure}

\begin{ibox}[t]{\label{graphicsformatbox}File formats for digital artwork.}
  There are two fundamentally different types of formats for digital artwork:
  \begin{enumerate}
  \item \enterm[bitmap]{Bitmaps} (\determ{Rastergrafik})
  \item \enterm[vector graphics]{Vector graphics} (\determ{Vektorgrafik})
  \end{enumerate}
  In a bitmap a color value is stored for each pixel of the figure. Bitmaps have a fixed resolution (e.g.\,300\,dpi, dots per inch) and are very useful for photographs. In contrast, vector graphics store descriptions of the graphic in terms of so-called primitives (lines, circles, polygons, etc.). The main advantage of a vector graphic is that it can be scaled without a loss of quality.
  \begin{minipage}[t]{0.38\textwidth}
    \mbox{}\\[-2ex]
    \includegraphics[width=0.85\textwidth]{VectorBitmap.pdf}
    \rotatebox{90}{\footnotesize by Darth Stabro at en.wikipedia.org}
  \end{minipage}
  \hfill
  \begin{minipage}[t]{0.5\textwidth}
    Formats supported by \matlab{} \footnote{more information can be found in the documentation of \code{saveas()}}:\\[2ex]
    \begin{tabular}{|l|c|l|} \hline
      \textbf{format} & \textbf{type} & \code{saveas()} \textbf{argument} \erh \\ \hline
      pdf & vector & \varcode{'pdf'} \erb \\
      eps & vector & \varcode{'eps'}, \varcode{'epsc'} \\
      svg & vector & \varcode{'svg'} \\
      ps & vector & \varcode{'ps'}, \varcode{'psc'} \\
      jpg & bitmap & \varcode{'jpeg'} \\
      tif & bitmap & \varcode{'tiff'}, \varcode{'tiffn'} \\
      png & bitmap & \varcode{'png'} \\
      bmp & bitmap & \varcode{'bmp'} \\ \hline
    \end{tabular}
  \end{minipage}

  It is advisable to store data plots generated by \matlab{} in a vector graphics format. If in doubt, they can usually be converted easily to a bitmap format. Converting a bitmap to a vector graphic, however, is not possible without a loss in quality. Storing a plot that contains very large sets of graphical elements (e.g.\,a raster-plot showing thousands of action potentials) may, on the other hand, lead to very large files that can be hard to handle. Saving such plots in a bitmap format may be more efficient.
\end{ibox}

\lstinputlisting[caption={Script for creating the plot shown in \figref{spikedetectionfig}.}, label=niceplotlisting]{automatic_plot.m}

\begin{ibox}[t]{\label{handlevsobjectbox}The wind of change.}
  The way figure or axis properties can be adapted has changed with recent \matlab{} versions. In versions before \emph{R2014b} properties could be read and set using the functions \code[get()]{get} and \code[set()]{set}.
  The first argument these functions expect is a valid figure or axis \emph{handle}, as returned by the \code{figure()} and \code{axes()} functions or retrieved using \code{gcf()} or \code{gca()} for the current figure or axis handle, respectively. Subsequent arguments passed to \code{set()} are pairs of a property's name and the desired value.
\begin{lstlisting}[caption={Using set to change figure and axis properties.}]
frequency = 5;         % frequency of the sine wave in Hz
time = 0.01:0.01:1.0;  % the time axis in seconds
signal = sin(2 * pi * time * frequency);
plot(time, signal)
axes_handle = gca();   % get current axes
figure_handle = gcf(); % get current figure
% XLabel and YLabel hold text objects, the label text is their 'String'
set(get(axes_handle, 'XLabel'), 'String', 'time [s]');
set(get(axes_handle, 'YLabel'), 'String', 'amplitude');
% set the units first so that size and position are given in centimeters
set(figure_handle, 'PaperUnits', 'centimeters', 'PaperSize', [5.5, 5.5], ...
    'PaperPosition', [0, 0, 5.5, 5.5]);
\end{lstlisting}
  In newer versions the handles returned by \code{gcf()} and \code{gca()} are ``objects'' whose properties can be set directly, as is done throughout this chapter. For backward compatibility, \code{set()} and \code{get()} still work in current versions of \matlab{}.
\end{ibox}

\section{Plot examples}
So far we have introduced the standard line plots. Besides these there are many more options to display scientific data. MathWorks shows various examples and the respective code on their website \url{http://www.mathworks.de/discovery/gallery.html}. For some types of plots we present examples in the following sections.

\subsection{Scatter}
For displaying events or pairs of x-y coordinates the standard line plot is not optimal. Rather, we use \code{scatter()} for this purpose. For example, we have a number of measurements of a system's response to a certain stimulus intensity. There is no dependency between the data points; drawing them with a line plot would be nonsensical (figure\,\ref{scatterplotfig}\,A). In contrast to \code{plot()}, we need to provide both x- and y-coordinates in order to draw the data.
In the example we also provide further arguments to set the size and color of the dots and to specify that they are filled (listing\,\ref{scatterlisting1}).
\lstinputlisting[caption={Creating a scatter plot with red filled dots.}, label=scatterlisting1, firstline=9, lastline=9]{scatterplot.m}

We could have used \varcode{plot} for this purpose, setting a marker and the line style to ``none'', to draw an equivalent plot. \varcode{scatter}, however, offers some more advanced features that allow us to add two more dimensions to the plot (figure\,\ref{scatterplotfig}\,B,\,C). For each dot one can define an individual size and color. In this example the size argument is simply a vector of the same size as the data that contains the numbers from 1 to the length of \varcode{x} (line 1 in listing\,\ref{scatterlisting2}). To manipulate the color we need to specify a length(x)-by-3 matrix. For each dot we provide an individual color (i.e.\ the RGB triplet in each row of the color matrix, lines 2--4 in listing\,\ref{scatterlisting2}).
\lstinputlisting[caption={Creating a scatter plot with size and color variations. The RGB triplets define the respective color intensity in the range from 0 to 1. Here, we modify only the red color channel.}, label=scatterlisting2, linerange={15-15, 21-23}]{scatterplot.m}

\begin{figure}[t]
  \includegraphics{scatterplot}
  \titlecaption{Scatterplots.}{Scatterplots are used to draw data points where there is no direct dependency between the individual measurements (like time). Scatter offers several advantages over the standard plot command. One can vary the size and/or the color of each dot.}\label{scatterplotfig}
\end{figure}

\subsection{Subplots}
A very common scenario is to combine several plots in the same figure. To do this we create so-called subplots (figures\,\ref{regularsubplotsfig},\,\ref{irregularsubplotsfig}). The \code[subplot()]{subplot()} command allows us to place multiple axes onto a single sheet of paper.
Generally, \code{subplot()} expects three arguments defining the number of rows, the number of columns, and the currently active plot. The number of the currently active plot starts with 1 and goes up to $rows \cdot columns$ (numbers in the subplots in figures\,\ref{regularsubplotsfig}, \ref{irregularsubplotsfig}).

\begin{figure}[t]
  \includegraphics[width=0.5\linewidth]{regular_subplot}
  \titlecaption{Subplots placed on a regular grid.}{By default all subplots have the same size. See listing\,\ref{regularsubplotlisting}. Subplot labeling has been created using the \code[text()]{text()} annotation function (see also below).}\label{regularsubplotsfig}
\end{figure}

\lstinputlisting[caption={Script for creating subplots in a regular grid \figref{regularsubplotsfig}.}, label=regularsubplotlisting, basicstyle=\ttfamily\scriptsize]{regular_subplot.m}

By default, all subplots have the same size. If something else is desired, e.g.\ one subplot should span a whole row while two others are smaller and should be placed side by side in the same row, the third argument of \code{subplot()} can be a vector of the numbers of the cells that should be joined. These have, of course, to be adjacent (\figref{irregularsubplotsfig}, listing\,\ref{irregularsubplotslisting}).

\begin{figure}[ht]
  \includegraphics[width=0.5\linewidth]{irregular_subplot}
  \titlecaption{Subplots of different size.}{The third argument of \varcode{subplot} may be a vector of cells that should be joined into the same subplot. See listing\,\ref{irregularsubplotslisting}.}\label{irregularsubplotsfig}
\end{figure}

Not all cells of the grid, defined by the number of rows and columns, need to be used in a plot. If you want to create something more elaborate or have more spacing between the subplots, you can create a grid with a larger number of columns and rows and specify the used cells of the grid by passing a vector as the third argument to \code{subplot()}.
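Joining grid cells can be sketched in a few lines. This is a minimal illustration with made-up data and illustrative variable names (\varcode{ax\_top}, \varcode{ax\_left}, \varcode{ax\_right}); it is not the script used for the figures of this chapter.

\begin{lstlisting}[caption={Sketch: one subplot spanning the top row of a 2-by-2 grid.}]
x = 0:0.01:1;
figure();
ax_top = subplot(2, 2, [1 2]);  % joins cells 1 and 2, i.e. the whole top row
plot(x, sin(2 * pi * x));
ax_left = subplot(2, 2, 3);     % bottom-left cell
plot(x, cos(2 * pi * x));
ax_right = subplot(2, 2, 4);    % bottom-right cell
plot(x, x .^ 2);
\end{lstlisting}

The cells are counted row by row, so in a grid with two rows and two columns the cells 1 and 2 form the top row and the cells 3 and 4 the bottom row.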
\lstinputlisting[caption={Script for creating subplots of different sizes \figref{irregularsubplotsfig}.}, label=irregularsubplotslisting, basicstyle=\ttfamily\scriptsize]{irregular_subplot.m}

\subsection{Show estimation errors}
Repeated measurements of a quantity almost always yield varying results. Neuronal activity, for example, is notoriously noisy. The responses of a neuron to repeated stimulation with the same stimulus may share common features but are different each time. This is why we calculate measures of the variability of the data, such as the standard deviation, and need a way to illustrate them in plots of scientific data. Providing an estimate of the error gives the reader the chance to assess the reliability of the data and to get a feeling for the possible significance of a difference in the average values. \matlab{} offers several ways to plot the average and the error. We will introduce two of them.
\begin{itemize}
\item The \code[errorbar()]{errorbar} function (figure\,\ref{errorbarplot} A, B).
\item Using the \code[fill()]{fill} function to draw an area showing the spread of the data (figure\,\ref{errorbarplot} C).
\end{itemize}

\subsubsection{Errorbar}
Using the \code[errorbar()]{errorbar} function is rather straightforward. In its simplest form, it expects three arguments: the x- and y-values plus the error (line 5 in listing\,\ref{errorbarlisting}; note that we provide additional optional arguments to set the marker). This form is obviously only suited for symmetric distributions. If the values are asymmetrically distributed, separate errors for negative and positive deflections from the mean are more apt. Accordingly, four arguments are needed (line 12 in listing\,\ref{errorbarlisting}). The first two arguments are the same, the next two represent the negative and positive deflections. By default the \code{errorbar()} function does not draw a marker.
In the examples shown here we provide extra arguments to define that a circle is used for that purpose. The line connecting the average values can be removed by passing additional arguments. The properties of the errorbars themselves (linestyle, linewidth, capsize, etc.) can be changed by taking the return argument of \code{errorbar()} and changing its properties. See the \matlab{} help for more information.

\begin{figure}[ht]
  \includegraphics[width=0.9\linewidth]{errorbars}
  \titlecaption{Indicating the estimation error in plots.}{\textbf{A} Symmetrical error around the mean (e.g.\ using the standard deviation). \textbf{B} Errorbars of an asymmetrical distribution of the data (note: the average value is now the median and the errors are the lower and upper quartiles). \textbf{C} A shaded area is used to illustrate the spread of the data. See listing\,\ref{errorbarlisting} for A and B and listing\,\ref{errorbarlisting2} for C.}\label{errorbarplot}
\end{figure}

\lstinputlisting[caption={Illustrating estimation errors using error bars. Script that creates \figref{errorbarplot}\,A, B.}, label=errorbarlisting, firstline=13, lastline=31, basicstyle=\ttfamily\scriptsize]{errorbarplot.m}

\subsubsection{Fill}
In recent years it has become fashionable to illustrate the error not with errorbars but by drawing a shaded area around the mean. Apart from fashion there is also a real argument in favor of error areas: if there are so many data points with respective errorbars that they would merge in the figure, an error area is cleaner and probably easier to read. To achieve an illustration as shown in figure\,\ref{errorbarplot} C, we use the \code{fill()} command in combination with a standard line plot. The original purpose of \code{fill()} is to draw a filled polygon. We hence have to provide it with the vertex points of the polygon.
For each x-value we now have two y-values (average minus error and average plus error). Further, we want the vertices to be connected in a defined order. One can achieve this by going back and forth on the x-axis: we append a reversed version of the x-values to the original x-values using \code{cat()} and \code{fliplr()} for concatenation and inversion, respectively (line 3 in listing\,\ref{errorbarlisting2}; depending on the layout of your data you may need to concatenate along a different dimension of the data and use \code{flipud()} instead). The y-coordinates of the polygon vertices are concatenated in a similar way (line 4). In the example shown here we accept the polygon object that is returned by \varcode{fill} (variable \varcode{p}) and use it to change a few properties of the polygon. The \emph{FaceAlpha} property defines the transparency (or rather the opacity) of the area. The provided alpha value is a number between 0 and 1, with zero leading to invisibility and one to complete opacity. Finally, we use the normal plot command to draw a line connecting the average values (line 12).

\lstinputlisting[caption={Illustrating estimation errors using a shaded area. Script that creates \figref{errorbarplot} C.}, label=errorbarlisting2, firstline=33, basicstyle=\ttfamily\scriptsize]{errorbarplot.m}

\subsection{Annotations, text}
The \code[text()]{text()} and \code[annotation()]{annotation()} functions are used for highlighting certain parts of a plot or simply for adding an annotation that does not fit or does not belong into the legend. While \code{text()} simply prints the given text string at the defined position (for example in listing\,\ref{regularsubplotlisting}), the \code{annotation()} function allows us to add more advanced highlights like arrows, lines, ellipses, or rectangles. Figure\,\ref{annotationsplot} shows some examples; the respective code can be found in listing\,\ref{annotationsplotlisting}. For more options consult the \matlab{} help.
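The two functions can be contrasted in a minimal sketch. The data, positions, and variable names (\varcode{t}, \varcode{a}) are made up for illustration. Note that \code{text()} takes its position in data coordinates, whereas \code{annotation()} takes normalized figure coordinates.

\begin{lstlisting}[caption={Sketch: a text label and an arrow annotation.}]
x = 0:0.01:1;
plot(x, sin(2 * pi * x));
% text(): position in data coordinates, here at the peak of the sine
t = text(0.25, 1.0, 'peak', 'VerticalAlignment', 'bottom');
% annotation(): position in normalized figure coordinates (0 to 1),
% the two vectors hold [start end] of the arrow in x and y
a = annotation('arrow', [0.6 0.5], [0.8 0.6]);
\end{lstlisting}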
\begin{figure}[ht]
  \includegraphics[width=0.5\linewidth]{annotations}
  \titlecaption{Annotations in a plot.}{See listing\,\ref{annotationsplotlisting}.}\label{annotationsplot}
\end{figure}

\lstinputlisting[caption={Adding annotations to figures. Script that creates \figref{annotationsplot}.}, label=annotationsplotlisting, basicstyle=\ttfamily\scriptsize]{annotations.m}

\begin{important}[Positions in data or figure coordinates.]
  A very confusing pitfall is the use of different coordinate systems by \varcode{text()} and \varcode{annotation()}. While \varcode{text()} expects the positions to be in data coordinates, i.e.\,within the limits of the x- and y-axis, \varcode{annotation()} requires the positions to be given in normalized figure coordinates. Normalized means that the width and height of the figure are expressed by numbers in the range 0 to 1. The bottom/left corner then has the coordinates $(0,0)$ and the top/right corner the coordinates $(1,1)$.

  Why different coordinate systems? Using data coordinates is convenient for annotations within a plot, but what about an arrow that should be drawn between two subplots?
\end{important}

\subsection{Animations and movies}
A picture is worth a thousand words and sometimes creating animations or movies is worth many pictures. They can help in understanding complex or time-dependent developments and may add some variety to a presentation. The following example shows how a movie can be created and saved to file. A similar mechanism is available to produce animations that are supposed to be shown within \matlab{}, but for this we point to the documentation of the \code[movie()]{movie()} command. The underlying principle is the same, however.

The code shown in listing\,\ref{animationlisting} creates an animation of a Lissajous figure. The basic steps are:
\begin{enumerate}
\item Create a figure and set some basic properties (lines 7--10).
\item Create a \code[VideoWriter()]{VideoWriter} object that, in this example, takes the filename and the compression profile (here MPEG-4) as arguments (line 12). For more options see the documentation.
\item We can set the desired framerate and the quality of the video (lines 13, 14). Quality is a value between 0 and 100, where 100 is the best quality but leads to the largest files. The framerate defines how quickly the individual frames are switched. In our example, we create 500 frames and the video framerate is 25\,Hz. That is, the movie has a duration of $500/25 = 20$\,seconds.
\item Open the destination file (line 16). Opening means that the file is created and opened for writing. This also implies that it has to be closed after the whole process (line 31).
\item For each frame of the video, we plot the appropriate data (we use \code{scatter()} for this purpose, line 20) and ``grab'' the frame (line 28). Grabbing is similar to making a screenshot of the figure. The \code{drawnow()} command (line 27) is used to stop the execution of the for loop until the drawing process is finished.
\item Write the frame to file (line 29).
\item Finally, close the file (line 31).
\end{enumerate}

\lstinputlisting[caption={Making animations and saving them as a movie.}, label=animationlisting, firstline=16, lastline=36, basicstyle=\ttfamily\scriptsize]{movie_example.m}

\section{What makes a good plot?}
A plot should enable the interested reader to get a grasp of the data, to understand the performed analysis, and to critically assess the presented results. The most important rule is the correct and complete annotation of the plots. This starts with axis labels and units and extends to legends. Incomplete annotation can have terrible consequences (\figref{xkcdplotting}).

The principle of \emph{ink minimization} may be used as a guiding principle for appealing plots.
It requires that the ratio of the ink spent on the data to the ink spent on other parts of the plot be strongly in favor of the data. Ornamental or otherwise unnecessary gimmicks should not be used in scientific contexts. An exception can be made if the figure was designed for didactic purposes, and sometimes for presentations.

\begin{important}[Correct labeling of plots]
  A data plot must be sufficiently labeled:
  \begin{itemize}
  \item Every axis must have a label and the correct unit, if it has one\\ (e.g. \code[xlabel()]{xlabel('Speed [m/s]')}).
  \item When more than one line is plotted, they have to be labeled using a figure legend (\matlabfun{legend()}) or similar.
  \item If subplots show similar information on the axes, they should be scaled to show the same ranges to ease comparison between plots (e.g. \code[xlim()]{xlim([0 100])}).\\ If one chooses to ignore this rule, one should state this explicitly in the figure caption and/or in the text.
  \item Labels must be large enough to be readable. In particular, use sufficiently large fonts when the figure is shown in a presentation.
  \end{itemize}
\end{important}

\section{Things that should be avoided}
When plotting scientific data we should take great care to avoid suggestive or misleading presentations. Unnecessary additions and fancy graphical effects make a plot frivolous and also violate the \emph{ink minimization principle}. Illustrations in comic style (\figref{comicexamplefig}) are not suited for scientific data in most instances. For presentations or didactic purposes, however, a comic style may help to indicate that the figure is a mere sketch and that the exact position of the data points is of no importance.

\begin{figure}[t]
  \includegraphics[width=0.7\columnwidth]{outlier}\vspace{-3ex}
  \titlecaption{Comic-like illustration.}{Obviously not suited to present scientific data.
In didactic or illustrative contexts such illustrations can help to focus on the important aspects.}\label{comicexamplefig}
\end{figure}

The following figures show examples of misleading or suggestive presentations of data. Several of the effects have been exaggerated to make the point. Applied with a little more subtlety, these methods are employed to nudge the viewer's perception in the desired direction. You can find more examples at \url{https://en.wikipedia.org/wiki/Misleading_graph}.

\begin{figure}[p]
  \includegraphics[width=0.35\textwidth]{misleading_pie}
  \hspace{0.05\textwidth}
  \includegraphics[width=0.35\textwidth]{sample_pie}
  \titlecaption{Perspective distortion influences the perceived size.}{By changing the perspective of the 3-D illustration the highlighted segment \textbf{C} gains more weight than it should have. In the left graph segments \textbf{A} and \textbf{C} appear very similar. The 2-D plot on the right-hand side shows that this is an illusion. \url{https://en.wikipedia.org/wiki/Misleading_graph}}\label{misleadingpiefig}
\end{figure}

\begin{figure}[p]
  \includegraphics[width=0.9\textwidth]{plot_scaling.pdf}
  \titlecaption{Choosing the figure format and scaling of the axes influences the perceived strength of a correlation.}{All subplots show the same data. By choosing a certain figure size we can accentuate or reduce the perceived strength of the correlation in the data. Technically all three plots are correct.}\label{misleadingscalingfig}
\end{figure}

\begin{figure}[p]
  \begin{minipage}[t]{0.3\textwidth}
    \includegraphics[width=0.8\textwidth]{improperly_scaled_graph}
  \end{minipage}
  \begin{minipage}[t]{0.3\textwidth}
    \includegraphics[width=0.8\textwidth]{comparison_properly_improperly_graph}
  \end{minipage}
  \begin{minipage}[t]{0.3\textwidth}
    \includegraphics[width=0.7\textwidth]{properly_scaled_graph}
  \end{minipage}
  \titlecaption{Scaling of markers and symbols.} {In these graphs symbols have been used to illustrate the measurements made in two categories.
The measured value for category \textbf{B} is actually three times the measured value for category \textbf{A}. In the left graph the symbol for category \textbf{B} has been scaled to triple height while maintaining the proportions. This appears fair and correct, but as a result the covered area increases not 3-fold but 9-fold (center plot). The plot on the right shows how it could have been done correctly. \url{https://en.wikipedia.org/wiki/Misleading_graph}}\label{misleadingsymbolsfig}
\end{figure}

By using perspective effects in 3-D plots the perceived size can be distorted in the desired direction. While the plot is correct in a strict sense, it is rather suggestive (\figref{misleadingpiefig}). Similarly, the choice of figure size and proportions can lead to different interpretations of the data. Stretching the y-extent of a graph leads to a stronger impression of the correlation in the data. Compressing this axis will lead to a much weaker perceived correlation (\figref{misleadingscalingfig}). When using symbols to illustrate a quantity we have to take care not to exaggerate differences through symbol scaling (\figref{misleadingsymbolsfig}).

\section{Summary}
A good plot of scientific data displays the data completely and seriously without too many distractions. Misleading or suggestive plots, as may result from perspective presentations or inappropriate scaling of axes and symbols, should be avoided.

\noindent When combining several line plots within the same figure one should consider adapting color \textbf{and} line style (solid, dashed, dotted, etc.) to make them distinguishable even in black-and-white prints. Combinations of red and green are not a good choice since they cannot be distinguished by people with red-green color blindness.

\vspace{2ex}
Key ingredients for a good data plot:
\begin{itemize}
\item Clarity.
\item Complete labeling.
\item Plotted lines and curves must be distinguishable.
\item No suggestive or misleading presentation.
\item The right balance of line width, font size, and figure size. This may depend on the purpose; for presentations, slightly thicker lines help.
\item Error bars wherever they are appropriate.
\end{itemize}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%\printsolutions
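
\noindent The key ingredients listed above can be sketched in a few lines of \matlab{} code. The following is only a minimal illustration, not a definitive recipe: the data, the axis labels, the legend entries, and the chosen colors, line widths and font size are made-up assumptions. It combines complete labeling with two lines that differ in both color and line style.

\begin{lstlisting}[caption={A minimal sketch of a completely labeled line plot; all values are illustrative assumptions.}, basicstyle=\ttfamily\scriptsize]
% Minimal sketch: data, labels and style values are illustrative assumptions.
x = linspace(0, 100, 200);   % hypothetical speeds in m/s
y1 = sqrt(x);                % made-up response for condition A
y2 = 0.1 .* x;               % made-up response for condition B

figure();
hold on;
% Blue and orange instead of red and green; the line styles differ as well,
% so the lines stay distinguishable in black-and-white prints.
plot(x, y1, 'Color', [0.00 0.45 0.74], 'LineStyle', '-', ...
     'LineWidth', 1.5, 'DisplayName', 'condition A');
plot(x, y2, 'Color', [0.85 0.33 0.10], 'LineStyle', '--', ...
     'LineWidth', 1.5, 'DisplayName', 'condition B');
hold off;

xlabel('Speed [m/s]');       % every axis gets a label and a unit
ylabel('Response [a.u.]');
xlim([0 100]);               % explicit range eases comparison across plots
legend('show');              % legend entries are taken from 'DisplayName'
set(gca, 'FontSize', 12);    % labels must be large enough to be readable
\end{lstlisting}

\noindent For a presentation one might increase the values of \varcode{LineWidth} and \varcode{FontSize}; appropriate values depend on the figure size and the purpose.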