[statistics] added new exercise univariatedata.m

Jan Benda 2019-11-25 22:49:03 +01:00
parent 82b4d5e080
commit 4b9d1134fe
2 changed files with 28 additions and 8 deletions


@@ -0,0 +1,9 @@
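data = 2.0 + randn(40, 1);                     % 40 normally distributed values with mean 2
bw = 0.8;                                      % width used for the bar and the jitter
boxplot(data)                                  % box-whisker plot, drawn at x = 1
hold on;
bar(2.0, mean(data), 0.5*bw);                  % bar showing the mean
errorbar(2.0, mean(data), std(data));          % +/- one standard deviation
scatter(2.5+bw*rand(length(data), 1), data);   % raw data, jittered along x
hold off;
xlim([0.2, 4.0]);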


@@ -3,7 +3,6 @@
\chapter{Descriptive statistics} \chapter{Descriptive statistics}
Descriptive statistics characterizes data sets by means of a few measures. Descriptive statistics characterizes data sets by means of a few measures.
In addition to histograms that estimate the full distribution of the data, In addition to histograms that estimate the full distribution of the data,
the following measures are used for characterizing univariate data: the following measures are used for characterizing univariate data:
\begin{description} \begin{description}
@@ -20,7 +19,7 @@ For bivariate and multivariate data sets we can also analyse their
Spearman's rank correlation coefficient. Spearman's rank correlation coefficient.
\end{description} \end{description}
The following is not a complete introduction to descriptive The following is in no way a complete introduction to descriptive
statistics, but summarizes a few concepts that are most important in statistics, but summarizes a few concepts that are most important in
daily data-analysis problems. daily data-analysis problems.
@@ -63,10 +62,12 @@ used to illustrate the standard deviation of the data
uniformly distributed random numbers \matlabfun{rand()}. (2) With uniformly distributed random numbers \matlabfun{rand()}. (2) With
a bar plot \matlabfun{bar()} one usually shows the mean of the a bar plot \matlabfun{bar()} one usually shows the mean of the
data. The additional errorbar illustrates the deviation of the data. The additional errorbar illustrates the deviation of the
data from the mean by $\pm$ one standard deviation. (3) A data from the mean by $\pm$ one standard deviation. In case of
non-normal data mean and standard deviation only poorly
characterize the distribution of the data values. (3) A
box-whisker plot \matlabfun{boxplot()} shows more details of the box-whisker plot \matlabfun{boxplot()} shows more details of the
distribution of the data values. The box extends from the 1. to distribution of the data values. The box extends from the 1. to
the 3. quartile, a horizontal ine within the box marks the median the 3. quartile, a horizontal line within the box marks the median
value, and the whiskers extend to the minimum and the maximum data value, and the whiskers extend to the minimum and the maximum data
values. (4) The probability density $p(x)$ estimated from a values. (4) The probability density $p(x)$ estimated from a
normalized histogram shows the entire distribution of the normalized histogram shows the entire distribution of the
@@ -151,12 +152,22 @@ that extends from the 1$^{\rm st}$ to the 3$^{\rm rd}$ quartile. The
whiskers mark the minimum and maximum value of the data set whiskers mark the minimum and maximum value of the data set
(\figref{displayunivariatedatafig} (3)). (\figref{displayunivariatedatafig} (3)).
\begin{exercise}{boxwhisker.m}{} \begin{exercise}{univariatedata.m}{}
Generate eine $40 \times 10$ matrix of random numbers and Generate 40 normally distributed random numbers with a mean of 2 and
illustrate their distribution in a box-whicker plot illustrate their distribution in a box-whisker plot
(\code{boxplot()} function). How to interpret the plot? (\code{boxplot()} function), with a bar and errorbar illustrating
the mean and standard deviation (\code{bar()}, \code{errorbar()}),
and the data themselves jittered randomly (as in
\figref{displayunivariatedatafig}). How to interpret the different
plots?
\end{exercise} \end{exercise}
% \begin{exercise}{boxwhisker.m}{}
% Generate a $40 \times 10$ matrix of random numbers and
% illustrate their distribution in a box-whisker plot
% (\code{boxplot()} function). How to interpret the plot?
% \end{exercise}
\section{Distributions} \section{Distributions}
The distribution of values in a data set is estimated by histograms The distribution of values in a data set is estimated by histograms
(\figref{displayunivariatedatafig} (4)). (\figref{displayunivariatedatafig} (4)).
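
Note on the new exercise: its closing question asks how to interpret the different plots. One way to approach it is to print the numbers each plot encodes. The following is only a sketch and is not part of this commit. The variable names are made up for illustration, and the quartiles are computed as medians of the lower and upper half of the sorted data, which is one common convention and may differ slightly from what \code{boxplot()} uses internally.

```matlab
% Sketch only: the numbers that each plot in univariatedata.m displays.
% Variable names are illustrative and not part of the commit.
data = 2.0 + randn(40, 1);        % same kind of data as in the exercise
m = mean(data);                   % height of the bar
s = std(data);                    % half-length of the errorbar
x = sort(data);                   % sorted values for the quartiles
n = length(x);
q2 = median(x);                   % line inside the box
q1 = median(x(1:floor(n/2)));     % lower box edge: median of the lower half
q3 = median(x(ceil(n/2)+1:end));  % upper box edge: median of the upper half
fprintf('mean = %.2f, std = %.2f\n', m, s);
fprintf('quartiles = %.2f, %.2f, %.2f\n', q1, q2, q3);
fprintf('min = %.2f, max = %.2f\n', x(1), x(end));
```

For normally distributed data the mean and the median should lie close together, and the box should sit roughly symmetrically around the median. The jittered scatter shows the individual values that these summaries are computed from.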