[statistics] added new exercise univariatedata.m

Jan Benda 2019-11-25 22:49:03 +01:00
parent 82b4d5e080
commit 4b9d1134fe
2 changed files with 28 additions and 8 deletions


@ -0,0 +1,9 @@
data = 2.0 + randn(40, 1);
bw = 0.8;
boxplot(data)
hold on;
bar(2.0, mean(data), 0.5*bw);
errorbar(2.0, mean(data), std(data));
scatter(2.5+bw*rand(length(data), 1), data);
hold off;
xlim([0.2, 4.0])


@ -3,7 +3,6 @@
\chapter{Descriptive statistics}
Descriptive statistics characterizes data sets by means of a few measures.
In addition to histograms that estimate the full distribution of the data,
the following measures are used for characterizing univariate data:
\begin{description}
@ -20,7 +19,7 @@ For bivariate and multivariate data sets we can also analyse their
Spearman's rank correlation coefficient.
\end{description}
The following is not a complete introduction to descriptive
The following is in no way a complete introduction to descriptive
statistics, but summarizes a few concepts that are most important in
daily data-analysis problems.
@ -63,10 +62,12 @@ used to illustrate the standard deviation of the data
uniformly distributed random numbers \matlabfun{rand()}. (2) With
a bar plot \matlabfun{bar()} one usually shows the mean of the
data. The additional errorbar illustrates the deviation of the
data from the mean by $\pm$ one standard deviation. (3) A
data from the mean by $\pm$ one standard deviation. For data that
are not normally distributed, mean and standard deviation
characterize the distribution of the data values only poorly. (3) A
box-whisker plot \matlabfun{boxplot()} shows more details of the
distribution of the data values. The box extends from the 1. to
the 3. quartile, a horizontal ine within the box marks the median
the 3. quartile, a horizontal line within the box marks the median
value, and the whiskers extend to the minimum and the maximum data
values. (4) The probability density $p(x)$ estimated from a
normalized histogram shows the entire distribution of the
@ -151,12 +152,22 @@ that extends from the 1$^{\rm st}$ to the 3$^{\rm rd}$ quartile. The
whiskers mark the minimum and maximum value of the data set
(\figref{displayunivariatedatafig} (3)).
\begin{exercise}{boxwhisker.m}{}
Generate eine $40 \times 10$ matrix of random numbers and
illustrate their distribution in a box-whicker plot
(\code{boxplot()} function). How to interpret the plot?
\begin{exercise}{univariatedata.m}{}
Generate 40 normally distributed random numbers with a mean of 2 and
illustrate their distribution in a box-whisker plot
(\code{boxplot()} function), with a bar and errorbar illustrating
the mean and standard deviation (\code{bar()}, \code{errorbar()}),
and the data themselves jittered randomly (as in
\figref{displayunivariatedatafig}). How should the different
plots be interpreted?
\end{exercise}
% \begin{exercise}{boxwhisker.m}{}
% Generate a $40 \times 10$ matrix of random numbers and
% illustrate their distribution in a box-whisker plot
% (\code{boxplot()} function). How should the plot be interpreted?
% \end{exercise}
\section{Distributions}
The distribution of values in a data set is estimated by histograms
(\figref{displayunivariatedatafig} (4)).