%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\chapter{\tr{Bootstrap methods}{Bootstrap Methoden}}
\label{bootstrapchapter}

\selectlanguage{english}

Bootstrapping methods are applied to create distributions of
statistical measures via resampling of a sample. Bootstrapping offers
several advantages:
\begin{itemize}
\item Fewer assumptions: for example, a measured sample does not need
  to be normally distributed.
\item Often increased precision as compared to classical parametric
  methods.
\item General applicability: bootstrapping methods are very similar
  for different statistics, and the procedure does not need to be
  adapted to the specific statistical measure under investigation.
\end{itemize}

\begin{figure}[tp]
  \includegraphics[width=0.8\textwidth]{2012-10-29_16-26-05_771}\\[2ex]
  \includegraphics[width=0.8\textwidth]{2012-10-29_16-41-39_523}\\[2ex]
  \includegraphics[width=0.8\textwidth]{2012-10-29_16-29-35_312}
  \titlecaption{\label{statisticalpopulationfig}Why can't we measure
    the statistical population but only draw samples?}{}
\end{figure}

Reminder: in statistics we are interested in properties of the
``statistical population'' (in German: \determ{Grundgesamtheit}),
e.g. the average length of all pickles
(\figref{statisticalpopulationfig}). But we cannot measure the
lengths of all pickles in the statistical population. Instead, we
draw samples (simple random sample,
\enterm[SRS|see{simple random sample}]{SRS}, in German:
\determ{Stichprobe}). We then estimate a statistical measure
(e.g. the average length of the pickles) within this sample and hope
that it is a good approximation of the unknown and immeasurable true
average length of the statistical population (the population
parameter, in German: \determ{Populationsparameter}). We apply
statistical methods to find out how good this approximation is.

If we could draw a large number of \textit{simple random samples} we
could estimate the statistical measure of interest for each sample
and describe its probability distribution with a histogram. This
distribution is called the \enterm{sampling distribution} (German:
\determ{Stichprobenverteilung},
\subfigref{bootstrapsamplingdistributionfig}{a}).

\begin{figure}[tp]
  \includegraphics[height=0.2\textheight]{srs1}\\[2ex]
  \includegraphics[height=0.2\textheight]{srs2}\\[2ex]
  \includegraphics[height=0.2\textheight]{srs3}
  \titlecaption{\label{bootstrapsamplingdistributionfig}Bootstrapping
    the sampling distribution.}{(a) Simple random samples (SRS) are
    drawn from a statistical population with an unknown population
    parameter (e.g. the average $\mu$). The statistical measure (the
    estimate $\bar x$) is calculated for each sample. These estimates
    originate from the sampling distribution. Often only a single
    random sample is drawn! (b) By applying assumptions and theory
    one can guess the sampling distribution without actually
    measuring it. (c) Alternatively, one can generate many bootstrap
    samples from the same SRS (resampling) and use them to estimate
    the sampling distribution empirically. From Hesterberg et
    al. 2003, Bootstrap Methods and Permutation Tests.}
\end{figure}

Commonly, however, only a single SRS is drawn. In such cases we make
use of certain assumptions (e.g. that the data are normally
distributed) that allow us to infer the precision of our estimate
from the SRS itself. For example, the formula $\sigma/\sqrt{n}$ gives
the standard error of the mean, that is the standard deviation of the
distribution of sample means around the mean of the statistical
population that we would obtain from many SRS
(\subfigref{bootstrapsamplingdistributionfig}{b}). Note that such
formulas are derived theoretically and are only valid as long as the
underlying assumptions hold.

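The relation between the standard error and the width of the sampling
distribution can be checked with a small simulation. The following is
a minimal MATLAB sketch; the population parameters, the sample size,
and the number of samples are arbitrary choices for illustration:

\begin{lstlisting}
mu = 0.0;       % population mean, known here by construction
sigma = 1.0;    % population standard deviation
n = 100;        % size of each simple random sample
nsrs = 10000;   % number of samples to draw
means = zeros(nsrs, 1);
for i = 1:nsrs
    means(i) = mean(mu + sigma*randn(n, 1));   % mean of one SRS
end
fprintf('simulated standard error  : %.4f\n', std(means));
fprintf('theoretical standard error: %.4f\n', sigma/sqrt(n));
\end{lstlisting}

The standard deviation of the simulated means closely matches
$\sigma/\sqrt{n}$.
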
Alternatively, we can use ``bootstrapping'' to generate new samples
from the one set of measurements (resampling). From these
bootstrapped samples we compute the desired statistical measure and
estimate its distribution (the \enterm{bootstrap distribution},
\subfigref{bootstrapsamplingdistributionfig}{c}). Interestingly, this
distribution is very similar to the sampling distribution with
respect to its width. The only difference is that the bootstrapped
values are distributed around the measure of the original sample and
not around the one of the statistical population. We can nevertheless
use the bootstrap distribution to draw conclusions about the
precision of our estimate (e.g. standard errors and confidence
intervals).

Bootstrapping methods create new samples by resampling the original
data in order to estimate the sampling distribution of a statistical
measure. The bootstrapped samples have the same size as the original
sample and are created by sampling with replacement, that is, each
value of the original sample can occur once, multiple times, or not
at all in a bootstrapped sample.

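In MATLAB, sampling with replacement can be implemented by drawing
random indices with \matlabfun{randi()}. A minimal sketch, where
\texttt{x} stands for any measured SRS:

\begin{lstlisting}
x = randn(50, 1);        % placeholder for a measured sample
n = length(x);
inx = randi(n, n, 1);    % n random indices, drawn with replacement
xb = x(inx);             % one bootstrapped sample
\end{lstlisting}

Because the indices are drawn with replacement, some values of
\texttt{x} appear several times in \texttt{xb} while others are
missing.
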
\section{Bootstrap of the standard error}

Bootstrapping can be nicely illustrated with the example of the
standard error of the mean. The arithmetic mean is calculated for a
simple random sample. The standard error of the mean is the standard
deviation of the expected distribution of sample means around the
mean of the statistical population.

\begin{figure}[tp]
  \includegraphics[width=1\textwidth]{bootstrapsem}
  \titlecaption{\label{bootstrapsemfig}Bootstrapping the standard
    error of the mean.}{The --- usually unknown --- sampling
    distribution of the mean is centered around the true mean of the
    statistical population ($\mu=0$, red). The bootstrap distribution
    of the means calculated for many bootstrapped samples has the
    same shape as the sampling distribution but is centered around
    the mean of the SRS used for resampling. The standard deviation
    of the bootstrap distribution (blue) is thus an estimator for the
    standard error of the mean.}
\end{figure}

Via bootstrapping we create a distribution of mean values
(\figref{bootstrapsemfig}); the standard deviation of this
distribution is an estimate of the standard error of the mean.

\pagebreak[4]
\begin{exercise}{bootstrapsem.m}{bootstrapsem.out}
  Create the distribution of mean values from bootstrapped samples
  resampled from a single SRS. Use this distribution to estimate the
  standard error of the mean.
  \begin{enumerate}
  \item Draw 1000 normally distributed random numbers and calculate
    the mean, the standard deviation, and the standard error
    ($\sigma/\sqrt{n}$).
  \item Resample the data 1000 times (drawing with replacement) and
    calculate the mean of each bootstrapped sample.
  \item Plot a histogram of the resulting distribution and calculate
    its mean and standard deviation. Compare them with the original
    values based on the statistical population.
  \end{enumerate}
\end{exercise}

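A possible implementation of these steps could look as follows. This
is a minimal sketch that is not necessarily identical to the solution
in \texttt{bootstrapsem.m}:

\begin{lstlisting}
n = 1000;
x = randn(n, 1);              % SRS from a standard normal population
sem = std(x)/sqrt(n);         % classical standard error of the mean
nboot = 1000;
bmeans = zeros(nboot, 1);
for i = 1:nboot
    xb = x(randi(n, n, 1));   % resample with replacement
    bmeans(i) = mean(xb);
end
bsem = std(bmeans);           % bootstrap estimate of the standard error
hist(bmeans, 20);             % distribution of the bootstrapped means
\end{lstlisting}
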
\section{Permutation tests}

Statistical tests ask for the probability that a measured value
originates from the null hypothesis. If this probability is smaller
than the desired significance level, the null hypothesis may be
rejected.

Traditionally, such probabilities are taken from theoretical
distributions which are based on assumptions about the data. Thus the
applied statistical test has to be appropriate for the type of
data. An alternative approach is to estimate the distribution of the
null hypothesis directly from the data itself. To do so, we resample
the data from the SRS in a way that implements the null
hypothesis. Such permutation operations destroy the feature of
interest while conserving all other features of the data.

\begin{figure}[tp]
  \includegraphics[width=1\textwidth]{permutecorrelation}
  \titlecaption{\label{permutecorrelationfig}Permutation test for
    correlations.}{Let the correlation coefficient of a dataset with
    200 samples be $\rho=0.21$. The null distribution, obtained from
    the correlation coefficients of the permuted and therefore
    uncorrelated datasets, is centered around zero (yellow). The
    measured correlation coefficient is larger than the 95\,\%
    percentile of the null distribution. The null hypothesis may thus
    be rejected and the measured correlation is statistically
    significant.}
\end{figure}

A good example for the application of a permutation test is the
statistical assessment of correlations. Given are measured pairs of
data points $(x_i, y_i)$. By calculating the correlation coefficient
we can quantify how strongly $y$ depends on $x$. The correlation
coefficient alone, however, does not tell whether it is statistically
significantly different from a random correlation. The null
hypothesis for such a situation is that $y$ does not depend on
$x$. In order to perform a permutation test, we destroy the
correlation by permuting the $(x_i, y_i)$ pairs, i.e. we rearrange
the $x_i$ and $y_i$ values in a random fashion. By creating many sets
of randomized pairs and calculating the resulting correlation
coefficients, we obtain a distribution of correlation coefficients
that arise from randomness alone. From this distribution we can
directly read off the statistical significance of the measured
correlation (figure\,\ref{permutecorrelationfig}).

\begin{exercise}{correlationsignificance.m}{correlationsignificance.out}
  Estimate the statistical significance of a correlation coefficient.
  \begin{enumerate}
  \item Create pairs of $(x_i, y_i)$ values. Randomly choose
    $x$-values and calculate the respective $y$-values according to
    $y=0.2 \cdot x$ plus a random value drawn from a normal
    distribution.
  \item Calculate the correlation coefficient.
  \item Generate the distribution of the null hypothesis by creating
    uncorrelated pairs. For this, permute the $x$- and $y$-values
    (\matlabfun{randperm()}) 1000 times and calculate the correlation
    coefficient for each permutation.
  \item Determine the 95\,\% percentile of the resulting null
    hypothesis distribution and compare it with the correlation
    coefficient calculated for the original data.
  \end{enumerate}
\end{exercise}

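A possible implementation of such a permutation test could look as
follows. This is a minimal sketch with simulated data that is not
necessarily identical to the solution in
\texttt{correlationsignificance.m}; it uses only the base MATLAB
functions \matlabfun{corrcoef()} and \matlabfun{randperm()}:

\begin{lstlisting}
n = 200;
x = randn(n, 1);
y = 0.2*x + randn(n, 1);      % pairs with a weak linear dependence
c = corrcoef(x, y);
r = c(1, 2);                  % measured correlation coefficient
nperm = 1000;
rs = zeros(nperm, 1);
for i = 1:nperm
    yp = y(randperm(n));      % permutation destroys the correlation
    c = corrcoef(x, yp);
    rs(i) = c(1, 2);
end
rs = sort(rs);
rcrit = rs(ceil(0.95*nperm)); % 95% percentile of the null distribution
issignificant = r > rcrit;
\end{lstlisting}
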
\selectlanguage{english}