[Bootstrap] language fixes

Jan Grewe 2019-10-12 12:01:18 +02:00
parent 39ed3c716a
commit eef9e5f87d
2 changed files with 22 additions and 21 deletions

@ -31,15 +31,15 @@ average length of all pickles (\figref{statisticalpopulationfig}). But
we cannot measure the lengths of all pickles in the statistical
population. Rather, we draw samples (simple random sample
\enterm[SRS|see{simple random sample}]{SRS}, in German:
\determ{Stichprobe}). We then estimate a statistical measures
(e.g. the average length of the pickles) within in this sample and
\determ{Stichprobe}). We then estimate a statistical measure of interest
(e.g. the average length of the pickles) within this sample and
hope that it is a good approximation of the unknown and immeasurable
real average length of the statistical population (in German aka
\determ{Populationsparameter}). We apply statistical methods to find
out how good this approximation is.
out how precise this approximation is.
If we could draw a large number of \textit{simple random samples} we could
estimate the statistical measure of interest for each sample and
calculate the statistical measure of interest for each sample and
estimate the probability distribution using a histogram. This
distribution is called the \enterm{sampling distribution} (German:
\determ{Stichprobenverteilung},
@ -85,16 +85,17 @@ of the statistical population. We can use the bootstrap distribution
to draw conclusions regarding the precision of our estimation (e.g.
standard errors and confidence intervals).
Bootstrapping method create new SRS by resampling to estimate the
sampling distribution of a statistical measure. The bootstrapped
samples have the same size as the original sample and are created by
sampling with replacement, that is, each value of the original sample
can occur once, multiple time, or not at all in a bootstrapped sample.
Bootstrapping methods create bootstrapped samples from a SRS by
resampling. The bootstrapped samples are used to estimate the sampling
distribution of a statistical measure. The bootstrapped samples have
the same size as the original sample and are created by randomly drawing with
replacement, that is, each value of the original sample can occur
once, multiple times, or not at all in a bootstrapped sample.
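The resampling step described above can be sketched in a few lines. This is a minimal illustration in Python (standard library only), not the course's MATLAB solution; the function name bootstrap_means, the seeds, and the choice of the mean as the statistical measure are assumptions made for the example.

```python
import random
import statistics


def bootstrap_means(sample, n_resamples=1000, seed=1):
    """Return the means of bootstrapped samples drawn from `sample`.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    n = len(sample)
    means = []
    for _ in range(n_resamples):
        # Draw n values with replacement: each original value can occur
        # once, several times, or not at all in the bootstrapped sample.
        resample = rng.choices(sample, k=n)
        means.append(statistics.fmean(resample))
    return means


# A single simple random sample of normally distributed numbers.
data_rng = random.Random(0)
data = [data_rng.gauss(0.0, 1.0) for _ in range(1000)]

# Bootstrap distribution of the mean and its standard deviation.
means = bootstrap_means(data)
sem_bootstrap = statistics.stdev(means)

# Classical estimate sigma / sqrt(n) for comparison.
sem_formula = statistics.stdev(data) / len(data) ** 0.5

print(sem_bootstrap, sem_formula)
```

The two printed values should be close to each other; how close depends on the sample size and on the number of resamples.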
\section{Bootstrap of the standard error}
Bootstrapping can be nicely illustrated at the example the standard
Bootstrapping can be nicely illustrated at the example of the standard
error of the mean. The arithmetic mean is calculated for a simple
random sample. The standard error of the mean is the standard
deviation of the expected distribution of mean values around the mean
@ -120,13 +121,13 @@ distribution is the standard error of the mean.
\pagebreak[4]
\begin{exercise}{bootstrapsem.m}{bootstrapsem.out}
Create the distribution of mean values from bootstrapped samples
resampled form a single SRS. Use this distribution to estimate the
resampled from a single SRS. Use this distribution to estimate the
standard error of the mean.
\begin{enumerate}
\item Draw 1000 normally distributed random numbers and calculate the
mean, the standard deviation and the standard error
mean, the standard deviation, and the standard error
($\sigma/\sqrt{n}$).
\item Resample the data 1000 times (draw and replace) and calculate
\item Resample the data 1000 times (randomly draw and replace) and calculate
the mean of each bootstrapped sample.
\item Plot a histogram of the respective distribution and calculate its mean and
standard deviation. Compare with the
@ -182,16 +183,16 @@ statistical significance (figure\,\ref{permutecorrelationfig}).
Estimate the statistical significance of a correlation coefficient.
\begin{enumerate}
\item Create pairs of $(x_i, y_i)$ values. Randomly choose $x$-values
and calculate the respective $y$-values according to $y=0.2 \cdot x$
to which you add a random value drawn from a normal distribution.
and calculate the respective $y$-values according to $y_i =0.2 \cdot x_i + u_i$
where $u_i$ is a random number drawn from a normal distribution.
\item Calculate the correlation coefficient.
\item Generate the distribution according to the null hypothesis by
generating uncorrelated pairs. For this permute $x$- and $y$-values
(\matlabfun{randperm()}) 1000 times and calculate for each
\matlabfun{randperm()} 1000 times and calculate for each
permutation the correlation coefficient.
\item From the resulting null hypothesis distribution the 95\,\%
percentile and compare it with the correlation coefficient
calculated for the original data.
\item Read out the 95\,\% percentile from the resulting null
hypothesis distribution and compare it with the correlation
coefficient calculated for the original data.
\end{enumerate}
\end{exercise}
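The steps of this exercise can be sketched as follows. This is a hedged illustration in Python (standard library only) rather than the MATLAB solution the exercise asks for; the helper pearson_r, the sample size of 200, the unit-variance noise, and the seed are assumptions. Shuffling only the $y$-values is used here because it is sufficient to destroy the pairing between $x_i$ and $y_i$.

```python
import random
import statistics


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(42)
n = 200  # assumed sample size

# Correlated pairs: y_i = 0.2 * x_i + u_i with normally distributed u_i.
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
y = [0.2 * xi + rng.gauss(0.0, 1.0) for xi in x]
r_data = pearson_r(x, y)

# Null hypothesis distribution: permute y to break the pairing, 1000 times.
null_rs = []
for _ in range(1000):
    y_perm = y[:]
    rng.shuffle(y_perm)
    null_rs.append(pearson_r(x, y_perm))

# Read out the 95% percentile of the null hypothesis distribution.
null_rs.sort()
r95 = null_rs[(95 * len(null_rs)) // 100]

print(r_data, r95, r_data > r95)
```

If the correlation coefficient of the original data exceeds the 95\,\% percentile, fewer than 5\,\% of the permuted data sets produced a correlation that large.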

@ -5,7 +5,7 @@
\author{{\LARGE Jan Grewe \& Jan Benda}\\[5ex]Abteilung Neuroethologie\\[2ex]%
\includegraphics[width=0.3\textwidth]{UT_WBMW_Rot_RGB}\vspace{3ex}}
\date{WS 2018/2019\\\vfill%
\date{WS 2019/2020\\\vfill%
\centerline{\includegraphics[width=0.7\textwidth]{announcements/correlationcartoon}%
\rotatebox{90}{\footnotesize\url{www.xkcd.com}}}}