From eef9e5f87deacb7549d03cf43ce8269640f17b4e Mon Sep 17 00:00:00 2001
From: Jan Grewe
Date: Sat, 12 Oct 2019 12:01:18 +0200
Subject: [PATCH] [Bootstrap] language fixes

---
 bootstrap/lecture/bootstrap.tex | 41 +++++++++++++++++----------------
 header.tex                      |  2 +-
 2 files changed, 22 insertions(+), 21 deletions(-)

diff --git a/bootstrap/lecture/bootstrap.tex b/bootstrap/lecture/bootstrap.tex
index e4654fc..557f122 100644
--- a/bootstrap/lecture/bootstrap.tex
+++ b/bootstrap/lecture/bootstrap.tex
@@ -31,15 +31,15 @@ average length of all pickles (\figref{statisticalpopulationfig}).
 But we cannot measure the lengths of all pickles in the statistical
 population. Rather, we draw samples (simple random sample
 \enterm[SRS|see{simple random sample}]{SRS}, in German:
-\determ{Stichprobe}). We then estimate a statistical measures
-(e.g. the average length of the pickles) within in this sample and
+\determ{Stichprobe}). We then estimate a statistical measure of interest
+(e.g. the average length of the pickles) within this sample and
 hope that it is a good approximation of the unknown and immeasurable
 real average length of the statistical population (in German aka
 \determ{Populationsparameter}). We apply statistical methods to find
-out how good this approximation is.
+out how precise this approximation is.
 
 If we could draw a large number of \textit{simple random samples} we could
-estimate the statistical measure of interest for each sample and
+calculate the statistical measure of interest for each sample and
 estimate the probability distribution using a histogram. This
 distribution is called the \enterm{sampling distribution}
 (German: \determ{Stichprobenverteilung},
@@ -85,16 +85,17 @@ of the statistical population. We can use the bootstrap
 distribution to draw conclusion regarding the precision of our
 estimation (e.g. standard errors and confidence intervals).
 
-Bootstrapping method create new SRS by resampling to estimate the
-sampling distribution of a statistical measure. The bootstrapped
-samples have the same size as the original sample and are created by
-sampling with replacement, that is, each value of the original sample
-can occur once, multiple time, or not at all in a bootstrapped sample.
+Bootstrapping methods create bootstrapped samples from a SRS by
+resampling. The bootstrapped samples are used to estimate the sampling
+distribution of a statistical measure. The bootstrapped samples have
+the same size as the original sample and are created by randomly drawing with
+replacement, that is, each value of the original sample can occur
+once, multiple time, or not at all in a bootstrapped sample.
 
 
 \section{Bootstrap of the standard error}
 
-Bootstrapping can be nicely illustrated at the example the standard
+Bootstrapping can be nicely illustrated at the example of the standard
 error of the mean. The arithmetic mean is calculated for a simple
 random sample. The standard error of the mean is the standard
 deviation of the expected distribution of mean values around the mean
@@ -120,13 +121,13 @@ distribution is the standard error of the mean.
 \pagebreak[4]
 \begin{exercise}{bootstrapsem.m}{bootstrapsem.out}
   Create the distribution of mean values from bootstrapped samples
-  resampled form a single SRS. Use this distribution to estimate the
+  resampled from a single SRS. Use this distribution to estimate the
   standard error of the mean.
   \begin{enumerate}
   \item Draw 1000 normally distributed random number and calculate the
-    mean, the standard deviation and the standard error
+    mean, the standard deviation, and the standard error
     ($\sigma/\sqrt{n}$).
-  \item Resample the data 1000 times (draw and replace) and calculate
+  \item Resample the data 1000 times (randomly draw and replace) and calculate
     the mean of each bootstrapped sample.
   \item Plot a histogram of the respective distribution and
     calculate its mean and standard deviation. Compare with the
@@ -135,7 +136,7 @@ distribution is the standard error of the mean.
 \end{exercise}
 
 
-\section{Permutationtests}
+\section{Permutation tests}
 Statistical tests ask for the probability that a measured value
 originates from the null hypothesis. Is this probability smaller than
 the desired significance level, the null hypothesis may be rejected.
@@ -182,16 +183,16 @@ statistical significance (figure\,\ref{permutecorrelationfig}).
   Estimate the statistical significance of a correlation coefficient.
   \begin{enumerate}
   \item Create pairs of $(x_i, y_i)$ values. Randomly choose $x$-values
-    and calculate the respective $y$-values according to $y=0.2 \cdot x$
-    to which you add a random value drawn from a normal distribution.
+    and calculate the respective $y$-values according to $y_i =0.2 \cdot x_i + u_i$
+    where $u_i$ is a random number drawn from a normal distribution.
   \item Calculate the correlation coefficient.
   \item Generate the distribution according to the null hypothesis by
     generating uncorrelated pairs. For this permute $x$- and $y$-values
-    (\matlabfun{randperm()}) 1000 times and calculate for each
+    \matlabfun{randperm()} 1000 times and calculate for each
     permutation the correlation coefficient.
-\item From the resulting null hypothesis distribution the 95\,\%
-  percentile and compare it with the correlation coefficient
-  calculated for the original data.
+\item Read out the 95\,\% percentile from the resulting null
+  hypothesis distribution and compare it with the correlation
+  coefficient calculated for the original data.
 \end{enumerate}
 \end{exercise}
 
diff --git a/header.tex b/header.tex
index 38fe4e2..567b1fc 100644
--- a/header.tex
+++ b/header.tex
@@ -5,7 +5,7 @@
 \author{{\LARGE Jan Grewe \& Jan Benda}\\[5ex]Abteilung Neuroethologie\\[2ex]%
 \includegraphics[width=0.3\textwidth]{UT_WBMW_Rot_RGB}\vspace{3ex}}
 
-\date{WS 2018/2019\\\vfill%
+\date{WS 2019/2020\\\vfill%
 \centerline{\includegraphics[width=0.7\textwidth]{announcements/correlationcartoon}%
 \rotatebox{90}{\footnotesize\url{www.xkcd.com}}}}
 
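Note on the bootstrapsem.m exercise touched by this patch: the procedure it describes can be sketched in a few lines of MATLAB. This is only a minimal illustration; the variable names and the use of randi() for resampling with replacement are my own choices and not taken from the actual solution script.

% Sketch: bootstrap estimate of the standard error of the mean.
% Base MATLAB only; not the official bootstrapsem.m solution.
n = 1000;
x = randn(n, 1);                      % 1000 normally distributed random numbers
semformula = std(x) / sqrt(n);        % standard error from sigma/sqrt(n)

nboot = 1000;                         % number of bootstrapped samples
bootmeans = zeros(nboot, 1);
for k = 1:nboot
    xb = x(randi(n, n, 1));           % draw with replacement, same size as x
    bootmeans(k) = mean(xb);          % mean of the bootstrapped sample
end

hist(bootmeans, 20);                  % sampling distribution of the mean
fprintf('SEM from formula:   %.4f\n', semformula);
fprintf('SEM from bootstrap:  %.4f\n', std(bootmeans));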
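The permutation test for the correlation coefficient described in the last exercise of the patch can be sketched in the same way. Again this is only an illustration under the assumption that base MATLAB functions such as corrcoef() and randperm() are used; shuffling only the y-values already destroys the pairing, which is equivalent to permuting both columns.

% Sketch: permutation test for the significance of a correlation coefficient.
% Illustrative only; the actual exercise solution may differ.
n = 200;
x = 5.0 * randn(n, 1);                % randomly chosen x-values
y = 0.2 * x + randn(n, 1);            % y_i = 0.2*x_i + u_i with normal noise u_i
c = corrcoef(x, y);
rdata = c(1, 2);                      % correlation coefficient of the data

nperm = 1000;
rperm = zeros(nperm, 1);
for k = 1:nperm
    yp = y(randperm(n));              % permuted pairs are uncorrelated
    c = corrcoef(x, yp);
    rperm(k) = c(1, 2);               % correlation under the null hypothesis
end

rsorted = sort(rperm);
rthresh = rsorted(ceil(0.95 * nperm));  % 95% percentile of the null distribution
fprintf('r = %.3f, 95%% percentile of H0: %.3f\n', rdata, rthresh);
if rdata > rthresh
    disp('the correlation is significant at the 5% level');
end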