[bootstrap] generalized intro
This commit is contained in:
parent
3dd0660b21
commit
b76b2d35cc
@ -23,7 +23,14 @@ This chapter easily covers two lectures:
\item 1. Bootstrapping with a proper introduction of confidence intervals
\item 2. Permutation test with a proper introduction of statistical tests (distribution of the null hypothesis, significance, power, etc.)
\end{itemize}

ToDo:
\begin{itemize}
\item Add jackknife methods to bootstrapping
\item Add discussion of confidence intervals to the descriptive statistics chapter
\item Have a separate chapter on statistical tests beforehand: what is
the essence of a statistical test (null hypothesis distribution),
power analysis, and a few examples of existing functions for
statistical tests.
\end{itemize}

\end{document}

@ -5,20 +5,24 @@
\exercisechapter{Resampling methods}


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\entermde{Resampling methoden}{Resampling methods} are applied to
generate distributions of statistical measures via resampling of
existing samples. Resampling offers several advantages:
\begin{itemize}
\item Fewer assumptions (e.g. a measured sample does not need to be
normally distributed).
\item Increased precision as compared to classical methods. %such as?
\item General applicability: the resampling methods are very
similar for different statistics and there is no need to specialize
the method to specific statistical measures.
\end{itemize}

Resampling methods can be used both for estimating the precision of
estimated statistics (e.g. the standard error of the mean, confidence
intervals) and for testing for significance.


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Bootstrapping}

\begin{figure}[tp]
\includegraphics[width=0.8\textwidth]{2012-10-29_16-26-05_771}\\[2ex]
@ -88,20 +92,20 @@ of the statistical population. We can use the bootstrap distribution
to draw conclusions regarding the precision of our estimate (e.g.
standard errors and confidence intervals).

Bootstrapping methods generate bootstrapped samples from a SRS by
resampling. The bootstrapped samples are used to estimate the sampling
distribution of a statistical measure. The bootstrapped samples have
the same size as the original sample and are generated by randomly
drawing with replacement. That is, each value of the original sample
can occur once, multiple times, or not at all in a bootstrapped
sample. This can be implemented by generating random indices into the
data set using the \code{randi()} function.

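For illustration, here is a minimal sketch of drawing a single
bootstrapped sample from a data vector \code{x} (the variable names
are chosen for this example only and are not taken from the
accompanying scripts):
\begin{verbatim}
n = length(x);          % size of the original sample
idx = randi(n, n, 1);   % n random indices into x, drawn with replacement
xb = x(idx);            % bootstrapped sample of the same size as x
\end{verbatim}
Because the indices are drawn with replacement, some elements of
\code{x} show up several times in \code{xb} and others not at all.
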
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Bootstrap the standard error}

Bootstrapping can be nicely illustrated with the example of the
\enterm{standard error} of the mean (\determ{Standardfehler}). The
arithmetic mean is calculated for a simple random sample. The standard
error of the mean is the standard deviation of the expected
@ -121,9 +125,9 @@ population.
the standard error of the mean.}
\end{figure}

Via bootstrapping we generate a distribution of mean values
(\figref{bootstrapsemfig}) and the standard deviation of this
distribution is the standard error of the sample mean.

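A minimal sketch of this procedure (assuming a column vector \code{x}
holding the original sample; the number of resamplings \code{nb} and
all variable names are chosen for this illustration only):
\begin{verbatim}
nb = 1000;                       % number of bootstrapped samples
n = length(x);                   % size of the original sample
mb = mean(x(randi(n, n, nb)));   % means of nb bootstrapped samples
sem = std(mb);                   % bootstrap estimate of the standard error
\end{verbatim}
The exercise below asks for a complete implementation of this idea.
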
\begin{exercise}{bootstrapsem.m}{bootstrapsem.out}
Create the distribution of mean values from bootstrapped samples
@ -148,17 +152,18 @@ distribution is the standard error of the mean.
Statistical tests ask for the probability that a measured value
originates from a null hypothesis. If this probability is smaller than
the desired \entermde{Signifikanz}{significance level}, the
\entermde{Nullhypothese}{null hypothesis} can be rejected.

Traditionally, such probabilities are taken from theoretical
distributions which have been derived based on some assumptions about
the data. For example, the data should be normally distributed. Given
some data one has to find an appropriate test that matches the
properties of the data. An alternative approach is to calculate the
probability density of the null hypothesis directly from the data
themselves. To do so, we need to resample the data according to the
null hypothesis from the SRS. By such permutation operations we
destroy the feature of interest while conserving all other statistical
properties of the data.

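As an illustration of this idea, here is a minimal sketch of a
permutation test for a difference in means (the samples \code{x} and
\code{y}, the number of permutations \code{nperm}, and all other names
are chosen for this example only):
\begin{verbatim}
nperm = 1000;                          % number of permutations
d = mean(x) - mean(y);                 % observed difference in means
pooled = [x(:); y(:)];                 % labels do not matter under H0
nx = length(x);
dperm = zeros(nperm, 1);
for k = 1:nperm
    p = pooled(randperm(length(pooled)));          % shuffle the pooled data
    dperm(k) = mean(p(1:nx)) - mean(p(nx+1:end));  % difference of permuted groups
end
pval = sum(abs(dperm) >= abs(d)) / nperm;  % fraction of at least as extreme differences
\end{verbatim}
Shuffling the pooled data destroys the feature of interest (a possible
difference between the two groups) while conserving all other
statistical properties of the data.
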
\subsection{Significance of a difference in the mean}