new command \endeterm for english terms that also make an entry into the german index - not working yet
@@ -33,10 +33,11 @@ population. Rather, we draw samples (\enterm{simple random sample}
 then estimate a statistical measure of interest (e.g. the average
 length of the pickles) within this sample and hope that it is a good
 approximation of the unknown and immeasurable true average length of
-the population (\determ{Populationsparameter}). We apply statistical
-methods to find out how precise this approximation is.
+the population (\endeterm{Populationsparameter}{population
+parameter}). We apply statistical methods to find out how precise
+this approximation is.
 
-If we could draw a large number of \enterm{simple random samples} we
+If we could draw a large number of simple random samples we
 could calculate the statistical measure of interest for each sample
 and estimate its probability distribution using a histogram. This
 distribution is called the \enterm{sampling distribution}
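The idea of a sampling distribution can be simulated directly. The chapter's exercises are written in MATLAB; the following is a minimal Python sketch, with the population parameters (mean 10, standard deviation 2), sample size, and seed chosen purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical population of pickle lengths; in practice the true
# parameters are unknown, here we fix them to illustrate the idea.
true_mean, true_std = 10.0, 2.0

# Draw many simple random samples and compute each sample's mean:
sample_means = np.array([rng.normal(true_mean, true_std, size=50).mean()
                         for _ in range(2000)])

# The distribution of these means is the sampling distribution;
# its standard deviation is the standard error of the mean.
print(sample_means.mean(), sample_means.std())
```

The printed standard deviation approximates the standard error of the mean, here sigma/sqrt(n) = 2/sqrt(50), roughly 0.28.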
@@ -69,17 +70,18 @@ error of the mean which is the standard deviation of the sampling
 distribution of average values around the true mean of the population
 (\subfigref{bootstrapsamplingdistributionfig}{b}).
 
-Alternatively, we can use ``bootstrapping'' to generate new samples
-from one set of measurements (resampling). From these bootstrapped
-samples we compute the desired statistical measure and estimate their
-distribution (\enterm{bootstrap distribution},
-\subfigref{bootstrapsamplingdistributionfig}{c}). Interestingly, this
-distribution is very similar to the sampling distribution regarding
-its width. The only difference is that the bootstrapped values are
-distributed around the measure of the original sample and not the one
-of the statistical population. We can use the bootstrap distribution
-to draw conclusion regarding the precision of our estimation (e.g.
-standard errors and confidence intervals).
+Alternatively, we can use \enterm{bootstrapping}
+(\determ{Bootstrap-Verfahren}) to generate new samples from one set of
+measurements (\endeterm{Resampling}{resampling}). From these
+bootstrapped samples we compute the desired statistical measure and
+estimate their distribution (\endeterm{Bootstrapverteilung}{bootstrap
+distribution}, \subfigref{bootstrapsamplingdistributionfig}{c}).
+Interestingly, this distribution is very similar to the sampling
+distribution regarding its width. The only difference is that the
+bootstrapped values are distributed around the measure of the original
+sample and not the one of the statistical population. We can use the
+bootstrap distribution to draw conclusions regarding the precision of
+our estimation (e.g. standard errors and confidence intervals).
 
 Bootstrapping methods create bootstrapped samples from a SRS by
 resampling. The bootstrapped samples are used to estimate the sampling
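The resampling step just described can be sketched as follows. The chapter's exercises use MATLAB's `randi()`; this hypothetical Python sketch uses numpy's `integers()` for the same purpose, with the sample size and distribution chosen only for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
sample = rng.normal(10.0, 2.0, size=100)  # one simple random sample (SRS)
n = len(sample)

# One bootstrapped sample: draw n indices with replacement
# (the role played by randi() in the MATLAB exercises):
bootstrapped = sample[rng.integers(0, n, size=n)]

# Repeating this many times and computing the mean of each
# bootstrapped sample yields the bootstrap distribution of the mean:
boot_means = np.array([sample[rng.integers(0, n, size=n)].mean()
                       for _ in range(2000)])
print(boot_means.std())  # its width approximates the standard error
```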
@@ -93,11 +95,12 @@ data set using the \code{randi()} function.
 
 \section{Bootstrap of the standard error}
 
-Bootstrapping can be nicely illustrated at the example of the standard
-error of the mean. The arithmetic mean is calculated for a simple
-random sample. The standard error of the mean is the standard
-deviation of the expected distribution of mean values around the mean
-of the statistical population.
+Bootstrapping can be nicely illustrated with the example of the
+\enterm{standard error} of the mean (\determ{Standardfehler}). The
+arithmetic mean is calculated for a simple random sample. The standard
+error of the mean is the standard deviation of the expected
+distribution of mean values around the mean of the statistical
+population.
 
 \begin{figure}[tp]
   \includegraphics[width=1\textwidth]{bootstrapsem}
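A minimal Python sketch of this bootstrap estimate of the standard error, compared against the usual analytical estimate sigma/sqrt(n); the sample size and seed are arbitrary, and the chapter's own implementation is the MATLAB exercise:

```python
import numpy as np

rng = np.random.default_rng(1)
sample = rng.normal(0.0, 1.0, size=200)  # one simple random sample
n = len(sample)

# Bootstrap: standard deviation of the means of resampled data sets
boot_means = np.array([rng.choice(sample, size=n, replace=True).mean()
                       for _ in range(5000)])
sem_bootstrap = boot_means.std()

# Analytical estimate of the standard error of the mean
sem_analytical = sample.std(ddof=1) / np.sqrt(n)
print(sem_bootstrap, sem_analytical)  # the two nearly agree
```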
@@ -135,9 +138,10 @@ distribution is the standard error of the mean.
 
 
 \section{Permutation tests}
-Statistical tests ask for the probability of a measured value
-to originate from a null hypothesis. Is this probability smaller than
-the desired significance level, the null hypothesis may be rejected.
+Statistical tests ask for the probability of a measured value to
+originate from a null hypothesis. If this probability is smaller than
+the desired \endeterm{Signifikanz}{significance level}, the
+\endeterm{Nullhypothese}{null hypothesis} may be rejected.
 
 Traditionally, such probabilities are taken from theoretical
 distributions which are based on assumptions about the data. Thus the
@@ -161,22 +165,25 @@ while we conserve all other statistical properties of the data.
 statistically significant.}
 \end{figure}
 
-A good example for the application of a permutaion test is the
-statistical assessment of correlations. Given are measured pairs of
-data points $(x_i, y_i)$. By calculating the correlation coefficient
-we can quantify how strongly $y$ depends on $x$. The correlation
-coefficient alone, however, does not tell whether the correlation is
-significantly different from a random correlation. The null hypothesis
-for such a situation is that $y$ does not depend on $x$. In
-order to perform a permutation test, we need to destroy the
-correlation by permuting the $(x_i, y_i)$ pairs, i.e. we rearrange the
-$x_i$ and $y_i$ values in a random fashion. Generating many sets of
-random pairs and computing the resulting correlation coefficients
-yields a distribution of correlation coefficients that result
-randomly from uncorrelated data. By comparing the actually measured
-correlation coefficient with this distribution we can directly assess
-the significance of the correlation
-(figure\,\ref{permutecorrelationfig}).
+A good example of the application of a
+\endeterm{Permutationstest}{permutation test} is the statistical
+assessment of \endeterm[correlation]{Korrelation}{correlations}. We are
+given measured pairs of data points $(x_i, y_i)$. By calculating the
+\endeterm[correlation!correlation
+coefficient]{Korrelation!Korrelationskoeffizient}{correlation
+coefficient} we can quantify how strongly $y$ depends on $x$. The
+correlation coefficient alone, however, does not tell whether the
+correlation is significantly different from a random correlation. The
+\endeterm[]{Nullhypothese}{null hypothesis} for such a situation is that
+$y$ does not depend on $x$. In order to perform a permutation test, we
+need to destroy the correlation by permuting the $(x_i, y_i)$ pairs,
+i.e. we rearrange the $x_i$ and $y_i$ values in a random
+fashion. Generating many sets of random pairs and computing the
+resulting correlation coefficients yields a distribution of
+correlation coefficients that result randomly from uncorrelated
+data. By comparing the actually measured correlation coefficient with
+this distribution we can directly assess the significance of the
+correlation (figure\,\ref{permutecorrelationfig}).
 
 \begin{exercise}{correlationsignificance.m}{correlationsignificance.out}
   Estimate the statistical significance of a correlation coefficient.
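The permutation test for a correlation coefficient can be sketched in Python (the chapter implements it in the `correlationsignificance.m` exercise); the data here are purely illustrative, with a hypothetical slope of 0.5 and 100 pairs:

```python
import numpy as np

rng = np.random.default_rng(2)

# Illustrative correlated pairs (x_i, y_i); slope and noise are arbitrary
x = rng.normal(size=100)
y = 0.5 * x + rng.normal(size=100)
r_measured = np.corrcoef(x, y)[0, 1]

# Destroy the correlation by randomly re-pairing the x and y values;
# the resulting coefficients form the null distribution:
r_null = np.array([np.corrcoef(x, rng.permutation(y))[0, 1]
                   for _ in range(1000)])

# Significance: fraction of null coefficients at least as extreme
p_value = np.mean(np.abs(r_null) >= np.abs(r_measured))
print(r_measured, p_value)
```

A small `p_value` means the measured coefficient is unlikely to arise from randomly paired, uncorrelated data.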