New command \endeterm for English terms that also add an entry to the German index - not working yet
parent 006fa998cc
commit 4d2bedd78c
@@ -33,10 +33,11 @@ population. Rather, we draw samples (\enterm{simple random sample}
 then estimate a statistical measure of interest (e.g. the average
 length of the pickles) within this sample and hope that it is a good
 approximation of the unknown and immeasurable true average length of
-the population (\determ{Populationsparameter}). We apply statistical
-methods to find out how precise this approximation is.
+the population (\endeterm{Populationsparameter}{population
+parameter}). We apply statistical methods to find out how precise
+this approximation is.
 
-If we could draw a large number of \enterm{simple random samples} we
+If we could draw a large number of simple random samples we
 could calculate the statistical measure of interest for each sample
 and estimate its probability distribution using a histogram. This
 distribution is called the \enterm{sampling distribution}
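The sampling distribution described in this hunk can be simulated directly. A minimal MATLAB sketch, not part of this commit; the population parameters mu and sigma and the sample sizes are assumed values for illustration:

% Estimate the sampling distribution of the mean by drawing many
% simple random samples from a known population.
mu = 20.0;        % assumed true population mean
sigma = 2.0;      % assumed true population standard deviation
n = 50;           % size of each simple random sample
nsamples = 1000;  % number of samples to draw
means = zeros(nsamples, 1);
for i = 1:nsamples
    x = mu + sigma * randn(n, 1);  % one simple random sample
    means(i) = mean(x);            % statistical measure of interest
end
hist(means, 20);  % histogram approximates the sampling distribution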
@@ -69,17 +70,18 @@ error of the mean which is the standard deviation of the sampling
 distribution of average values around the true mean of the population
 (\subfigref{bootstrapsamplingdistributionfig}{b}).
 
-Alternatively, we can use ``bootstrapping'' to generate new samples
-from one set of measurements (resampling). From these bootstrapped
-samples we compute the desired statistical measure and estimate their
-distribution (\enterm{bootstrap distribution},
-\subfigref{bootstrapsamplingdistributionfig}{c}). Interestingly, this
-distribution is very similar to the sampling distribution regarding
-its width. The only difference is that the bootstrapped values are
-distributed around the measure of the original sample and not the one
-of the statistical population. We can use the bootstrap distribution
-to draw conclusions regarding the precision of our estimation (e.g.
-standard errors and confidence intervals).
+Alternatively, we can use \enterm{bootstrapping}
+(\determ{Bootstrap-Verfahren}) to generate new samples from one set of
+measurements (\endeterm{Resampling}{resampling}). From these
+bootstrapped samples we compute the desired statistical measure and
+estimate their distribution (\endeterm{Bootstrapverteilung}{bootstrap
+distribution}, \subfigref{bootstrapsamplingdistributionfig}{c}).
+Interestingly, this distribution is very similar to the sampling
+distribution regarding its width. The only difference is that the
+bootstrapped values are distributed around the measure of the original
+sample and not the one of the statistical population. We can use the
+bootstrap distribution to draw conclusions regarding the precision of
+our estimation (e.g. standard errors and confidence intervals).
 
 Bootstrapping methods create bootstrapped samples from an SRS by
 resampling. The bootstrapped samples are used to estimate the sampling
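The resampling step that turns an SRS into a bootstrapped sample (the next hunk refers to the \code{randi()} function) reduces to a few lines. A sketch with placeholder data values:

% One bootstrapped sample: draw n values from the original data
% with replacement, using random indices from randi().
x = [2.1 3.4 2.9 3.1 2.5 3.8];  % original sample (placeholder values)
n = length(x);
indices = randi(n, 1, n);  % n random indices between 1 and n
xboot = x(indices);        % bootstrapped sample, same size as x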
@@ -93,11 +95,12 @@ data set using the \code{randi()} function.
 
 \section{Bootstrap of the standard error}
 
-Bootstrapping can be nicely illustrated with the example of the standard
-error of the mean. The arithmetic mean is calculated for a simple
-random sample. The standard error of the mean is the standard
-deviation of the expected distribution of mean values around the mean
-of the statistical population.
+Bootstrapping can be nicely illustrated with the example of the
+\enterm{standard error} of the mean (\determ{Standardfehler}). The
+arithmetic mean is calculated for a simple random sample. The standard
+error of the mean is the standard deviation of the expected
+distribution of mean values around the mean of the statistical
+population.
 
 \begin{figure}[tp]
 \includegraphics[width=1\textwidth]{bootstrapsem}
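A minimal sketch of the procedure this section describes; the data vector and the number of resamplings are illustrative assumptions, not taken from the book's scripts:

% Bootstrap the standard error of the mean: resample many times,
% take the mean of each bootstrapped sample, and compute the
% standard deviation of these means.
x = 5.0 + randn(100, 1);  % stand-in data; any measured sample works
n = length(x);
nresamples = 1000;
bootmeans = zeros(nresamples, 1);
for i = 1:nresamples
    xboot = x(randi(n, n, 1));  % resample with replacement
    bootmeans(i) = mean(xboot);
end
sem_boot = std(bootmeans)       % bootstrap estimate of the SEM
sem_classic = std(x)/sqrt(n)    % classical estimate, for comparison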
@@ -135,9 +138,10 @@ distribution is the standard error of the mean.
 
 
 \section{Permutation tests}
-Statistical tests ask for the probability that a measured value
-originates from a null hypothesis. If this probability is smaller than
-the desired significance level, the null hypothesis may be rejected.
+Statistical tests ask for the probability that a measured value
+originates from a null hypothesis. If this probability is smaller than the
+desired \endeterm{Signifikanz}{significance level}, the
+\endeterm{Nullhypothese}{null hypothesis} may be rejected.
 
 Traditionally, such probabilities are taken from theoretical
 distributions which are based on assumptions about the data. Thus the
@@ -161,22 +165,25 @@ while we conserve all other statistical properties of the data.
 statistically significant.}
 \end{figure}
 
-A good example of the application of a permutation test is the
-statistical assessment of correlations. We are given measured pairs
-of data points $(x_i, y_i)$. By calculating the correlation coefficient
-we can quantify how strongly $y$ depends on $x$. The correlation
-coefficient alone, however, does not tell whether the correlation is
-significantly different from a random correlation. The null hypothesis
-for such a situation is that $y$ does not depend on $x$. In
-order to perform a permutation test, we need to destroy the
-correlation by permuting the $(x_i, y_i)$ pairs, i.e. we rearrange the
-$x_i$ and $y_i$ values in a random fashion. Generating many sets of
-random pairs and computing the resulting correlation coefficients
-yields a distribution of correlation coefficients that result
-randomly from uncorrelated data. By comparing the actually measured
-correlation coefficient with this distribution we can directly assess
-the significance of the correlation
-(figure\,\ref{permutecorrelationfig}).
+A good example of the application of a
+\endeterm{Permutationstest}{permutation test} is the statistical
+assessment of \endeterm[correlation]{Korrelation}{correlations}. We
+are given measured pairs of data points $(x_i, y_i)$. By calculating the
+\endeterm[correlation!correlation
+coefficient]{Korrelation!Korrelationskoeffizient}{correlation
+coefficient} we can quantify how strongly $y$ depends on $x$. The
+correlation coefficient alone, however, does not tell whether the
+correlation is significantly different from a random correlation. The
+\endeterm[]{Nullhypothese}{null hypothesis} for such a situation is that
+$y$ does not depend on $x$. In order to perform a permutation test, we
+need to destroy the correlation by permuting the $(x_i, y_i)$ pairs,
+i.e. we rearrange the $x_i$ and $y_i$ values in a random
+fashion. Generating many sets of random pairs and computing the
+resulting correlation coefficients yields a distribution of
+correlation coefficients that result randomly from uncorrelated
+data. By comparing the actually measured correlation coefficient with
+this distribution we can directly assess the significance of the
+correlation (figure\,\ref{permutecorrelationfig}).
 
 \begin{exercise}{correlationsignificance.m}{correlationsignificance.out}
   Estimate the statistical significance of a correlation coefficient.
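A permutation test along these lines might look as follows. This is a sketch with simulated data, not the actual content of correlationsignificance.m:

% Permutation test for the significance of a correlation coefficient.
n = 200;
x = randn(n, 1);
y = 0.2*x + randn(n, 1);  % weakly correlated data (simulated)
rmeasured = corr(x, y);   % measured correlation coefficient
nperm = 1000;
rshuffled = zeros(nperm, 1);
for i = 1:nperm
    yperm = y(randperm(n));         % permuting destroys the correlation
    rshuffled(i) = corr(x, yperm);  % correlation of uncorrelated data
end
% two-sided p-value: fraction of shuffled coefficients at least as
% extreme as the measured one
p = mean(abs(rshuffled) >= abs(rmeasured))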
header.tex
@@ -212,9 +212,19 @@
 
 %%%%% english, german, code and file terms: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
 \usepackage{ifthen}
+
+% \enterm[english index entry]{<english term>}
 \newcommand{\enterm}[2][]{\tr{\textit{#2}}{``#2''}\ifthenelse{\equal{#1}{}}{\tr{\protect\sindex[term]{#2}}{\protect\sindex[enterm]{#2}}}{\tr{\protect\sindex[term]{#1}}{\protect\sindex[enterm]{#1}}}}
+
+% \endeterm[english index entry]{<german index entry>}{<english term>}
+\newcommand{\endeterm}[3][]{\tr{\textit{#3}}{``#3''}\ifthenelse{\equal{#1}{}}{\tr{\protect\sindex[term]{#3}}{\protect\sindex[enterm]{#3}}}{\tr{\protect\sindex[term]{#1}}{\protect\sindex[enterm]{#1}}}\protect\sindex[determ]{#2}}
+
+% \determ[index entry]{<german term>}
 \newcommand{\determ}[2][]{\tr{``#2''}{\textit{#2}}\ifthenelse{\equal{#1}{}}{\tr{\protect\sindex[determ]{#2}}{\protect\sindex[term]{#2}}}{\tr{\protect\sindex[determ]{#1}}{\protect\sindex[term]{#1}}}}
+
+% \codeterm[index entry]{<code>}
 \newcommand{\codeterm}[2][]{\textit{#2}\ifthenelse{\equal{#1}{}}{\protect\sindex[term]{#2}}{\protect\sindex[term]{#1}}}
+
 \newcommand{\file}[1]{\texttt{#1}}
 
 % for escaping special characters into the index:
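To make the new macro's calling convention concrete, a LaTeX usage sketch mirroring the chapter diff above; the rendered output and index entries depend on the \tr and \sindex definitions elsewhere in header.tex, and the commit message notes the macro is not working yet:

% English term in the text, with entries in both the English and the
% German index:
\endeterm{Populationsparameter}{population parameter}
% With the optional argument, an explicit English index entry is used:
\endeterm[correlation!correlation coefficient]{Korrelation!Korrelationskoeffizient}{correlation coefficient}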
@@ -455,7 +455,7 @@ bivariate or multivariate data sets where we have pairs or tuples of
 data values (e.g. size and weight of elephants) we want to analyze
 dependencies between the variables.
 
-The \enterm{correlation coefficient}
+The \enterm[correlation!correlation coefficient]{correlation coefficient}
 \begin{equation}
   \label{correlationcoefficient}
   r_{x,y} = \frac{Cov(x,y)}{\sigma_x \sigma_y} = \frac{\langle
@@ -465,7 +465,7 @@ The \enterm{correlation coefficient}
 \end{equation}
 quantifies linear relationships between two variables
 \matlabfun{corr()}. The correlation coefficient is the
-\determ{covariance} normalized by the standard deviations of the
+\enterm{covariance} normalized by the standard deviations of the
 single variables. Perfectly correlated variables result in a
 correlation coefficient of $+1$, anti-correlated or negatively
 correlated data in a correlation coefficient of $-1$ and un-correlated
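The normalization this hunk describes is easy to verify numerically; a quick sketch with simulated data:

% The correlation coefficient is the covariance normalized by the
% standard deviations: r = Cov(x,y)/(sigma_x * sigma_y).
x = randn(500, 1);
y = 0.8*x + 0.3*randn(500, 1);  % correlated data (simulated)
c = cov(x, y);                  % 2x2 covariance matrix
r1 = c(1, 2)/(std(x)*std(y));   % normalized covariance
r2 = corr(x, y)                 % same value from corr()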