added bootstrap of correlation coefficient
parent 5d3f4453d5
commit 9c462f6070
37
bootstrap/exercises/correlationbootstrap.m
Normal file
@ -0,0 +1,37 @@
%% (a) bootstrap:
nperm = 1000;
rb = zeros(nperm,1);
for i=1:nperm
    % indices for resampling the data:
    inx = randi(length(x), length(x), 1);
    % resampled data pairs:
    xb=x(inx);
    yb=y(inx);
    rb(i) = corr(xb, yb);
end

%% (b) pdf of the correlation coefficients:
[hb,bb] = hist(rb, 20 );
hb = hb/sum(hb)/(bb(2)-bb(1));  % normalization

%% (c) significance:
rbq = quantile(rb, 0.05);
fprintf('correlation coefficient at 5%% significance = %.2f\n', rbq );
if rbq > 0.0
    fprintf('--> correlation r=%.2f is significant\n', rd);
else
    fprintf('--> r=%.2f is not a significant correlation\n', rd);
end

%% plot:
hold on;
bar(b, h, 'facecolor', [0.5 0.5 0.5]);
bar(bb, hb, 'facecolor', 'b');
bar(bb(bb<=rbq), hb(bb<=rbq), 'facecolor', 'r');
plot( [rd rd], [0 4], 'r', 'linewidth', 2 );
xlim([-0.25 0.75])
xlabel('Correlation coefficient');
ylabel('Probability density');
hold off;

savefigpdf( gcf, 'correlationbootstrap.pdf', 12, 6 );
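The script in correlationbootstrap.m reads `x`, `y`, `rd`, `b`, and `h` without defining them, so it presumably runs in the workspace left behind by correlationsignificance.m, which is not part of this commit. The following is a minimal self-contained sketch of the bootstrap part only, under that assumption. The data generation is copied from the exercise sheet; using `corrcoef` instead of `corr` and a sort-based percentile instead of `quantile` are substitutions made here to avoid toolbox dependencies, and the name `nboot` is introduced for illustration.

```matlab
% Sketch only, not the committed solution.
% Data as specified in the exercise sheet:
n = 1000;
a = 0.2;
x = randn(n, 1);
y = randn(n, 1) + a*x;

% correlation coefficient of the original data:
c = corrcoef(x, y);
rd = c(1, 2);

% bootstrap: resample the (x, y) pairs with replacement
nboot = 1000;                  % hypothetical name, plays the role of nperm
rb = zeros(nboot, 1);
for i = 1:nboot
    inx = randi(n, n, 1);      % the same indices for x and y keep the pairs intact
    cb = corrcoef(x(inx), y(inx));
    rb(i) = cb(1, 2);
end

% 5% percentile of the bootstrapped coefficients (one-sided test),
% obtained by sorting so that quantile() is not required:
rs = sort(rb);
rbq = rs(ceil(0.05*nboot));
fprintf('r = %.2f, 5%% percentile of bootstrap = %.2f\n', rd, rbq);
if rbq > 0.0
    fprintf('--> correlation is significant at the 5%% level\n');
else
    fprintf('--> correlation is not significant at the 5%% level\n');
end
```

Resampling with one shared index vector is the essential design choice: drawing separate indices for `x` and `y` would destroy the pairing and turn the bootstrap into something closer to the permutation test.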
BIN
bootstrap/exercises/correlationbootstrap.pdf
Normal file
Binary file not shown.
@ -148,32 +148,56 @@ distributed?
 \continue
-\question \qt{Permutation test}
+\question \qt{Permutation test} \label{permutationtest}
 We want to compute the significance of a correlation by means of a permutation test.
 \begin{parts}
-\part Generate 1000 correlated pairs $x$, $y$ of random numbers according to:
+\part \label{permutationtestdata} Generate 1000 correlated pairs
+$x$, $y$ of random numbers according to:
 \begin{verbatim}
 n = 1000
 a = 0.2;
 x = randn(n, 1);
 y = randn(n, 1) + a*x;
 \end{verbatim}
-\part Generate a scatter plot of the two variables.
-\part Why is $y$ correlated with $x$?
-\part Compute the correlation coefficient between $x$ and $y$.
-\part What do you need to do in order to destroy the correlations between the $x$-$y$ pairs?
-\part Do exactly this 1000 times and compute each time the correlation coefficient.
-\part Compute the probability density of these correlation coefficients.
-\part Is the correlation of the original data set significant?
-\part What does significance of the correlation mean?
-\part Vary the sample size \code{n} and compute in the same way the
-significance of the correlation.
+\part Generate a scatter plot of the two variables.
+\part Why is $y$ correlated with $x$?
+\part Compute the correlation coefficient between $x$ and $y$.
+\part What do you need to do in order to destroy the correlations between the $x$-$y$ pairs?
+\part Do exactly this 1000 times and compute each time the correlation coefficient.
+\part Compute and plot the probability density of these correlation
+coefficients.
+\part Is the correlation of the original data set significant?
+\part What does significance of the correlation mean?
+\part Vary the sample size \code{n} and compute in the same way the
+significance of the correlation.
 \end{parts}
 \begin{solution}
 \lstinputlisting{correlationsignificance.m}
 \includegraphics[width=1\textwidth]{correlationsignificance}
 \end{solution}

+\question \qt{Bootstrap of the correlation coefficient}
+The permutation test generates the distribution of the correlation
+coefficient under the null hypothesis of uncorrelated data, and we
+check whether the correlation coefficient of the data differs
+significantly from this distribution. Alternatively, we can bootstrap
+the data while keeping the pairs intact and determine the confidence
+interval of the correlation coefficient of the data. If this interval
+does not include a correlation coefficient of zero, we can conclude
+that the data are indeed correlated.
+
+We use the same data set that we generated in exercise
+\ref{permutationtest} (\ref{permutationtestdata}).
+\begin{parts}
+\part Bootstrap the correlation coefficient of the data 1000 times.
+\part Compute and plot the probability density of these correlation
+coefficients.
+\part Is the correlation of the original data set significant?
+\end{parts}
+\begin{solution}
+\lstinputlisting{correlationbootstrap.m}
+\includegraphics[width=1\textwidth]{correlationbootstrap}
+\end{solution}
+
 \end{questions}
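For comparison with the bootstrap, the permutation test described in the first exercise can be sketched as follows. The referenced solution file correlationsignificance.m is not shown in this commit, so this is one possible implementation rather than the committed one. The variable names `rp`, `rpq`, and `yp`, and the use of `corrcoef` with a sort-based percentile, are assumptions made for illustration.

```matlab
% Sketch only, not the committed solution.
% Data as specified in the exercise sheet:
n = 1000;
a = 0.2;
x = randn(n, 1);
y = randn(n, 1) + a*x;

% correlation coefficient of the original pairs:
c = corrcoef(x, y);
rd = c(1, 2);

% permutation: shuffle y independently of x to destroy the pairing
nperm = 1000;
rp = zeros(nperm, 1);
for i = 1:nperm
    yp = y(randperm(n));
    cp = corrcoef(x, yp);
    rp(i) = cp(1, 2);
end

% 95% percentile of the null distribution (one-sided test):
rs = sort(rp);
rpq = rs(ceil(0.95*nperm));
fprintf('r = %.2f, 95%% percentile of null distribution = %.2f\n', rd, rpq);
if rd > rpq
    fprintf('--> correlation is significant at the 5%% level\n');
else
    fprintf('--> correlation is not significant at the 5%% level\n');
end
```

The two approaches ask complementary questions: the permutation test compares the measured coefficient against a distribution centered on zero, whereas the bootstrap checks whether a distribution centered on the measured coefficient stays clear of zero.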