% scientificComputing/projects/project_q-values/qvalues.tex, 2014-11-02 13:35:52 +01:00

\documentclass[addpoints,10pt]{exam}
\usepackage{url}
\usepackage{color}
\usepackage{hyperref}
\pagestyle{headandfoot}
\runningheadrule
\firstpageheadrule
\firstpageheader{Scientific Computing}{Project Assignment}{11/05/2014
-- 11/06/2014}
%\runningheader{Homework 01}{Page \thepage\ of \numpages}{23. October 2014}
\firstpagefooter{}{}{}
\runningfooter{}{}{}
\pointsinmargin
\bracketedpoints
%\printanswers
%\shadedsolutions
\begin{document}
%%%%%%%%%%%%%%%%%%%%% Submission instructions %%%%%%%%%%%%%%%%%%%%%%%%%
\sffamily
% \begin{flushright}
% \gradetable[h][questions]
% \end{flushright}
\begin{center}
\input{../disclaimer.tex}
\end{center}
%%%%%%%%%%%%%% Questions %%%%%%%%%%%%%%%%%%%%%%%%%
\begin{questions}
\question The p-value is the probability, under the null hypothesis
$H_0$, of obtaining a result at least as extreme as the observed
one. Thresholding it controls the probability
$$P(\mbox{result seems significant} \mid H_0 \mbox{ is true}).$$
This means that if your significance threshold is $\alpha=0.05$ and
you accept all tests with $p \le \alpha$ as significant, then in $5\%$
of all cases in which $H_0$ is true (there is no effect) your test
will appear significant (false positive).
The problem is that you do not know for how many of the
tests $H_0$ is actually true. What you would really like to know is:
of all the tests that came out significant ($p\le\alpha$), what
fraction are false positives? This probability corresponds to
$$P(H_0 \mbox{ is true} \mid \mbox{result seems significant})$$ and is
called the {\em false discovery rate}. In general you cannot compute
it. However, if you have many p-values, you can
estimate it. The analogue of the p-value for the false discovery
rate is called the ``q-value''.
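To see how the two conditional probabilities are related, Bayes' rule
gives
$$P(H_0 \mbox{ is true} \mid \mbox{result seems significant}) =
\frac{\alpha\,\pi_0}{\alpha\,\pi_0 + (1-\beta)\,(1-\pi_0)},$$
where $\pi_0 = P(H_0 \mbox{ is true})$ and $1-\beta$ is the power of
the test (both symbols are introduced here for illustration only). As
a purely hypothetical example, $\alpha = 0.05$, $\pi_0 = 0.9$ and
$1-\beta = 0.8$ give $0.045/(0.045+0.08) = 0.36$, so more than a
third of the significant results would be false positives.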
In the paper
{\em Storey, J. D., \& Tibshirani, R. (2003). Statistical
significance for genomewide studies. Proceedings of the National
Academy of Sciences of the United States of America, 100,
9440--9445. doi:10.1073/pnas.1530509100}
you can find an algorithm for computing q-values from p-values.
The attached data file {\tt p\_values.dat} contains p-values from
tests of whether each of several neurons responds to a certain
stimulus condition.
\begin{parts}
\part Plot a histogram of the p-values.
\part Read and understand the paper by Storey and
Tibshirani. Visualize their method in your histogram.
\part Implement their method and convert each p-value to a
q-value.
\part Using your script, estimate the proportion of neurons
that show a true effect (i.e.\ $P(H_A)$).
\end{parts}
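\begin{solution}
A minimal sketch of the quantities involved, not a full solution. The
symbols $m$, $\lambda$, $\hat\pi_0$ and $\hat q$ are notation assumed
here, and a single fixed $\lambda$ is used for simplicity where the
paper smooths the estimate over a range of $\lambda$ values; check the
details against the paper. Under $H_0$ the p-values are uniformly
distributed, so the flat right-hand part of the histogram estimates
the proportion $\pi_0$ of true null hypotheses:
$$\hat\pi_0(\lambda) = \frac{\#\{p_i > \lambda\}}{m\,(1-\lambda)},$$
where $m$ is the number of p-values. With the p-values sorted as
$p_{(1)} \le \dots \le p_{(m)}$, the q-values follow from
$$\hat q(p_{(m)}) = \hat\pi_0\, p_{(m)}, \qquad
\hat q(p_{(i)}) = \min\left(\frac{\hat\pi_0\, m\, p_{(i)}}{i},\;
\hat q(p_{(i+1)})\right) \quad \mbox{for } i = m-1, \dots, 1,$$
and the proportion of neurons with a true effect is estimated as
$\hat P(H_A) = 1 - \hat\pi_0$.
\end{solution}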
\end{questions}
\end{document}