q-values project

2014-10-31 14:07:52 +01:00 · 2014-10-31 14:07:52 +01:00 · 61cb4445c7
commit 61cb4445c7
parent 2171efaee6
3 changed files with 1443 additions and 0 deletions
--- a/projects/project_q-values/Makefile
+++ b/projects/project_q-values/Makefile
@@ -0,0 +1,10 @@
+latex:
+	pdflatex *.tex > /dev/null
+	pdflatex *.tex > /dev/null
+
+clean:
+	rm -rf *.log *.aux *.zip *.out auto
+	rm -f `basename *.tex .tex`.pdf
+
+zip: latex
+	zip `basename *.tex .tex`.zip *.pdf *.dat *.mat
--- a/projects/project_q-values/p_values.dat
+++ b/projects/project_q-values/p_values.dat
--- a/projects/project_q-values/qvalues.tex
+++ b/projects/project_q-values/qvalues.tex
@@ -0,0 +1,81 @@
+\documentclass[addpoints,10pt]{exam}
+\usepackage{url}
+\usepackage{color}
+\usepackage{hyperref}
+
+\pagestyle{headandfoot}
+\runningheadrule
+\firstpageheadrule
+\firstpageheader{Scientific Computing}{Project Assignment}{11/05/2014
+  -- 11/06/2014}
+%\runningheader{Homework 01}{Page \thepage\ of \numpages}{23. October 2014}
+\firstpagefooter{}{}{}
+\runningfooter{}{}{}
+\pointsinmargin
+\bracketedpoints
+
+%\printanswers
+%\shadedsolutions
+
+
+\begin{document}
+%%%%%%%%%%%%%%%%%%%%% Submission instructions %%%%%%%%%%%%%%%%%%%%%%%%%
+\sffamily
+% \begin{flushright}
+% \gradetable[h][questions]
+% \end{flushright}
+
+\begin{center}
+  \input{../disclaimer.tex}
+\end{center}
+
+%%%%%%%%%%%%%% Questions %%%%%%%%%%%%%%%%%%%%%%%%%
+
+\begin{questions}
+  \question The p-value corresponds to the probability
+  $$P(\mbox{result seems significant} \mid H_0 \mbox{ is true}).$$
+  This means that if your significance threshold is $\alpha=0.05$ and
+  you accept all tests with $p \le \alpha$ as significant, then in
+  $5\%$ of all cases in which $H_0$ is true (there is no effect) your
+  test will appear significant (false positive).
+
+  The problem is that you do not know for how many of the tests $H_0$
+  is actually true. What you would really like to know is: of all
+  the tests that came out significant ($p\le\alpha$), what fraction
+  are false positives? This probability corresponds to
+  $$P(H_0 \mbox{ is true} \mid \mbox{result seems significant})$$ and is
+  called the {\em false discovery rate}. In general you cannot compute
+  it. However, if you have many p-values, then you can actually
+  estimate it. The analogue of the p-value for the false discovery
+  rate is called the ``q-value''.
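+
+  As a rough sketch of why many p-values make this estimate possible
+  (the symbols $\pi_0$, $m$ and $\lambda$ are introduced here for
+  illustration, and the sketch assumes that p-values are uniformly
+  distributed on $[0,1]$ when $H_0$ is true), Bayes' rule gives
+  $$P(H_0 \mbox{ is true} \mid p\le\alpha) =
+  \frac{P(p\le\alpha \mid H_0 \mbox{ is true})\,\pi_0}{P(p\le\alpha)} =
+  \frac{\alpha\,\pi_0}{P(p\le\alpha)},$$
+  where $\pi_0 = P(H_0 \mbox{ is true})$. With $m$ p-values
+  $p_1,\ldots,p_m$, the denominator can be estimated by the fraction
+  of significant tests, $\#\{p_i\le\alpha\}/m$. If p-values above
+  some cutoff $\lambda$ stem almost entirely from tests in which $H_0$
+  is true, the flat right-hand part of the histogram suggests the
+  estimate
+  $$\hat\pi_0(\lambda) = \frac{\#\{p_i > \lambda\}}{m\,(1-\lambda)}.$$
+  The choice of $\lambda$ is a tuning decision and is not fixed by
+  this sketch.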
+
+  In the paper 
+
+  {\em Storey, J. D., \& Tibshirani, R. (2003). Statistical
+    significance for genomewide studies. Proceedings of the National
+    Academy of Sciences of the United States of America, 100,
+    9440--9445. doi:10.1073/pnas.1530509100}
+
+  you can find an algorithm for computing q-values from p-values.
+
+  The attached data file {\tt p\_values.dat} contains p-values from
+  tests of several neurons for whether they respond to a certain
+  stimulus condition.
+
+  \begin{parts}
+    \part Plot a histogram of the p-values.
+    \part Read and understand the paper by Storey and
+    Tibshirani. Visualize their method in your histogram.
+    \part Implement their method and convert each p-value to a
+    q-value.
+    \part Using your implementation, estimate the proportion of neurons
+    that show a true effect (i.e.\ $P(H_A)$).
+  \end{parts}
+  
+\end{questions}
+
+
+
+
+
+\end{document}