
\documentclass[a4paper,12pt,pdftex]{exam}

\newcommand{\ptitle}{q-values}
\input{../header.tex}
\firstpagefooter{Supervisor: Jan Benda}{phone: 29 74573}%
{email: jan.benda@uni-tuebingen.de}

\begin{document}

\input{../instructions.tex}

%%%%%%%%%%%%%% Questions %%%%%%%%%%%%%%%%%%%%%%%%%

\begin{questions}
  \question The p-value corresponds to the probability
  $$P(\mbox{result seems significant} \mid H_0 \mbox{ is true}).$$
  This means that if your significance threshold is $\alpha=0.05$ and
  you accept all tests with $p \le \alpha$ as significant, then in
  $5\%$ of all cases in which $H_0$ is true (there is no effect) your
  test will appear significant (false positive).

  The problem is that you do not know for how many of the tests $H_0$
  is actually true. What you would really like to know is: of all
  those tests that came out significant ($p\le\alpha$), what fraction
  are false positives? This probability corresponds to
  $$P(H_0 \mbox{ is true} \mid \mbox{result seems significant})$$ and is
  called the {\em false discovery rate}. In general you cannot compute
  it. However, if you have many p-values, you can estimate it. The
  counterpart of the p-value for the false discovery rate is called
  the ``q-value''.
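
  As a sketch of why many p-values make this estimate possible: let
  $\pi_0 = P(H_0 \mbox{ is true})$ denote the unknown proportion of
  tests for which $H_0$ holds (the symbol $\pi_0$ is introduced here
  for illustration). Assuming that p-values are uniformly distributed
  under $H_0$, so that $P(p\le\alpha \mid H_0 \mbox{ is true})=\alpha$,
  Bayes' rule gives
  $$P(H_0 \mbox{ is true} \mid p\le\alpha) = \frac{P(p\le\alpha \mid H_0
    \mbox{ is true})\, P(H_0 \mbox{ is true})}{P(p\le\alpha)} =
  \frac{\alpha\,\pi_0}{P(p\le\alpha)}.$$
  With $m$ p-values, the denominator can be approximated by the
  fraction of the $m$ p-values that fall at or below $\alpha$, so only
  $\pi_0$ remains to be estimated.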

  In the paper

  {\em Storey, J. D., \& Tibshirani, R. (2003). Statistical
    significance for genomewide studies. Proceedings of the National
    Academy of Sciences of the United States of America, 100,
    9440--9445. doi:10.1073/pnas.1530509100}

  you can find an algorithm for computing q-values from p-values.

  The attached data file {\tt p\_values.dat} contains p-values from
  tests of whether each of several neurons responds to a certain
  stimulus condition.

  \begin{parts}
    \part Plot a histogram of the p-values.
    \part Read and understand the paper by Storey and
    Tibshirani. Visualize their method in your histogram.
    \part Implement their method and convert each p-value to a
    q-value.
    \part Using your implementation, estimate the proportion of neurons
    that show a true effect (i.e.\ $P(H_A)$).
  \end{parts}

\end{questions}


\end{document}