\documentclass[addpoints,10pt]{exam}
\usepackage{url}
\usepackage{color}
\usepackage{hyperref}

\pagestyle{headandfoot}
\runningheadrule
\firstpageheadrule
\firstpageheader{Scientific Computing}{afternoon assignment day 02}{10/22/2014}
%\runningheader{Homework 01}{Page \thepage\ of \numpages}{23. October 2014}
\firstpagefooter{}{}{}
\runningfooter{}{}{}
\pointsinmargin
\bracketedpoints

%\printanswers
\shadedsolutions

\begin{document}
%%%%%%%%%%%%%%%%%%%%% Submission instructions %%%%%%%%%%%%%%%%%%%%%%%%%
\sffamily
%%%%%%%%%%%%%% Questions %%%%%%%%%%%%%%%%%%%%%%%%%

\begin{questions}
  \question When the p-value is small, we reject the null
  hypothesis. For example, if you want to test whether two means are
  not equal, the null hypothesis is ``the means are equal''. If, e.g.,
  $p\le 0.05$, we take this as sufficient evidence that the null
  hypothesis is not true. We therefore conclude that the means are not
  equal (which is what you want to show).

  In this exercise we will look at what kind of p-values we can expect
  when the null hypothesis is true. In our example, this is the case
  if the true means of the two datasets are actually equal.
  \begin{parts}
    \part Think about how you expect the p-values to behave in that
    situation.
    \part Simulate the situation in which the means are equal by
    repeating the following at least $1000$ times:
    \begin{enumerate}
    \item Generate two arrays {\tt x} and {\tt y}, each containing
      $10$ normally distributed (Gaussian) random numbers, using
      {\tt randn}. By construction, the true means behind the random
      numbers are zero.
    \item Perform a two-sample t-test ({\tt ttest2}) on {\tt x} and
      {\tt y}. Store the p-value.
    \end{enumerate}
    \part Plot a histogram of the $1000$ p-values. What do you think
    is the distribution of the p-values (i.e. if you repeated this
    experiment many more times, what would the histogram look like)?
    \part Given what you find, think about whether the following
    strategy is statistically valid: You collect $10$ data points from
    each group and perform a test.
    If the test is not significant, you collect $10$ more data points
    and repeat the test. As soon as the test indicates a significant
    difference, you stop; otherwise, you repeat the procedure until
    the test is significant.
  \end{parts}

\end{questions}

\end{document}