\documentclass[12pt,a4paper,pdftex]{exam}

\usepackage[english]{babel}
\usepackage{pslatex}
\usepackage[mediumspace,mediumqspace,Gray]{SIunits}      % \ohm, \micro
\usepackage{xcolor}
\usepackage{graphicx}
\usepackage[breaklinks=true,bookmarks=true,bookmarksopen=true,pdfpagemode=UseNone,pdfstartview=FitH,colorlinks=true,citecolor=blue]{hyperref}

%%%%% layout %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\usepackage[left=20mm,right=20mm,top=25mm,bottom=25mm]{geometry}
\pagestyle{headandfoot}
\ifprintanswers
\newcommand{\stitle}{: Solutions}
\else
\newcommand{\stitle}{}
\fi
\header{{\bfseries\large Exercise 7\stitle}}{{\bfseries\large Statistics}}{{\bfseries\large November 13th, 2018}}
\firstpagefooter{Prof. Dr. Jan Benda}{Phone: 29 74573}{Email: jan.benda@uni-tuebingen.de}
\runningfooter{}{\thepage}{}

\setlength{\baselineskip}{15pt}
\setlength{\parindent}{0.0cm}
\setlength{\parskip}{0.3cm}
\renewcommand{\baselinestretch}{1.15}

%%%%% listings %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\usepackage{listings}
\lstset{
  language=Matlab,
  basicstyle=\ttfamily\footnotesize,
  numbers=left,
  numberstyle=\tiny,
  title=\lstname,
  showstringspaces=false,
  commentstyle=\itshape\color{darkgray},
  breaklines=true,
  breakautoindent=true,
  columns=flexible,
  frame=single,
  xleftmargin=1em,
  xrightmargin=1em,
  aboveskip=10pt
}

%%%%% math stuff: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\usepackage{amsmath}
\usepackage{amssymb}
\usepackage{bm}
\usepackage{dsfont}
\newcommand{\naZ}{\mathds{N}}
\newcommand{\gaZ}{\mathds{Z}}
\newcommand{\raZ}{\mathds{Q}}
\newcommand{\reZ}{\mathds{R}}
\newcommand{\reZp}{\mathds{R^+}}
\newcommand{\reZpN}{\mathds{R^+_0}}
\newcommand{\koZ}{\mathds{C}}

%%%%% page breaks %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\newcommand{\continue}{\ifprintanswers%
\else
\vfill\hspace*{\fill}$\rightarrow$\newpage%
\fi}
\newcommand{\continuepage}{\ifprintanswers%
\newpage
\else
\vfill\hspace*{\fill}$\rightarrow$\newpage%
\fi}
\newcommand{\newsolutionpage}{\ifprintanswers%
\newpage%
\else
\fi}

%%%%% new commands %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\newcommand{\qt}[1]{\textbf{#1}\\}
\newcommand{\pref}[1]{(\ref{#1})}
\newcommand{\extra}{--- bonus question ---\ \mbox{}}
\newcommand{\code}[1]{\texttt{#1}}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{document}

\input{instructions}

\ifprintanswers%
\else
\begin{itemize}
\item Convince yourself that each single line of your code really
  does what it should do! Test it with small examples directly in the
  command line.
\item Always try to break down your solution into small and meaningful
  functions. As soon as something similar is computed more than once,
  you should definitely put it into a function.
\item Initially test computationally expensive \code{for} loops,
  vectors, matrices, etc. with small numbers of repetitions and/or
  sizes. Once the code is working, use large numbers of repetitions
  and/or sizes to get good statistics.
\item Use the help functions of \code{matlab} (\code{help command} or
  \code{doc command}) and the internet to figure out how specific
  \code{matlab} functions are used and what features they offer. In
  addition, the internet offers a lot of material and suggestions for
  any question you have regarding your code!
\item Please upload your solution to the exercises to ILIAS as a
  zip-archive with the name ``statistics\_\{last name\}\_\{first name\}.zip''.
\end{itemize}
\fi

\begin{questions}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\question \textbf{Read chapter 4 of the script on ``programming style''!}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\question \qt{Probabilities of a die}
The computer can roll dice with more than 6 faces!
\begin{parts}
  \part Simulate rolling a die with eight faces 10000 times by
  generating integer random numbers $x_i = 1, 2, \ldots, 8$.
  \part Compute the probability $P(5)$ of getting a five by counting
  the number of fives occurring in the data set. Does the result
  match your expectation?
  Check the probabilities $P(x_i)$ of the other numbers. Is the die a
  fair die?
  \part Store the computed probabilities $P(x_i)$ in a vector and use
  the \code{bar()} function to plot the probabilities as a function
  of the corresponding face values.
  \part Compute a normalized histogram of the face values by means of
  the \code{hist()} and \code{bar()} functions.
  \part \extra Simulate a loaded die with the six showing up three
  times as often as each of the other numbers. Compute a normalized
  histogram of the face values from rolling the loaded die 10000
  times.
\end{parts}
\begin{solution}
  \lstinputlisting{rollthedie.m}
  \lstinputlisting{diehist.m}
  \lstinputlisting{die1.m}
  \includegraphics[width=1\textwidth]{die1}
\end{solution}

\continue

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\question \qt{Histogram of the normal distribution}
\vspace{-3ex}
\begin{parts}
  \part Generate a data set $X = (x_1, x_2, \ldots, x_n)$ of
  $n=10000$ normally distributed random numbers with mean $\mu=0$ and
  standard deviation $\sigma=1$ (\code{randn()} function).
  \part Compute from this data set the probability $P(0\le x<0.5)$.
  \part What happens to the probability of drawing a number from a
  specific range (e.g. $P(0\le x<a)$) if this range gets smaller and
  smaller, i.e. $a \to 0$? Write a script that illustrates this by
  plotting $P(0\le x<a)$ as a function of $a$ (use $0 \le a \le 4$).
  \part \label{manualpdf} Compute and plot the probability density of
  the data set (the normalized histogram). First, define the
  positions of the bins (width of 0.5) in a vector. Then, in a
  \code{for} loop, count for each bin the number of data values
  falling into it. Finally, normalize the resulting histogram and
  plot it using the \code{bar()} function.
  \part \label{gaussianpdf} Draw into the same plot the normal
  distribution
  \[ p_g(x) = \frac{1}{\sqrt{2\pi\sigma^2}}e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} \]
  for a comparison.
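
  Hint: the following is only a minimal sketch of how parts
  \pref{manualpdf} and \pref{gaussianpdf} could be approached. It is
  not the reference solution. All variable names and the bin range
  from $-5$ to $5$ are assumptions made for illustration.
\begin{lstlisting}
% Sketch only: names (edges, centers, counts, density) are illustrative.
n = 10000;
x = randn(n, 1);                        % data with mu=0 and sigma=1
binwidth = 0.5;
edges = -5:binwidth:5;                  % assumed range, wide enough for sigma=1
centers = edges(1:end-1) + binwidth/2;  % bin centers for plotting
counts = zeros(size(centers));
for k = 1:length(centers)
    % count the data values in the half-open interval [edges(k), edges(k+1))
    counts(k) = sum(x >= edges(k) & x < edges(k+1));
end
% normalize such that the areas of all bars sum up to one:
density = counts / (sum(counts) * binwidth);
bar(centers, density);
hold on;
% normal distribution for comparison:
mu = 0.0;
sigma = 1.0;
xx = -5:0.01:5;
pg = exp(-0.5*((xx - mu)/sigma).^2) / sqrt(2*pi*sigma^2);
plot(xx, pg, 'r', 'LineWidth', 2);
hold off;
xlabel('x');
ylabel('p(x)');
\end{lstlisting}
  Dividing by the bin width in addition to the number of counted data
  values is what turns the histogram into a density. Without it, the
  bar heights would change with the chosen bin width and could not be
  compared with $p_g(x)$.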
  \part Plot the probability density as in (\ref{manualpdf}) and
  (\ref{gaussianpdf}), but this time by means of the \code{hist()}
  and \code{bar()} functions.
\end{parts}
\begin{solution}
  \lstinputlisting{normhist.m}
  \includegraphics[width=1\textwidth]{normhist}
\end{solution}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\question \qt{Probabilities of a normal distribution}
Which fraction of a normally distributed data set is contained in
ranges that are symmetric around the mean?
\begin{parts}
  \part Generate a data set $X = (x_1, x_2, \ldots, x_n)$ of
  $n=10000$ normally distributed numbers with mean $\mu=0$ and
  standard deviation $\sigma=1$ (\code{randn()} function).
  % \part Estimate and plot the probability density of this data set (normalized histogram).
  % For a comparison plot the normal distribution
  % \[ p_g(x) = \frac{1}{\sqrt{2\pi\sigma^2}}e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} \]
  % into the same plot.
  \part \label{onesigma} How many data values are at most one
  standard deviation away from the mean?\\
  That is, how many data values $x_i$ satisfy $-\sigma < x_i < +\sigma$?\\
  Compute the probability $P_{\pm\sigma}$ to get a value in this range
  by counting how many data points fall into this range.
  \part \label{probintegral} Compute the probability of
  $-\sigma < x_i < +\sigma$ by numerically integrating over the
  probability density of the normal distribution
  \[ P_{\pm\sigma}=\int_{x=\mu-\sigma}^{x=\mu+\sigma} p_g(x) \, dx \; .\]
  First check whether
  \[ \int_{-\infty}^{+\infty} p_g(x) \, dx = 1 \; . \]
  Why is this the case?
  \part What fraction of the data is contained in the intervals
  $\pm 2\sigma$ and $\pm 3\sigma$? Compare the results with the
  corresponding integrals over the normal distribution.
  \part \label{givenfraction} Find out which intervals that are
  symmetric with respect to the mean contain 50\,\%, 90\,\%, 95\,\%
  and 99\,\% of the data by means of numerical integration of the
  normal distribution.
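
  Hint: a minimal sketch of one possible way to do the numerical
  integration in parts \pref{probintegral} and \pref{givenfraction},
  using a simple Riemann sum. The step size \code{dx}, the integration
  range of $\pm 10\sigma$ as a stand-in for $\pm\infty$, and all
  variable names are assumptions chosen for illustration. Other
  integration schemes work just as well.
\begin{lstlisting}
% Sketch only: dx, the +/- 10 sigma range, and all names are illustrative.
mu = 0.0;
sigma = 1.0;
dx = 0.001;
x = mu-10*sigma:dx:mu+10*sigma;         % stands in for (-inf, +inf)
pg = exp(-0.5*((x - mu)/sigma).^2) / sqrt(2*pi*sigma^2);
total = sum(pg)*dx;                     % Riemann sum, should be close to one
fprintf('integral over everything: %.4f\n', total);
% probability within one standard deviation around the mean:
inside = (x >= mu - sigma) & (x < mu + sigma);
p1 = sum(pg(inside))*dx;
fprintf('P within +/- 1 sigma: %.4f\n', p1);
% integrate outwards from the mean and double the result,
% using the symmetry of the normal distribution:
right = x >= mu;
a = x(right) - mu;                      % half-widths of the intervals
psym = 2*cumsum(pg(right))*dx;          % P(|x - mu| < a) for each a
for frac = [0.5 0.9 0.95 0.99]
    k = find(psym >= frac, 1);          % first half-width reaching frac
    fprintf('%4.0f%% of the data within +/- %.2f\n', 100*frac, a(k));
end
\end{lstlisting}
  The cumulative sum avoids recomputing the integral for every
  candidate interval. The accuracy of the resulting interval limits is
  bounded by the step size \code{dx}.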
  % \part \extra Modify the code of questions \pref{onesigma} -- \pref{givenfraction} such
  % that it works for data sets with arbitrary mean and arbitrary standard deviation.\\
  % Check your code with different sets of random numbers.\\
  % How do you generate random numbers of a given mean and standard
  % deviation using the \code{randn()} function?
\end{parts}
\begin{solution}
  \lstinputlisting{normprobs.m}
\end{solution}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\question \qt{Central limit theorem}
According to the central limit theorem, the distribution of the sum of
independent and identically distributed (i.i.d.) random variables
converges towards a normal distribution, even if the random variables
themselves are not normally distributed.

With the following questions we want to illustrate the central limit
theorem.
\begin{parts}
  \part Before you continue reading, try to figure out yourself what
  the central limit theorem means and what you would need to do to
  illustrate this theorem.
  \part Draw 10000 random numbers that are uniformly distributed
  between 0 and 1 (\code{rand} function).
  \part Plot their probability density (normalized histogram).
  \part Draw another set of 10000 uniformly distributed random numbers
  and add them to the first set of numbers.
  \part Plot the probability density of the summed up random numbers.
  \part Repeat steps (d) and (e) many times.
  \part Compare in a plot the probability density of the summed up
  numbers with the normal distribution
  \[ p_g(x) = \frac{1}{\sqrt{2\pi\sigma^2}}e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}\]
  with mean $\mu$ and standard deviation $\sigma$ of the summed up
  random numbers.
  \part How do the mean and the standard deviation change with the
  number of summed up data sets?
  \part \extra Check the central limit theorem in the same way using
  exponentially distributed random numbers (\code{rande} function).
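
  Hint: a minimal sketch of how the steps for the uniformly
  distributed numbers could be put together in a single loop. It is
  one possible layout and not the reference solution. The number of
  summed sets \code{nsets}, the 50 bins, and all variable names are
  assumptions made for illustration.
\begin{lstlisting}
% Sketch only: nsets, the number of bins, and all names are illustrative.
n = 10000;
nsets = 6;                              % how many sets are summed up
s = zeros(n, 1);
for k = 1:nsets
    s = s + rand(n, 1);                 % add another set of uniform numbers
    mu = mean(s);                       % mean of the summed numbers
    sigma = std(s);                     % standard deviation of the summed numbers
    [counts, centers] = hist(s, 50);    % histogram with 50 bins
    binwidth = centers(2) - centers(1);
    density = counts / (sum(counts)*binwidth);  % normalize to unit area
    subplot(2, 3, k);
    bar(centers, density);
    hold on;
    % normal distribution with the same mean and standard deviation:
    xx = linspace(min(s), max(s), 200);
    pg = exp(-0.5*((xx - mu)/sigma).^2) / sqrt(2*pi*sigma^2);
    plot(xx, pg, 'r', 'LineWidth', 2);
    hold off;
    title(sprintf('k=%d, mean=%.2f, std=%.2f', k, mu, sigma));
end
\end{lstlisting}
  Printing the mean and the standard deviation into the title of each
  panel makes it easy to see how both change with the number of summed
  sets. For the bonus part, only the call that draws the random
  numbers needs to be exchanged.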
\end{parts}
\begin{solution}
  \lstinputlisting{centrallimit.m}
  \includegraphics[width=0.5\textwidth]{centrallimit-hist01}
  \includegraphics[width=0.5\textwidth]{centrallimit-hist02}
  \includegraphics[width=0.5\textwidth]{centrallimit-hist03}
  \includegraphics[width=0.5\textwidth]{centrallimit-hist05}
  \includegraphics[width=0.5\textwidth]{centrallimit-samples}
\end{solution}

\end{questions}

\end{document}