\documentclass[12pt,a4paper,pdftex]{exam}

\usepackage[english]{babel}
\usepackage{pslatex}
\usepackage[mediumspace,mediumqspace,Gray]{SIunits} % \ohm, \micro
\usepackage{xcolor}
\usepackage{graphicx}
\usepackage[breaklinks=true,bookmarks=true,bookmarksopen=true,pdfpagemode=UseNone,pdfstartview=FitH,colorlinks=true,citecolor=blue]{hyperref}

%%%%% layout %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\usepackage[left=20mm,right=20mm,top=25mm,bottom=25mm]{geometry}
\pagestyle{headandfoot}
\ifprintanswers
\newcommand{\stitle}{: Solutions}
\else
\newcommand{\stitle}{}
\fi
\header{{\bfseries\large Exercise 8\stitle}}{{\bfseries\large Statistics}}{{\bfseries\large December 2nd, 2019}}
\firstpagefooter{Prof. Dr. Jan Benda}{Phone: 29 74573}{Email:
  jan.benda@uni-tuebingen.de}
\runningfooter{}{\thepage}{}

\setlength{\baselineskip}{15pt}
\setlength{\parindent}{0.0cm}
\setlength{\parskip}{0.3cm}
\renewcommand{\baselinestretch}{1.15}

%%%%% listings %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\usepackage{listings}
\lstset{
  language=Matlab,
  basicstyle=\ttfamily\footnotesize,
  numbers=left,
  numberstyle=\tiny,
  title=\lstname,
  showstringspaces=false,
  commentstyle=\itshape\color{darkgray},
  breaklines=true,
  breakautoindent=true,
  columns=flexible,
  frame=single,
  xleftmargin=1em,
  xrightmargin=1em,
  aboveskip=10pt
}

%%%%% math stuff: %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\usepackage{amsmath}
\usepackage{amssymb}
\usepackage{bm}
\usepackage{dsfont}
\newcommand{\naZ}{\mathds{N}}
\newcommand{\gaZ}{\mathds{Z}}
\newcommand{\raZ}{\mathds{Q}}
\newcommand{\reZ}{\mathds{R}}
\newcommand{\reZp}{\mathds{R^+}}
\newcommand{\reZpN}{\mathds{R^+_0}}
\newcommand{\koZ}{\mathds{C}}

%%%%% page breaks %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\newcommand{\continue}{\ifprintanswers%
\else
\vfill\hspace*{\fill}$\rightarrow$\newpage%
\fi}
\newcommand{\continuepage}{\ifprintanswers%
\newpage
\else
\vfill\hspace*{\fill}$\rightarrow$\newpage%
\fi}
\newcommand{\newsolutionpage}{\ifprintanswers%
\newpage%
\else
\fi}

%%%%% new commands %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\newcommand{\qt}[1]{\textbf{#1}\\}
\newcommand{\pref}[1]{(\ref{#1})}
\newcommand{\extra}{--- bonus question ---\ \mbox{}}
\newcommand{\code}[1]{\texttt{#1}}


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{document}

\input{instructions}

\ifprintanswers%
\else

\begin{itemize}
  \item Convince yourself that every single line of your code really does
    what it should do! Test it with small examples directly on the
    command line.
  \item Always try to break down your solution into small and meaningful
    functions. As soon as something similar is computed more than once, you
    should definitely put it into a function.
  \item Initially test computationally expensive \code{for} loops, vectors,
    matrices, etc. with small numbers of repetitions and/or
    sizes. Once the code works, use large numbers of repetitions and/or
    sizes to get good statistics.
  \item Use the help functions of \code{matlab} (\code{help command} or
    \code{doc command}) and the internet to figure out how specific
    \code{matlab} functions are used and what features they offer. In
    addition, the internet offers a lot of material and suggestions for
    any question you have regarding your code!
  \item Please upload your solution to the exercises to ILIAS as a zip-archive with the name
    ``statistics\_\{last name\}\_\{first name\}.zip''.
\end{itemize}

\fi


\begin{questions}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\question \textbf{Read chapter 4 of the script on ``code style''!}
|
|
|
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
\question \qt{Probabilities of a die}
The computer can roll dice with more than 6 faces!
\begin{parts}
  \part Simulate rolling an eight-sided die 10000 times by
  generating integer random numbers $x_i \in \{1, 2, \ldots, 8\}$.

  \part Compute the probability $P(5)$ of getting a five by counting the number of fives
  occurring in the data set.

  Does the result match your expectation?

  Check the probabilities $P(x_i)$ of the other numbers.

  Is the die a fair die?

  \part Store the computed probabilities $P(x_i)$ in a vector and use
  the \code{bar()} function for plotting the probabilities as a
  function of the corresponding face values.

  \part Compute a normalized histogram of the face values by means of
  the \code{hist()} and \code{bar()} functions.

  \part \extra Simulate a loaded die with the six showing up
  three times as often as each of the other numbers.

  Compute a normalized histogram of the face values from rolling the loaded die 10000 times.
\end{parts}
\begin{solution}
  \lstinputlisting{rollthedie.m}
  \lstinputlisting{diehist.m}
  \lstinputlisting{die1.m}
\includegraphics[width=1\textwidth]{die1}
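
As a complement to the listings above, here is a minimal sketch of parts (a) to (c) and of the bonus part. It is not the content of the listed files; it assumes that \code{randi()} is available, and all variable names are chosen for illustration only. For a fair eight-sided die each estimated probability should come out close to $1/8 = 0.125$.
\begin{lstlisting}
n = 10000;                          % number of rolls
faces = 1:8;                        % face values of the eight-sided die
x = randi(8, n, 1);                 % uniformly distributed integers from 1 to 8

% estimate P(x_i) by counting how often each face occurs:
P = zeros(size(faces));
for k = 1:length(faces)
    P(k) = sum(x == faces(k))/n;    % relative frequency of face k
end
bar(faces, P);
xlabel('face value');
ylabel('probability');

% loaded die (assumption: draw from a list that contains the six three times):
loadedfaces = [1 2 3 4 5 6 6 6 7 8];
y = loadedfaces(randi(length(loadedfaces), n, 1));
[counts, centers] = hist(y, faces); % counts per face value
figure();
bar(centers, counts/sum(counts));   % normalized histogram
\end{lstlisting}
With this assumed list, the six is drawn with probability $3/10$ and every other face with probability $1/10$.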
\end{solution}


\continue


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\question \qt{Histogram of the normal distribution}
\vspace{-3ex}
\begin{parts}
  \part Generate a data set $X = (x_1, x_2, \ldots, x_n)$ of
  $n=10000$ normally distributed random numbers with mean $\mu=0$ and
  standard deviation $\sigma=1$ (\code{randn()} function).

  \part Compute from this data set the probability $P(0\le x<0.5)$.

  \part What happens to the probability of drawing a number from a
  specific range (e.g. $P(0\le x<a)$), if this range gets smaller and
  smaller, i.e. $a \to 0$?

  Write a script that illustrates this by plotting $P(0\le x<a)$
  as a function of $a$ (use $0 \le a \le 4$).

  \part \label{manualpdf} Compute and plot the probability density of
  the data set (the normalized histogram). First, define the positions
  of the bins (width of 0.5) in a vector. Then, in a \code{for} loop, count
  for each bin the number of data values falling into the
  bin. Finally, normalize the resulting histogram such that its
  integral equals one and plot it using the \code{bar()} function.

  \part \label{gaussianpdf} Draw into the same plot the normal
  distribution
  \[ p_g(x) = \frac{1}{\sqrt{2\pi\sigma^2}}e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} \]
  for a comparison.

  \part Plot the probability density as in (\ref{manualpdf}) and
  (\ref{gaussianpdf}), but this time by means of the \code{hist()} and
  \code{bar()} functions.
\end{parts}
\begin{solution}
  \lstinputlisting{normhist.m}
\includegraphics[width=1\textwidth]{normhist}
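
The manual binning of part (\ref{manualpdf}) and the comparison of part (\ref{gaussianpdf}) can be sketched as follows. This is an illustrative sketch, not the listed file: the bin range of $-4$ to $4$ and all variable names are assumptions. The estimate of $P(0\le x<0.5)$ should come out at roughly $0.19$.
\begin{lstlisting}
n = 10000;
x = randn(n, 1);                        % standard normal data

P05 = sum(x >= 0 & x < 0.5)/n;          % estimate of P(0 <= x < 0.5)

binwidth = 0.5;
edges = -4:binwidth:4;                  % bin edges (assumed range)
centers = edges(1:end-1) + binwidth/2;  % bin centers
counts = zeros(size(centers));
for k = 1:length(centers)
    % count the data values falling into bin k:
    counts(k) = sum(x >= edges(k) & x < edges(k+1));
end
p = counts/(sum(counts)*binwidth);      % normalize: bar areas sum up to one
bar(centers, p);
hold on;
xx = -4:0.01:4;
pg = exp(-0.5*xx.^2)/sqrt(2*pi);        % normal density for mu=0, sigma=1
plot(xx, pg, 'r', 'LineWidth', 2);
hold off;
xlabel('x');
ylabel('probability density');
\end{lstlisting}
Dividing by the bin width in addition to the total count is what turns the counts into a density; without it the bar heights would depend on the chosen bin width.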
|
|
\end{solution}
|
|
|
|
|
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
\question \qt{Probabilities of a normal distribution}
Which fraction of a normally distributed data set is contained in ranges
that are symmetric around the mean?
\begin{parts}
  \part Generate a data set $X = (x_1, x_2, \ldots, x_n)$ of
  $n=10000$ normally distributed numbers with mean $\mu=0$ and
  standard deviation $\sigma=1$ (\code{randn()} function).
  % \part Estimate and plot the probability density of this data set (normalized histogram).
  % For a comparison plot the normal distribution
  % \[ p_g(x) = \frac{1}{\sqrt{2\pi\sigma^2}}e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} \]
  % into the same plot.

  \part \label{onesigma} How many data values are at most one standard deviation
  away from the mean?\\
  That is, how many data values $x_i$ satisfy $-\sigma < x_i < +\sigma$?\\
  Compute the probability $P_{\pm\sigma}$ to get a value in this range
  by counting how many data points fall into this range.

  \part \label{probintegral} Compute the probability of
  $-\sigma < x_i < +\sigma$ by numerically integrating over the
  probability density of the normal distribution
  \[ P_{\pm\sigma}=\int_{x=\mu-\sigma}^{x=\mu+\sigma} p_g(x) \, dx \; .\]
  First check whether
  \[ \int_{-\infty}^{+\infty} p_g(x) \, dx = 1 \; . \]
  Why is this the case?

  \part What fraction of the data is contained in the intervals $\pm 2\sigma$
  and $\pm 3\sigma$?

  Compare the results with the corresponding integrals over the normal
  distribution.

  \part \label{givenfraction} Find out which intervals that are
  symmetric with respect to the mean contain 50\,\%, 90\,\%, 95\,\% and 99\,\%
  of the data by means of numerical integration of the normal
  distribution.

  % \part \extra Modify the code of questions \pref{onesigma} -- \pref{givenfraction} such
  % that it works for data sets with arbitrary mean and arbitrary standard deviation.\\
  % Check your code with different sets of random numbers.\\
  % How do you generate random numbers of a given mean and standard
  % deviation using the \code{randn()} function?
\end{parts}
\begin{solution}
\lstinputlisting{normprobs.m}
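
The counting of part (\ref{onesigma}), the integration of part (\ref{probintegral}) and the interval search of part (\ref{givenfraction}) can be sketched as follows. This is an illustrative sketch, not the listed file: it uses the trapezoidal rule via \code{trapz()} and \code{cumtrapz()} as one possible integration method, and the step sizes and variable names are assumptions. Both estimates of $P_{\pm\sigma}$ should be close to $0.68$; for $\pm 2\sigma$ and $\pm 3\sigma$ the integrals are about $0.954$ and $0.997$.
\begin{lstlisting}
n = 10000;
x = randn(n, 1);                        % mu=0, sigma=1

% fraction of the data within one standard deviation of the mean:
Pdata = sum(x > -1 & x < 1)/n;

% the same probability from numerical integration of the normal density:
dx = 0.01;
xx = -1:dx:1;
pg = exp(-0.5*xx.^2)/sqrt(2*pi);
Pint = trapz(xx, pg);
disp([Pdata, Pint]);

% half-widths a of symmetric intervals containing a given fraction:
a = 0:0.001:5;                          % candidate half-widths
pga = exp(-0.5*a.^2)/sqrt(2*pi);
Pa = 2*cumtrapz(a, pga);                % P(-a < x < a), using symmetry
fractions = [0.5 0.9 0.95 0.99];
halfwidths = zeros(size(fractions));
for k = 1:length(fractions)
    halfwidths(k) = a(find(Pa >= fractions(k), 1));
end
disp([fractions; halfwidths]);
\end{lstlisting}
The half-widths found this way should be close to $0.67\sigma$, $1.64\sigma$, $1.96\sigma$ and $2.58\sigma$.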
|
|
\end{solution}
|
|
|
|
|
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
|
\question \qt{Central limit theorem}
According to the central limit theorem, the distribution of the sum of independent and
identically distributed (i.i.d.) random variables converges towards a
normal distribution, even if the individual random
variables are not normally distributed.

With the following questions we want to illustrate the central limit theorem.
\begin{parts}
  \part Before you continue reading, try to figure out yourself what
  the central limit theorem means and what you would need to do for
  illustrating this theorem.

  \part Draw 10000 random numbers that are uniformly distributed between 0 and 1
  (\code{rand} function).

  \part Plot their probability density (normalized histogram).

  \part Draw another set of 10000 uniformly distributed random numbers
  and add them element by element to the first set of numbers.

  \part Plot the probability density of the summed up random numbers.

  \part Repeat steps (d) and (e) many times.

  \part Compare in a plot the probability density of the summed up
  numbers with the normal distribution
  \[ p_g(x) =
  \frac{1}{\sqrt{2\pi\sigma^2}}e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}\]
  with mean $\mu$ and standard deviation $\sigma$ of the summed up random numbers.

  \part How do the mean and the standard deviation change with the
  number of summed up data sets?

  \part \extra Check the central limit theorem in the same way using
  exponentially distributed random numbers (\code{rande} function).
\end{parts}
\begin{solution}
  \lstinputlisting{centrallimit.m}
  \includegraphics[width=0.5\textwidth]{centrallimit-hist01}
  \includegraphics[width=0.5\textwidth]{centrallimit-hist02}
  \includegraphics[width=0.5\textwidth]{centrallimit-hist03}
  \includegraphics[width=0.5\textwidth]{centrallimit-hist05}
\includegraphics[width=0.5\textwidth]{centrallimit-samples}
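
The repeated summation of parts (b) to (g) can be sketched as follows. This is an illustrative sketch, not the listed file: the number of summed data sets, the number of bins and all variable names are assumptions.
\begin{lstlisting}
n = 10000;                              % size of each data set
m = 6;                                  % number of data sets to sum up (assumed)
x = zeros(n, 1);
for k = 1:m
    x = x + rand(n, 1);                 % add another uniform data set
    mu = mean(x);
    sigma = std(x);
    [counts, centers] = hist(x, 30);    % histogram with 30 bins
    binwidth = centers(2) - centers(1);
    p = counts/(sum(counts)*binwidth);  % normalize to a probability density
    subplot(2, 3, k);
    bar(centers, p);
    hold on;
    % normal density with mean and standard deviation of the summed data:
    xx = linspace(min(x), max(x), 200);
    pg = exp(-0.5*((xx - mu)/sigma).^2)/sqrt(2*pi*sigma^2);
    plot(xx, pg, 'r', 'LineWidth', 2);
    hold off;
    title(sprintf('sum of %d data sets', k));
end
\end{lstlisting}
For the sum of $k$ independent data sets that are uniformly distributed between 0 and 1, the mean is expected to grow as $\mu = k/2$ and the standard deviation as $\sigma = \sqrt{k/12}$.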
\end{solution}


\end{questions}

\end{document}