\documentclass[12pt,a4paper,pdftex]{exam}

\newcommand{\exercisetopic}{Maximum Likelihood}
\newcommand{\exercisenum}{10}
\newcommand{\exercisedate}{January 12th, 2021}

\input{../../exercisesheader}

\firstpagefooter{Prof. Dr. Jan Benda}{}{jan.benda@uni-tuebingen.de}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{document}

\input{../../exercisestitle}

\begin{questions}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\question \qt{Read chapter 9 on ``Maximum likelihood estimation''!}\vspace{-3ex}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\question \qt{Maximum likelihood of the standard deviation}
Let's compute the likelihood and the log-likelihood for the estimation
of the standard deviation.
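
For reference, and assuming that the mean is held fixed at its known
value $\mu$ (the exercise does not state whether $\mu$ is fixed or
estimated from the data), the two quantities can be written for data
$x_1, \ldots, x_n$ as
\[ \mathcal{L}(\sigma) = \prod_{i=1}^n p(x_i|\mu,\sigma) =
    \prod_{i=1}^n \frac{1}{\sqrt{2\pi\sigma^2}}
    e^{-\frac{(x_i-\mu)^2}{2\sigma^2}} \]
\[ \log\mathcal{L}(\sigma) = \sum_{i=1}^n \log p(x_i|\mu,\sigma) =
    -n\log\left(\sqrt{2\pi}\,\sigma\right) -
    \frac{1}{2\sigma^2}\sum_{i=1}^n (x_i-\mu)^2 \]
Because the logarithm is a monotonic function, both expressions have
their maximum at the same value of $\sigma$.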
\begin{parts}
\part Draw $n=50$ random numbers from a normal distribution with
mean $\mu=3$ and standard deviation $\sigma=2$.

\part Plot the likelihood (computed as the product of probabilities)
and the log-likelihood (sum of the logarithms of the probabilities)
as a function of the standard deviation. Compare the position of the
maxima with the standard deviation that you compute directly from
the data.

\part Increase $n$ to 1000. What happens to the likelihood, and what
happens to the log-likelihood? Why?
\end{parts}
\begin{solution}
\lstinputlisting{mlestd.m}
\includegraphics[width=1\textwidth]{mlestd}\\

The more data, the smaller the product of the probabilities ($\approx
p^n$ with $0 \le p < 1$) and the more negative the sum of the logarithms
of the probabilities ($\approx n\log p$, note that $\log p < 0$).

The product eventually gets smaller than the smallest number that
floating-point numbers can represent. Therefore, for $n=1000$ the
products underflow to zero. Using the logarithm avoids this numerical
problem.
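
A rough worked example, assuming IEEE double precision (the
\code{matlab} default) and values of $\sigma$ close to the true
$\sigma=2$: each value of the normal density is then at most
$1/(\sqrt{2\pi}\,\sigma) \approx 0.2$, so that for $n=1000$
\[ \prod_{i=1}^{n} p(x_i) \le 0.2^{1000} = 10^{1000 \log_{10} 0.2}
    \approx 10^{-699} \]
This is far below the smallest positive double-precision number of
about $4.9 \cdot 10^{-324}$, and the product is stored as zero. The
log-likelihood, in contrast, is at most about $1000 \ln 0.2 \approx
-1609$, a number that is represented without any difficulty.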
\end{solution}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\question \qt{Maximum-likelihood estimator of a line through the origin}
In the lecture we derived the following equation for the
maximum-likelihood estimate of the slope $m$ of a straight line
through the origin fitted to $n$ pairs of data values $(x_i|y_i)$ with
standard deviations $\sigma_i$:
\[ m = \frac{\sum_{i=1}^n \frac{x_i y_i}{\sigma_i^2}}{ \sum_{i=1}^n
    \frac{x_i^2}{\sigma_i^2}} \]
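One way to arrive at this equation is sketched here, assuming
independent, normally distributed deviations of the $y_i$ from the
line $y = mx$ (the derivation in the lecture may differ in notation).
The log-likelihood of the slope is
\[ \log\mathcal{L}(m) = -\sum_{i=1}^n \frac{(y_i - m
    x_i)^2}{2\sigma_i^2} + \mathrm{const} \]
and setting its derivative with respect to $m$ to zero gives
\[ \frac{\partial \log\mathcal{L}}{\partial m} = \sum_{i=1}^n
    \frac{x_i (y_i - m x_i)}{\sigma_i^2} = 0 \quad \Rightarrow \quad
    m \sum_{i=1}^n \frac{x_i^2}{\sigma_i^2} = \sum_{i=1}^n \frac{x_i
    y_i}{\sigma_i^2} \]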
\begin{parts}
\part \label{mleslopefunc} Write a function that takes two vectors
$x$ and $y$ containing the data pairs and returns the slope,
computed according to this equation. For simplicity we assume
$\sigma_i=\sigma$ for all $1 \le i \le n$. How does this simplify
the equation for the slope?
\begin{solution}
\lstinputlisting{mleslope.m}
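
A short worked step for the question above: with
$\sigma_i=\sigma$ for all $i$, the common factor $1/\sigma^2$ can be
pulled out of both sums and cancels,
\[ m = \frac{\frac{1}{\sigma^2}\sum_{i=1}^n x_i
    y_i}{\frac{1}{\sigma^2}\sum_{i=1}^n x_i^2} = \frac{\sum_{i=1}^n
    x_i y_i}{\sum_{i=1}^n x_i^2} \]
Under this assumption the estimated slope no longer depends on
$\sigma$.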
\end{solution}

\part Write a script that generates data pairs that scatter with a normal
distribution around a line through the origin with a given
slope. Use the function from \pref{mleslopefunc} to compute the
slope from the generated data. Compare the computed slope with the
true slope that has been used to generate the data. Plot the data
together with the line from which the data were generated as well as
the maximum-likelihood fit.
\begin{solution}
\lstinputlisting{mlepropfit.m}
\includegraphics[width=1\textwidth]{mlepropfit}
\end{solution}

\part \label{mleslopecomp} Vary the number of data pairs, the slope,
as well as the variance of the data points around the true
line. Under which conditions is the maximum-likelihood estimate of
the slope closer to the true slope?

\part To answer \pref{mleslopecomp} more precisely, generate for
each condition, say, 1000 data sets and plot a histogram of the
estimated slopes. How do the histogram, its mean, and its standard
deviation relate to the true slope?
\end{parts}
\begin{solution}
\lstinputlisting{mlepropest.m}
\includegraphics[width=1\textwidth]{mlepropest}\\
The estimated slopes are centered around the true slope. The
standard deviation of the estimated slopes gets smaller for larger
$n$ and less noise in the data.
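
This observation can be made quantitative under simplifying
assumptions that are not part of the exercise text: fixed $x_i$ and
data $y_i = m x_i + \varepsilon_i$ with independent noise
$\varepsilon_i$ of zero mean and standard deviation $\sigma$. For
$\sigma_i=\sigma$ the estimator reduces to $\hat m = \sum x_i y_i /
\sum x_i^2$, and inserting the $y_i$ yields
\[ \hat m = m + \frac{\sum_{i=1}^n x_i
    \varepsilon_i}{\sum_{i=1}^n x_i^2} \]
The mean of $\hat m$ is therefore the true slope $m$, and its standard
deviation is
\[ \mathrm{std}(\hat m) = \frac{\sigma}{\sqrt{\sum_{i=1}^n x_i^2}} \]
It grows in proportion to the noise $\sigma$ and, because the sum in
the denominator grows roughly in proportion to $n$, shrinks
approximately like $1/\sqrt{n}$.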
\end{solution}

\continue
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\question \qt{Maximum-likelihood estimation of a probability-density function}
Many probability-density functions have parameters that, unlike the
mean of normally distributed data, cannot be computed directly from
the data. Such parameters need to be estimated numerically from the
data by means of maximum likelihood.

Let us demonstrate this approach by means of data that are drawn from a
gamma distribution.
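
For orientation, in the common shape--scale parameterization (assumed
here; other conventions use a rate instead of a scale parameter) the
gamma density with shape $a>0$ and scale $b>0$ reads
\[ p(x|a,b) = \frac{x^{a-1} e^{-x/b}}{b^a \, \Gamma(a)} \; , \quad x
    > 0 \]
Setting the derivatives of the log-likelihood with respect to $b$ and
$a$ to zero gives $b = \bar x / a$ and
\[ \log a - \psi(a) = \log \bar x - \frac{1}{n}\sum_{i=1}^n \log x_i \]
where $\bar x$ is the mean of the data and $\psi$ is the digamma
function. The second equation has no closed-form solution for $a$,
which is why the shape parameter has to be found numerically.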
\begin{parts}
\part Find out which \code{matlab} function computes the
probability-density function of the gamma distribution.

\part \label{gammaplot} Use this function to plot the
probability-density function of the gamma distribution for various
values of the (positive) ``shape'' parameter. Set the ``scale''
parameter to one.

\part Find out which \code{matlab} function generates random numbers
that are distributed according to a gamma distribution. Use this
function to generate 50 random numbers with one of the values of the
``shape'' parameter used in \pref{gammaplot}.

\part Compute and plot a properly normalized histogram of these
random numbers.

\part Find out which \code{matlab} function fits a distribution to a
vector of random numbers according to the maximum-likelihood method.
How do you need to use this function in order to fit a gamma
distribution to the data?

\part Use this function to estimate the parameters of the gamma
distribution used to generate the data.

\part Finally, plot the fitted gamma distribution on top of the
normalized histogram of the data.
\end{parts}
\begin{solution}
\lstinputlisting{mlepdffit.m}
\includegraphics[width=1\textwidth]{mlepdffit}
\end{solution}

\end{questions}

\end{document}