\documentclass[12pt,a4paper,pdftex]{exam}

\newcommand{\exercisetopic}{Maximum Likelihood}
\newcommand{\exercisenum}{10}
\newcommand{\exercisedate}{January 12th, 2021}

\input{../../exercisesheader}

\firstpagefooter{Prof. Dr. Jan Benda}{}{jan.benda@uni-tuebingen.de}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\begin{document}

\input{../../exercisestitle}

\begin{questions}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\question \qt{Read chapter 9 on ``Maximum likelihood estimation''!}\vspace{-3ex}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\question \qt{Maximum likelihood of the standard deviation}
Let's compute the likelihood and the log-likelihood for the estimation
of the standard deviation.
\begin{parts}
  \part Draw $n=50$ random numbers from a normal distribution with
  mean $\mu=3$ and standard deviation $\sigma=2$.

  \part Plot the likelihood (computed as the product of probabilities)
  and the log-likelihood (sum of the logarithms of the probabilities)
  as a function of the standard deviation. Compare the position of the
  maxima with the standard deviation that you compute directly from
  the data.

  \part Increase $n$ to 1000. What happens to the likelihood, what
  happens to the log-likelihood? Why?
\end{parts}
\begin{solution}
  \lstinputlisting{mlestd.m}
  \includegraphics[width=1\textwidth]{mlestd}\\

  The more data, the smaller the product of the probabilities ($\approx
  p^n$ with $0 \le p < 1$) and the more negative the sum of the logarithms
  of the probabilities ($\approx n\log p$, note that $\log p < 0$).
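  Written out for the normal distribution, as a sketch that assumes
  the mean $\mu$ is held fixed (for example at the mean of the data)
  and that introduces the notation $\mathcal{L}$ for the likelihood and
  $p(x_i|\mu,\sigma)$ for the normal density, the log-likelihood as a
  function of the standard deviation reads
  \[ \log \mathcal{L}(\sigma) = \sum_{i=1}^n \log p(x_i|\mu,\sigma) =
     -n \log\left(\sqrt{2\pi}\,\sigma\right) - \frac{1}{2\sigma^2}
     \sum_{i=1}^n (x_i - \mu)^2 \]
  The magnitude of both terms grows with $n$. Setting the derivative
  with respect to $\sigma$ to zero,
  \[ -\frac{n}{\sigma} + \frac{1}{\sigma^3} \sum_{i=1}^n (x_i - \mu)^2 = 0
     \quad \Rightarrow \quad
     \sigma^2 = \frac{1}{n} \sum_{i=1}^n (x_i - \mu)^2 \; , \]
  shows that the maximum is located at the standard deviation of the
  data computed with the $1/n$ normalization. Because the logarithm is
  monotonic, the likelihood has its maximum at the same position. A
  standard deviation computed with the $1/(n-1)$ normalization is
  slightly larger.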

  The product eventually becomes smaller than the smallest number that
  floating-point numbers can represent. For $n=1000$ the product
  therefore underflows to zero. Using the logarithm avoids this
  numerical problem.
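  As a rough check of the orders of magnitude, assuming IEEE
  double-precision numbers and the natural logarithm: the normal
  density with $\sigma=2$ never exceeds $1/(2\sqrt{2\pi}) \approx 0.2$,
  so at $\sigma=2$ the likelihood is bounded by
  \[ 0.2^{50} \approx 10^{-35} \quad \mbox{for } n=50
     \quad \mbox{and} \quad
     0.2^{1000} \approx 10^{-699} \quad \mbox{for } n=1000 \; . \]
  The first value is still representable, whereas the second one is far
  below the smallest positive double-precision number of about
  $5 \cdot 10^{-324}$. The corresponding bounds on the log-likelihood,
  $50 \log 0.2 \approx -80$ and $1000 \log 0.2 \approx -1609$, are
  ordinary numbers that pose no problem.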
\end{solution}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\question \qt{Maximum-likelihood estimator of a line through the origin}
In the lecture we derived the following equation for a
maximum-likelihood estimate of the slope $\theta$ of a straight line
through the origin fitted to $n$ pairs of data values $(x_i, y_i)$ with
standard deviations $\sigma_i$:
\[\theta = \frac{\sum_{i=1}^n \frac{x_i y_i}{\sigma_i^2}}{ \sum_{i=1}^n
  \frac{x_i^2}{\sigma_i^2}} \]
\begin{parts}
  \part \label{mleslopefunc} Write a function that takes two vectors
  $x$ and $y$ containing the data pairs and returns the slope,
  computed according to this equation. For simplicity we assume
  $\sigma_i=\sigma$ for all $1 \le i \le n$. How does this simplify
  the equation for the slope?
  \begin{solution}
    \lstinputlisting{mleslope.m}
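    Worked out as a short sketch: with $\sigma_i=\sigma$ the common
    factor $1/\sigma^2$ cancels from numerator and denominator, so the
    slope no longer depends on the standard deviation:
    \[ \theta = \frac{\frac{1}{\sigma^2}\sum_{i=1}^n x_i y_i}
       {\frac{1}{\sigma^2}\sum_{i=1}^n x_i^2}
       = \frac{\sum_{i=1}^n x_i y_i}{\sum_{i=1}^n x_i^2} \]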
  \end{solution}

  \part Write a script that generates data pairs that scatter with a normal
  distribution around a line through the origin with a given
  slope. Use the function from \pref{mleslopefunc} to compute the
  slope from the generated data. Compare the computed slope with the
  true slope that was used to generate the data. Plot the data
  together with the line from which the data were generated as well as
  the maximum-likelihood fit.
  \begin{solution}
    \lstinputlisting{mlepropfit.m}
    \includegraphics[width=1\textwidth]{mlepropfit}
  \end{solution}

  \part \label{mleslopecomp} Vary the number of data pairs, the slope,
  as well as the variance of the data points around the true
  line. Under which conditions is the maximum-likelihood estimate of
  the slope closer to the true slope?

  \part To answer \pref{mleslopecomp} more precisely, generate, say,
  1000 data sets for each condition and plot a histogram of the
  estimated slopes. How do the histogram, its mean, and its standard
  deviation relate to the true slope?
\end{parts}
\begin{solution}
  \lstinputlisting{mlepropest.m}
  \includegraphics[width=1\textwidth]{mlepropest}\\
  The estimated slopes are centered around the true slope. The
  standard deviation of the estimated slopes gets smaller for larger
  $n$ and less noise in the data.
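  This can be made plausible with a short calculation, sketched here
  under the assumptions that the $x_i$ are fixed, that
  $\sigma_i=\sigma$, and that the data are
  $y_i = \theta_0 x_i + \varepsilon_i$ with independent, normally
  distributed noise $\varepsilon_i$ of zero mean and standard deviation
  $\sigma$. The symbols $\theta_0$ for the true slope, $\hat\theta$ for
  the estimate, and $\varepsilon_i$ are introduced only for this sketch:
  \[ \hat\theta = \frac{\sum_{i=1}^n x_i y_i}{\sum_{i=1}^n x_i^2}
     = \theta_0 + \frac{\sum_{i=1}^n x_i \varepsilon_i}{\sum_{i=1}^n x_i^2} \]
  Averaging over the noise then gives
  \[ \langle \hat\theta \rangle = \theta_0 \; , \quad
     \mbox{std}(\hat\theta) = \frac{\sigma}{\sqrt{\sum_{i=1}^n x_i^2}} \; . \]
  Under these assumptions the spread of the estimates shrinks with more
  data pairs and with smaller noise, and it does not depend on the true
  slope.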
\end{solution}

\continue
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\question \qt{Maximum-likelihood estimation of a probability-density function}
Many probability-density functions have parameters that, unlike the
mean of normally distributed data, cannot be computed directly from
the data. Such parameters need to be estimated numerically from the
data by means of maximum likelihood.

Let us demonstrate this approach by means of data that are drawn from a
gamma distribution.
\begin{parts}
  \part Find out which \code{matlab} function computes the
  probability-density function of the gamma distribution.

  \part \label{gammaplot} Use this function to plot the
  probability-density function of the gamma distribution for various
  values of the (positive) ``shape'' parameter. Set the ``scale''
  parameter to one.

  \part Find out which \code{matlab} function generates random numbers
  that are distributed according to a gamma distribution. Use this
  function to generate 50 random numbers with one of the values of the
  ``shape'' parameter used in \pref{gammaplot}.

  \part Compute and plot a properly normalized histogram of these
  random numbers.

  \part Find out which \code{matlab} function fits a distribution to a
  vector of random numbers according to the maximum-likelihood method.
  How do you need to use this function in order to fit a gamma
  distribution to the data?

  \part Use this function to estimate the parameters of the gamma
  distribution used to generate the data.

  \part Finally, plot the fitted gamma distribution on top of the
  normalized histogram of the data.
\end{parts}
\begin{solution}
  \lstinputlisting{mlepdffit.m}
  \includegraphics[width=1\textwidth]{mlepdffit}
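
  Why a numerical fit is needed can be sketched for the simplified
  case of a scale parameter fixed to one. This is an assumption made
  here for brevity; a fitting function may estimate shape and scale
  jointly. With the shape parameter denoted by $\alpha$ and the
  likelihood by $\mathcal{L}$ (symbols chosen for this sketch), the
  gamma density is
  \[ p(x|\alpha) = \frac{x^{\alpha-1} e^{-x}}{\Gamma(\alpha)} \; ,
     \quad x > 0 \; , \]
  and the log-likelihood of the data $x_1, \ldots, x_n$ is
  \[ \log \mathcal{L}(\alpha) = (\alpha-1)\sum_{i=1}^n \log x_i
     - \sum_{i=1}^n x_i - n \log\Gamma(\alpha) \]
  Setting the derivative with respect to $\alpha$ to zero yields
  \[ \psi(\alpha) = \frac{1}{n}\sum_{i=1}^n \log x_i \; , \]
  where $\psi(\alpha) = \frac{d}{d\alpha}\log\Gamma(\alpha)$ is the
  digamma function. This equation has no closed-form solution for
  $\alpha$ and has to be solved numerically.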
\end{solution}

\end{questions}

\end{document}