[likelihood] finished exercises

Jan Benda 2018-12-17 22:57:39 +01:00
parent 18ca54e94d
commit deed303596
3 changed files with 133 additions and 92 deletions


@ -15,7 +15,7 @@
\else \else
\newcommand{\stitle}{} \newcommand{\stitle}{}
\fi \fi
\header{{\bfseries\large Exercise 12\stitle}}{{\bfseries\large Maximum Likelihood}}{{\bfseries\large January 7th, 2019}} \header{{\bfseries\large Exercise 12\stitle}}{{\bfseries\large Maximum likelihood}}{{\bfseries\large January 7th, 2019}}
\firstpagefooter{Prof. Dr. Jan Benda}{Phone: 29 74573}{Email: \firstpagefooter{Prof. Dr. Jan Benda}{Phone: 29 74573}{Email:
jan.benda@uni-tuebingen.de} jan.benda@uni-tuebingen.de}
\runningfooter{}{\thepage}{} \runningfooter{}{\thepage}{}
@ -93,14 +93,14 @@ jan.benda@uni-tuebingen.de}
Let's compute the likelihood and the log-likelihood for the estimation Let's compute the likelihood and the log-likelihood for the estimation
of the standard deviation. of the standard deviation.
\begin{parts} \begin{parts}
\part Draw $n=50$ normaly distributed random numbers with mean \part Draw $n=50$ random numbers from a normal distribution with
$\mu=3$ and standard deviation $\sigma=2$. mean $\mu=3$ and standard deviation $\sigma=2$.
\part Plot the likelihood (computed as the product of probabilities) \part Plot the likelihood (computed as the product of probabilities)
and the log-likelihood (sum of the logarithms of the probabilities) and the log-likelihood (sum of the logarithms of the probabilities)
using the standard deviation as the parameter we want to estimate as a function of the standard deviation. Compare the position of the
from the data. Compare the position of the maxima with the standard maxima with the standard deviation that you compute directly from
deviation that you can compute from the data. the data.
\part Increase $n$ to 1000. What happens to the likelihood, what \part Increase $n$ to 1000. What happens to the likelihood, what
happens to the log-likelihood? Why? happens to the log-likelihood? Why?
@ -111,75 +111,86 @@ of the standard deviation.
\end{solution} \end{solution}
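A short worked estimate, not part of the original sheet, of why the plain likelihood stops being usable at $n=1000$ while the log-likelihood does not. It assumes IEEE double-precision numbers (the default in \code{matlab}) and the range $1 \le \sigma \le 3$ that the solution script scans.

```latex
\[ p(x_i|\sigma) = \frac{1}{\sqrt{2\pi}\,\sigma}
   e^{-\frac{(x_i-\mu)^2}{2\sigma^2}} \le \frac{1}{\sqrt{2\pi}\,\sigma}
   < 0.4 \quad \text{for } \sigma \ge 1 \]
\[ \mathcal{L}(\sigma) = \prod_{i=1}^n p(x_i|\sigma) < 0.4^n
   \approx 10^{-398} \quad \text{for } n = 1000 \]
```

The smallest positive double is about $4.9 \cdot 10^{-324}$, so under these assumptions the product underflows to zero for every $\sigma$ in the scanned range. The sum of the logarithms remains an ordinary finite number and still has a well-defined maximum.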
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\question \qt{Maximum-Likelihood-Sch\"atzer einer Ursprungsgeraden} \question \qt{Maximum-likelihood estimator of a line through the origin}
In der Vorlesung haben wir folgende Formel f\"ur die Maximum-Likelihood In the lecture we derived the following equation for a
Absch\"atzung der Steigung $\theta$ einer Ursprungsgeraden durch $n$ Datenpunkte $(x_i|y_i)$ mit Standardabweichung $\sigma_i$ hergeleitet: maximum-likelihood estimate of the slope $\theta$ of a straight line
\[\theta = \frac{\sum_{i=1}^n \frac{x_iy_i}{\sigma_i^2}}{ \sum_{i=1}^n through the origin fitted to $n$ pairs of data values $(x_i|y_i)$ with
standard deviation $\sigma_i$:
\[\theta = \frac{\sum_{i=1}^n \frac{x_i y_i}{\sigma_i^2}}{ \sum_{i=1}^n
\frac{x_i^2}{\sigma_i^2}} \] \frac{x_i^2}{\sigma_i^2}} \]
\begin{parts} \begin{parts}
\part \label{mleslopefunc} Schreibe eine Funktion, die in einem $x$ und einem \part \label{mleslopefunc} Write a function that takes two vectors
$y$ Vektor die Datenpaare \"uberreicht bekommt und die Steigung der $x$ and $y$ containing the data pairs and returns the slope,
Ursprungsgeraden, die die Likelihood maximiert, zur\"uckgibt computed according to this equation. For simplicity we assume
($\sigma=\text{const}$). $\sigma_i=\sigma_j=\sigma$ for all $1 \le i \le n$ and $1 \le j \le
n$. How does this simplify the equation for the slope?
\part \begin{solution}
Schreibe ein Skript, das Datenpaare erzeugt, die um eine \lstinputlisting{mleslope.m}
Ursprungsgerade mit vorgegebener Steigung streuen. Berechne mit der \end{solution}
Funktion aus \pref{mleslopefunc} die Steigung aus den Daten,
vergleiche mit der wahren Steigung, und plotte die urspr\"ungliche \part Write a script that generates data pairs that scatter around a
sowie die gefittete Gerade zusammen mit den Daten. line through the origin with a given slope. Use the function from
\pref{mleslopefunc} to compute the slope from the generated data.
\part Compare the computed slope with the true slope that has been used to
Ver\"andere die Anzahl der Datenpunkte, die Steigung, sowie die generate the data. Plot the data together with the line from which
Streuung der Daten um die Gerade. the data were generated and the maximum-likelihood fit.
\begin{solution}
\lstinputlisting{mlepropfit.m}
\includegraphics[width=1\textwidth]{mlepropfit}
\end{solution}
\part \label{mleslopecomp} Vary the number of data pairs, the slope,
as well as the variance of the data points around the true
line. Under which conditions is the maximum-likelihood estimation of
the slope closer to the true slope?
\part To answer \pref{mleslopecomp} more precisely, generate, say,
1000 data sets for each condition and plot a histogram of the
estimated slopes. How do the histogram, its mean, and its standard
deviation relate to the true slope?
\end{parts} \end{parts}
\begin{solution} \begin{solution}
\lstinputlisting{mleslope.m} \lstinputlisting{mlepropest.m}
\lstinputlisting{mlepropfit.m} \includegraphics[width=1\textwidth]{mlepropest}
\includegraphics[width=1\textwidth]{mlepropfit}
\end{solution} \end{solution}
\continue
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\question \qt{Maximum-Likelihood-Sch\"atzer einer Wahrscheinlichkeitsdichtefunktion} \question \qt{Maximum-likelihood estimation of a probability-density function}
Verschiedene Wahrscheinlichkeitsdichtefunktionen haben Parameter, die Many probability-density functions have parameters that cannot be
nicht so einfach wie der Mittelwert und die Standardabweichung einer computed directly from the data as easily as the mean and the standard
Normalverteilung direkt aus den Daten berechnet werden k\"onnen. Solche Parameter deviation of normally distributed data. Such parameters need to be
m\"ussen dann aus den Daten mit der Maximum-Likelihood-Methode gefittet werden. estimated from the data by means of the maximum-likelihood method.
Um dies zu veranschaulichen ziehen wir uns diesmal nicht normalverteilte Zufallszahlen, sondern Zufallszahlen aus der Gamma-Verteilung. Let us demonstrate this approach by means of data that are drawn from a
gamma distribution.
\begin{parts} \begin{parts}
\part \part Find out which \code{matlab} function computes the
Finde heraus welche \code{matlab} Funktion die probability-density function of the gamma distribution.
Wahrscheinlichkeitsdichtefunktion (probability density function) der
Gamma-Verteilung berechnet. \part \label{gammaplot} Use this function to plot the
probability-density function of the gamma distribution for various
\part values of the (positive) ``shape'' parameter. We set the ``scale''
Plotte mit Hilfe dieser Funktion die Wahrscheinlichkeitsdichtefunktion parameter to one.
der Gamma-Verteilung f\"ur verschiedene Werte des (positiven) ``shape'' Parameters.
Den ``scale'' Parameter setzen wir auf Eins. \part Find out which \code{matlab} function generates random numbers
that are distributed according to a gamma distribution. Generate
\part with this function 50 random numbers using one of the values of the
Finde heraus mit welcher Funktion Gammaverteilte Zufallszahlen in ``shape'' parameter used in \pref{gammaplot}.
\code{matlab} gezogen werden k\"onnen. Erzeuge mit dieser Funktion
50 Zufallszahlen mit einem der oben geplotteten ``shape'' Parameter. \part Compute and plot a properly normalized histogram of these
random numbers.
\part
Berechne und plotte ein normiertes Histogramm dieser Zufallszahlen. \part Find out which \code{matlab} function fits a distribution to a
vector of random numbers according to the maximum-likelihood method.
\part How do you need to use this function in order to fit a gamma
Finde heraus mit welcher \code{matlab}-Funktion eine beliebige distribution to the data?
Verteilung (``distribution'') an die Zufallszahlen nach der
Maximum-Likelihood Methode gefittet werden kann. Wie wird diese \part Use this function to estimate the parameters of the gamma
Funktion benutzt, um die Gammaverteilung an die Daten zu fitten? distribution used to generate the data.
\part \part Finally, plot the fitted gamma distribution on top of the
Bestimme mit dieser Funktion die Parameter der Gammaverteilung aus normalized histogram of the data.
den Zufallszahlen.
\part
Plotte anschlie{\ss}end die Gammaverteilung mit den gefitteten
Parametern.
\end{parts} \end{parts}
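The solution script \code{mlepdffit.m} is not shown in this diff. The steps of this exercise could be sketched as follows, assuming that the functions \code{gamrnd}, \code{gampdf}, and \code{mle} from the Statistics and Machine Learning Toolbox are available. The shape value and the number of bins are example choices, not taken from the original solution.

```matlab
% Sketch only (assumption, not the original mlepdffit.m).
shape = 2.0;   % example value of the "shape" parameter
scale = 1.0;   % "scale" parameter set to one, as in the exercise
n = 50;        % number of random numbers

% draw gamma-distributed random numbers:
x = gamrnd(shape, scale, n, 1);

% histogram normalized such that its integral equals one:
[counts, centers] = hist(x, 15);
binwidth = centers(2) - centers(1);
bar(centers, counts/sum(counts)/binwidth);
hold on;

% maximum-likelihood fit of a gamma distribution,
% phat(1) is the estimated shape, phat(2) the estimated scale:
phat = mle(x, 'distribution', 'gamma');

% original and fitted probability-density functions:
xx = linspace(0.01, max(x), 200);
plot(xx, gampdf(xx, shape, scale), 'g');
plot(xx, gampdf(xx, phat(1), phat(2)), 'r');
hold off;
xlabel('x');
ylabel('pdf');
legend('data', 'original', 'fit');
```

With only 50 random numbers the fitted parameters can deviate noticeably from the values used to generate the data. Increasing \code{n} should bring the fitted curve closer to the original one.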
\begin{solution} \begin{solution}
\lstinputlisting{mlepdffit.m} \lstinputlisting{mlepdffit.m}


@ -0,0 +1,25 @@
m = 2.0; % slope
sigmas = [0.1, 1.0]; % standard deviations
ns = [100, 1000]; % number of data pairs
trials = 1000; % number of data sets
for i = 1:length(sigmas)
sigma = sigmas(i);
for j = 1:length(ns)
n = ns(j);
slopes = zeros(trials, 1);
for k=1:trials
% data pairs:
x = 5.0*rand(n, 1);
y = m*x + sigma*randn(n, 1);
% fit:
slopes(k) = mleslope(x, y);
end
subplot(2, 2, 2*(i-1)+j);
bins = [1.9:0.005:2.1];
hist(slopes, bins);
title(sprintf('sigma=%g, n=%d', sigma, n));
end
end
savefigpdf(gcf, 'mlepropest.pdf', 12, 7);
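As a rough cross-check of the histograms this script produces, the spread of the estimated slopes can be worked out by hand. This derivation is an addition, not part of the commit. It assumes independent Gaussian noise with constant standard deviation $\sigma$ and $x$ values drawn uniformly from $[0, 5]$, as in the script above.

```latex
\[ \hat\theta = \frac{\sum_{i=1}^n x_i y_i}{\sum_{i=1}^n x_i^2}
   = m + \frac{\sum_{i=1}^n x_i \varepsilon_i}{\sum_{i=1}^n x_i^2}
   \quad \text{with } y_i = m x_i + \varepsilon_i \]
\[ \mathrm{E}[\hat\theta] = m , \qquad
   \mathrm{Var}[\hat\theta] = \frac{\sigma^2}{\sum_{i=1}^n x_i^2}
   \approx \frac{3\sigma^2}{25\,n} \]
```

The approximation uses $\mathrm{E}[x^2] = 25/3$ for $x$ uniform on $[0, 5]$. The standard deviation of the estimated slopes therefore grows in proportion to $\sigma$ and shrinks as $1/\sqrt{n}$. For $\sigma = 0.1$ and $n = 100$ this gives about $0.0035$.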


@ -1,30 +1,35 @@
% draw random numbers:
n = 50;
mu = 3.0; mu = 3.0;
sigma =2.0; sigma =2.0;
x = randn(n,1)*sigma+mu; ns = [50, 1000];
fprintf(' mean of the data is %.2f\n', mean(x)) for k = 1:length(ns)
fprintf('standard deviation of the data is %.2f\n', std(x)) n = ns(k);
% draw random numbers:
x = randn(n,1)*sigma+mu;
fprintf(' mean of the data is %.2f\n', mean(x))
fprintf('standard deviation of the data is %.2f\n', std(x))
% standard deviation as parameter: % standard deviation as parameter:
psigs = 1.0:0.01:3.0; psigs = 1.0:0.01:3.0;
% matrix with the probabilities for each x and psigs: % matrix with the probabilities for each x and psigs:
lms = zeros(length(x), length(psigs)); lms = zeros(length(x), length(psigs));
for i=1:length(psigs) for i=1:length(psigs)
psig = psigs(i); psig = psigs(i);
p = exp(-0.5*((x-mu)/psig).^2.0)/sqrt(2.0*pi)/psig; p = exp(-0.5*((x-mu)/psig).^2.0)/sqrt(2.0*pi)/psig;
lms(:,i) = p; lms(:,i) = p;
end end
lm = prod(lms, 1); % likelihood lm = prod(lms, 1); % likelihood
loglm = sum(log(lms), 1); % log likelihood loglm = sum(log(lms), 1); % log likelihood
% plot likelihood of standard deviation: % plot likelihood of standard deviation:
subplot(1, 2, 1); subplot(2, 2, 2*k-1);
plot(psigs, lm ); plot(psigs, lm );
xlabel('standard deviation') title(sprintf('likelihood n=%d', n));
ylabel('likelihood') xlabel('standard deviation')
subplot(1, 2, 2); ylabel('likelihood')
plot(psigs, loglm); subplot(2, 2, 2*k);
xlabel('standard deviation') plot(psigs, loglm);
ylabel('log likelihood') title(sprintf('log-likelihood n=%d', n));
xlabel('standard deviation')
ylabel('log likelihood')
end
savefigpdf(gcf, 'mlestd.pdf', 15, 5); savefigpdf(gcf, 'mlestd.pdf', 15, 5);