[likelihood] finished exercises
parent
18ca54e94d
commit
deed303596
@ -15,7 +15,7 @@
|
|||||||
\else
|
\else
|
||||||
\newcommand{\stitle}{}
|
\newcommand{\stitle}{}
|
||||||
\fi
|
\fi
|
||||||
\header{{\bfseries\large Exercise 12\stitle}}{{\bfseries\large Maximum Likelihood}}{{\bfseries\large January 7th, 2019}}
|
\header{{\bfseries\large Exercise 12\stitle}}{{\bfseries\large Maximum likelihood}}{{\bfseries\large January 7th, 2019}}
|
||||||
\firstpagefooter{Prof. Dr. Jan Benda}{Phone: 29 74573}{Email:
|
\firstpagefooter{Prof. Dr. Jan Benda}{Phone: 29 74573}{Email:
|
||||||
jan.benda@uni-tuebingen.de}
|
jan.benda@uni-tuebingen.de}
|
||||||
\runningfooter{}{\thepage}{}
|
\runningfooter{}{\thepage}{}
|
||||||
@ -93,14 +93,14 @@ jan.benda@uni-tuebingen.de}
|
|||||||
Let's compute the likelihood and the log-likelihood for the estimation
|
Let's compute the likelihood and the log-likelihood for the estimation
|
||||||
of the standard deviation.
|
of the standard deviation.
|
||||||
\begin{parts}
|
\begin{parts}
|
||||||
\part Draw $n=50$ normaly distributed random numbers with mean
|
\part Draw $n=50$ random numbers from a normal distribution with
|
||||||
$\mu=3$ and standard deviation $\sigma=2$.
|
mean $\mu=3$ and standard deviation $\sigma=2$.
|
||||||
|
|
||||||
\part Plot the likelihood (computed as the product of probabilities)
|
\part Plot the likelihood (computed as the product of probabilities)
|
||||||
and the log-likelihood (sum of the logarithms of the probabilities)
|
and the log-likelihood (sum of the logarithms of the probabilities)
|
||||||
using the standard deviation as the parameter we want to estimate
|
as a function of the standard deviation. Compare the position of the
|
||||||
from the data. Compare the position of the maxima with the standard
|
maxima with the standard deviation that you compute directly from
|
||||||
deviation that you can compute from the data.
|
the data.
|
||||||
|
|
||||||
\part Increase $n$ to 1000. What happens to the likelihood, what
|
\part Increase $n$ to 1000. What happens to the likelihood, what
|
||||||
happens to the log-likelihood? Why?
|
happens to the log-likelihood? Why?
|
||||||
@ -111,75 +111,86 @@ of the standard deviation.
|
|||||||
\end{solution}
|
\end{solution}
|
||||||
|
|
||||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||||
\question \qt{Maximum-Likelihood-Sch\"atzer einer Ursprungsgeraden}
|
\question \qt{Maximum-likelihood estimator of a line through the origin}
|
||||||
In der Vorlesung haben wir folgende Formel f\"ur die Maximum-Likelihood
|
In the lecture we derived the following equation for the
|
||||||
Absch\"atzung der Steigung $\theta$ einer Ursprungsgeraden durch $n$ Datenpunkte $(x_i|y_i)$ mit Standardabweichung $\sigma_i$ hergeleitet:
|
maximum-likelihood estimate of the slope $\theta$ of a straight line
|
||||||
\[\theta = \frac{\sum_{i=1}^n \frac{x_iy_i}{\sigma_i^2}}{ \sum_{i=1}^n
|
through the origin fitted to $n$ pairs of data values $(x_i|y_i)$ with
|
||||||
|
standard deviation $\sigma_i$:
|
||||||
|
\[\theta = \frac{\sum_{i=1}^n \frac{x_i y_i}{\sigma_i^2}}{ \sum_{i=1}^n
|
||||||
\frac{x_i^2}{\sigma_i^2}} \]
|
\frac{x_i^2}{\sigma_i^2}} \]
|
||||||
\begin{parts}
|
\begin{parts}
|
||||||
\part \label{mleslopefunc} Schreibe eine Funktion, die in einem $x$ und einem
|
\part \label{mleslopefunc} Write a function that takes two vectors
|
||||||
$y$ Vektor die Datenpaare \"uberreicht bekommt und die Steigung der
|
$x$ and $y$ containing the data pairs and returns the slope,
|
||||||
Ursprungsgeraden, die die Likelihood maximiert, zur\"uckgibt
|
computed according to this equation. For simplicity we assume
|
||||||
($\sigma=\text{const}$).
|
$\sigma_i=\sigma_j=\sigma$ for all $1 \le i \le n$ and $1 \le j \le
|
||||||
|
n$. How does this simplify the equation for the slope?
|
||||||
\part
|
\begin{solution}
|
||||||
Schreibe ein Skript, das Datenpaare erzeugt, die um eine
|
\lstinputlisting{mleslope.m}
|
||||||
Ursprungsgerade mit vorgegebener Steigung streuen. Berechne mit der
|
\end{solution}
|
||||||
Funktion aus \pref{mleslopefunc} die Steigung aus den Daten,
|
|
||||||
vergleiche mit der wahren Steigung, und plotte die urspr\"ungliche
|
\part Write a script that generates data pairs that scatter around a
|
||||||
sowie die gefittete Gerade zusammen mit den Daten.
|
line through the origin with a given slope. Use the function from
|
||||||
|
\pref{mleslopefunc} to compute the slope from the generated data.
|
||||||
\part
|
Compare the computed slope with the true slope that has been used to
|
||||||
Ver\"andere die Anzahl der Datenpunkte, die Steigung, sowie die
|
generate the data. Plot the data together with the line from which
|
||||||
Streuung der Daten um die Gerade.
|
the data were generated and the maximum-likelihood fit.
|
||||||
|
\begin{solution}
|
||||||
|
\lstinputlisting{mlepropfit.m}
|
||||||
|
\includegraphics[width=1\textwidth]{mlepropfit}
|
||||||
|
\end{solution}
|
||||||
|
|
||||||
|
\part \label{mleslopecomp} Vary the number of data pairs, the slope,
|
||||||
|
as well as the variance of the data points around the true
|
||||||
|
line. Under which conditions is the maximum-likelihood estimation of
|
||||||
|
the slope closer to the true slope?
|
||||||
|
|
||||||
|
\part To answer \pref{mleslopecomp} more precisely, generate for
|
||||||
|
each condition, say, 1000 data sets and plot a histogram of the
|
||||||
|
estimated slopes. How does the histogram, its mean and standard
|
||||||
|
deviation relate to the true slope?
|
||||||
\end{parts}
|
\end{parts}
|
||||||
\begin{solution}
|
\begin{solution}
|
||||||
\lstinputlisting{mleslope.m}
|
\lstinputlisting{mlepropest.m}
|
||||||
\lstinputlisting{mlepropfit.m}
|
\includegraphics[width=1\textwidth]{mlepropest}
|
||||||
\includegraphics[width=1\textwidth]{mlepropfit}
|
|
||||||
\end{solution}
|
\end{solution}
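As a short sketch of the simplification asked for in \pref{mleslopefunc}
(not taken from the lecture notes): for $\sigma_i=\sigma$ the common
factor $1/\sigma^2$ cancels in the numerator and the denominator,
\[\theta = \frac{\frac{1}{\sigma^2}\sum_{i=1}^n x_i y_i}{\frac{1}{\sigma^2}\sum_{i=1}^n x_i^2}
  = \frac{\sum_{i=1}^n x_i y_i}{\sum_{i=1}^n x_i^2} \]
so the estimate no longer depends on $\sigma$. Assuming in addition
that the data are generated as $y_i = \theta_0 x_i + \varepsilon_i$ with
independent, normally distributed noise $\varepsilon_i$ of standard
deviation $\sigma$ and fixed $x_i$ (the symbols $\theta_0$ and
$\varepsilon_i$ are notation introduced here for illustration), the
estimate scatters around the true slope with
\[\mathrm{E}[\theta] = \theta_0 \; , \qquad
  \mathrm{std}[\theta] = \frac{\sigma}{\sqrt{\sum_{i=1}^n x_i^2}} \]
That is, the histogram of estimated slopes should be centered on the
true slope and get narrower for smaller $\sigma$ and larger $n$.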
|
||||||
|
|
||||||
|
\continue
|
||||||
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
|
||||||
\question \qt{Maximum-Likelihood-Sch\"atzer einer Wahrscheinlichkeitsdichtefunktion}
|
\question \qt{Maximum-likelihood estimation of a probability-density function}
|
||||||
Verschiedene Wahrscheinlichkeitsdichtefunktionen haben Parameter, die
|
Many probability-density functions have parameters that cannot be
|
||||||
nicht so einfach wie der Mittelwert und die Standardabweichung einer
|
computed directly from the data as easily as, for example, the mean of
|
||||||
Normalverteilung direkt aus den Daten berechnet werden k\"onnen. Solche Parameter
|
normally distributed data. Such parameters need to be estimated from
|
||||||
m\"ussen dann aus den Daten mit der Maximum-Likelihood-Methode gefittet werden.
|
the data by means of the maximum-likelihood method.
|
||||||
|
|
||||||
Um dies zu veranschaulichen ziehen wir uns diesmal nicht normalverteilte Zufallszahlen, sondern Zufallszahlen aus der Gamma-Verteilung.
|
Let us demonstrate this approach by means of data that are drawn from a
|
||||||
|
gamma distribution.
|
||||||
\begin{parts}
|
\begin{parts}
|
||||||
\part
|
\part Find out which \code{matlab} function computes the
|
||||||
Finde heraus welche \code{matlab} Funktion die
|
probability-density function of the gamma distribution.
|
||||||
Wahrscheinlichkeitsdichtefunktion (probability density function) der
|
|
||||||
Gamma-Verteilung berechnet.
|
\part \label{gammaplot} Use this function to plot the
|
||||||
|
probability-density function of the gamma distribution for various
|
||||||
\part
|
values of the (positive) ``shape'' parameter. We set the ``scale''
|
||||||
Plotte mit Hilfe dieser Funktion die Wahrscheinlichkeitsdichtefunktion
|
parameter to one.
|
||||||
der Gamma-Verteilung f\"ur verschiedene Werte des (positiven) ``shape'' Parameters.
|
|
||||||
Den ``scale'' Parameter setzen wir auf Eins.
|
\part Find out which \code{matlab} function generates random numbers
|
||||||
|
that are distributed according to a gamma distribution. Generate
|
||||||
\part
|
with this function 50 random numbers using one of the values of the
|
||||||
Finde heraus mit welcher Funktion Gammaverteilte Zufallszahlen in
|
``shape'' parameter used in \pref{gammaplot}.
|
||||||
\code{matlab} gezogen werden k\"onnen. Erzeuge mit dieser Funktion
|
|
||||||
50 Zufallszahlen mit einem der oben geplotteten ``shape'' Parameter.
|
\part Compute and plot a properly normalized histogram of these
|
||||||
|
random numbers.
|
||||||
\part
|
|
||||||
Berechne und plotte ein normiertes Histogramm dieser Zufallszahlen.
|
\part Find out which \code{matlab} function fits a distribution to a
|
||||||
|
vector of random numbers according to the maximum-likelihood method.
|
||||||
\part
|
How do you need to use this function in order to fit a gamma
|
||||||
Finde heraus mit welcher \code{matlab}-Funktion eine beliebige
|
distribution to the data?
|
||||||
Verteilung (``distribution'') an die Zufallszahlen nach der
|
|
||||||
Maximum-Likelihood Methode gefittet werden kann. Wie wird diese
|
\part Use this function to estimate the parameters of the gamma
|
||||||
Funktion benutzt, um die Gammaverteilung an die Daten zu fitten?
|
distribution used to generate the data.
|
||||||
|
|
||||||
\part
|
\part Finally, plot the fitted gamma distribution on top of the
|
||||||
Bestimme mit dieser Funktion die Parameter der Gammaverteilung aus
|
normalized histogram of the data.
|
||||||
den Zufallszahlen.
|
|
||||||
|
|
||||||
\part
|
|
||||||
Plotte anschlie{\ss}end die Gammaverteilung mit den gefitteten
|
|
||||||
Parametern.
|
|
||||||
\end{parts}
|
\end{parts}
|
||||||
\begin{solution}
|
\begin{solution}
|
||||||
\lstinputlisting{mlepdffit.m}
|
\lstinputlisting{mlepdffit.m}
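The listing \code{mlepdffit.m} itself is not shown in this diff. The steps of the exercise could be sketched roughly as follows. This is an illustration under assumptions, not the author's solution: it assumes the Statistics and Machine Learning Toolbox is available and uses its functions \code{gamrnd}, \code{gampdf}, and \code{mle}. The shape value of 2 is an arbitrary example.

```matlab
% Sketch only: requires the Statistics and Machine Learning Toolbox.
shape = 2.0;                      % assumed example value of the shape parameter
scale = 1.0;                      % scale parameter fixed to one, as in the exercise
x = gamrnd(shape, scale, 50, 1);  % 50 gamma-distributed random numbers

% maximum-likelihood fit; phat(1) is the shape, phat(2) the scale:
phat = mle(x, 'distribution', 'gamma');
fprintf('fitted shape=%.2f, fitted scale=%.2f\n', phat(1), phat(2));

% normalized histogram with the true and the fitted pdf on top:
histogram(x, 'Normalization', 'pdf');
hold on;
xx = 0.0:0.05:10.0;
plot(xx, gampdf(xx, shape, scale), 'LineWidth', 2);
plot(xx, gampdf(xx, phat(1), phat(2)), 'LineWidth', 2);
hold off;
xlabel('x');
ylabel('pdf');
legend('data', 'true pdf', 'fitted pdf');
```

Note that \code{mle} fits both the shape and the scale parameter here. With only 50 samples the fitted values can deviate noticeably from the true ones.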
|
||||||
|
25
likelihood/exercises/mlepropest.m
Normal file
@ -0,0 +1,25 @@
|
|||||||
|
m = 2.0; % slope
|
||||||
|
sigmas = [0.1, 1.0]; % standard deviations
|
||||||
|
ns = [100, 1000]; % number of data pairs
|
||||||
|
trials = 1000; % number of data sets
|
||||||
|
|
||||||
|
for i = 1:length(sigmas)
|
||||||
|
sigma = sigmas(i);
|
||||||
|
for j = 1:length(ns)
|
||||||
|
n = ns(j);
|
||||||
|
slopes = zeros(trials, 1);
|
||||||
|
for k=1:trials
|
||||||
|
% data pairs:
|
||||||
|
x = 5.0*rand(n, 1);
|
||||||
|
y = m*x + sigma*randn(n, 1);
|
||||||
|
% fit:
|
||||||
|
slopes(k) = mleslope(x, y);
|
||||||
|
end
|
||||||
|
subplot(2, 2, 2*(i-1)+j);
|
||||||
|
bins = [1.9:0.005:2.1];
|
||||||
|
hist(slopes, bins);
|
||||||
|
title(sprintf('sigma=%g, n=%d', sigma, n));
|
||||||
|
end
|
||||||
|
end
|
||||||
|
|
||||||
|
savefigpdf(gcf, 'mlepropest.pdf', 12, 7);
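The script above calls `mleslope(x, y)`, whose source is not part of this diff. A minimal sketch of what such a function might look like, assuming the same standard deviation for all data points so that it cancels out of the estimator (this is a guess at the interface, not the committed file):

```matlab
function slope = mleslope(x, y)
% MLESLOPE  Maximum-likelihood slope of a line through the origin.
%   slope = mleslope(x, y) takes two vectors of equal size holding the
%   data pairs. It assumes the same standard deviation for every data
%   point, in which case sigma drops out of the estimator.
    slope = sum(x.*y)/sum(x.*x);
end
```

For data lying exactly on a line, for example `mleslope([1; 2; 3], [2; 4; 6])`, the sums are 28 and 14 and the function returns 2.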
|
@ -1,30 +1,35 @@
|
|||||||
% draw random numbers:
|
|
||||||
n = 50;
|
|
||||||
mu = 3.0;
|
mu = 3.0;
|
||||||
sigma =2.0;
|
sigma =2.0;
|
||||||
x = randn(n,1)*sigma+mu;
|
ns = [50, 1000];
|
||||||
fprintf(' mean of the data is %.2f\n', mean(x))
|
for k = 1:length(ns)
|
||||||
fprintf('standard deviation of the data is %.2f\n', std(x))
|
n = ns(k);
|
||||||
|
% draw random numbers:
|
||||||
|
x = randn(n,1)*sigma+mu;
|
||||||
|
fprintf(' mean of the data is %.2f\n', mean(x))
|
||||||
|
fprintf('standard deviation of the data is %.2f\n', std(x))
|
||||||
|
|
||||||
% standard deviation as parameter:
|
% standard deviation as parameter:
|
||||||
psigs = 1.0:0.01:3.0;
|
psigs = 1.0:0.01:3.0;
|
||||||
% matrix with the probabilities for each x and psigs:
|
% matrix with the probabilities for each x and psigs:
|
||||||
lms = zeros(length(x), length(psigs));
|
lms = zeros(length(x), length(psigs));
|
||||||
for i=1:length(psigs)
|
for i=1:length(psigs)
|
||||||
psig = psigs(i);
|
psig = psigs(i);
|
||||||
p = exp(-0.5*((x-mu)/psig).^2.0)/sqrt(2.0*pi)/psig;
|
p = exp(-0.5*((x-mu)/psig).^2.0)/sqrt(2.0*pi)/psig;
|
||||||
lms(:,i) = p;
|
lms(:,i) = p;
|
||||||
end
|
end
|
||||||
lm = prod(lms, 1); % likelihood
|
lm = prod(lms, 1); % likelihood
|
||||||
loglm = sum(log(lms), 1); % log likelihood
|
loglm = sum(log(lms), 1); % log likelihood
|
||||||
|
|
||||||
% plot likelihood of standard deviation:
|
% plot likelihood of standard deviation:
|
||||||
subplot(1, 2, 1);
|
subplot(2, 2, 2*k-1);
|
||||||
plot(psigs, lm );
|
plot(psigs, lm );
|
||||||
xlabel('standard deviation')
|
title(sprintf('likelihood n=%d', n));
|
||||||
ylabel('likelihood')
|
xlabel('standard deviation')
|
||||||
subplot(1, 2, 2);
|
ylabel('likelihood')
|
||||||
plot(psigs, loglm);
|
subplot(2, 2, 2*k);
|
||||||
xlabel('standard deviation')
|
plot(psigs, loglm);
|
||||||
ylabel('log likelihood')
|
title(sprintf('log-likelihood n=%d', n));
|
||||||
|
xlabel('standard deviation')
|
||||||
|
ylabel('log likelihood')
|
||||||
|
end
|
||||||
savefigpdf(gcf, 'mlestd.pdf', 15, 5);
|
savefigpdf(gcf, 'mlestd.pdf', 15, 5);
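The question of what happens for n=1000 can also be checked in isolation. The following sketch is an illustration added here, not part of the commit. It evaluates the likelihood and the log-likelihood only at the true standard deviation, reusing `mu`, `sigma`, and `ns` with the values from the script above:

```matlab
% Sketch: likelihood versus log-likelihood for small and large n,
% evaluated at the true parameters only.
mu = 3.0;
sigma = 2.0;
ns = [50, 1000];
lm = zeros(size(ns));     % likelihoods
loglm = zeros(size(ns));  % log-likelihoods
for k = 1:length(ns)
    x = randn(ns(k), 1)*sigma + mu;
    % probability density of each data value:
    p = exp(-0.5*((x-mu)/sigma).^2)/sqrt(2.0*pi)/sigma;
    lm(k) = prod(p);
    loglm(k) = sum(log(p));
    fprintf('n=%4d: likelihood=%g, log-likelihood=%g\n', ns(k), lm(k), loglm(k));
end
```

Each density value is at most 1/(sqrt(2*pi)*sigma), which is about 0.2 for sigma=2. A product of 1000 such factors is smaller than the smallest positive double-precision number and is rounded to zero, whereas the sum of the logarithms remains a finite negative number.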
|
||||||
|