[likelihood] 1st version of translation
commit 413ccf22b3, parent 87f52022c9
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Maximum Likelihood}

Let $p(x|\theta)$ (read as ``probability (density) of $x$ given
$\theta$'') be the probability (density) distribution of $x$ given the
parameters $\theta$. This could be the normal distribution
\begin{equation}
  \label{normpdfmean}
  p(x|\theta) = \frac{1}{\sqrt{2\pi \sigma^2}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}
\end{equation}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Example: the arithmetic mean}

Suppose that the measurements $x_1, x_2, \ldots x_n$ originate from a
normal distribution \eqnref{normpdfmean} and we consider the mean
$\mu=\theta$ as the only parameter. Which value of $\theta$ maximizes
its likelihood?

\begin{figure}[t]
  \includegraphics[width=1\textwidth]{mlemean}
  \titlecaption{\label{mlemeanfig} Maximum likelihood estimation of
    the mean.}{Top: The measured data (blue dots) together with three
    possible normal distributions with different means (arrows) from
    which the data could originate. Bottom left: the likelihood as a
    function of the parameter $\theta$, i.e., the mean. It is maximal
    at $\theta = 2$. Bottom right: the corresponding
    log-likelihood. Taking the logarithm does not change the position
    of the maximum (arrow).}
\end{figure}

The log-likelihood \eqnref{loglikelihood} is
\begin{eqnarray*}
  \log {\cal L}(\theta|x_1,x_2, \ldots x_n)
  & = & \sum_{i=1}^n \log \frac{1}{\sqrt{2\pi \sigma^2}}e^{-\frac{(x_i-\theta)^2}{2\sigma^2}} \\
  & = & \sum_{i=1}^n - \log \sqrt{2\pi \sigma^2} -\frac{(x_i-\theta)^2}{2\sigma^2} \; .
\end{eqnarray*}
% FIXME do we need parentheses around the normal distribution in line one?
Since the logarithm is the inverse function of the exponential
($\log(e^x)=x$), taking the logarithm removes the exponential from the
normal distribution.  To calculate the maximum of the log-likelihood,
we take the derivative with respect to $\theta$ and set it to zero:
\begin{eqnarray*}
  \frac{\text{d}}{\text{d}\theta} \log {\cal L}(\theta|x_1,x_2, \ldots x_n) & = & \sum_{i=1}^n - \frac{2(x_i-\theta)}{2\sigma^2} \;\; = \;\; 0 \\
  \Leftrightarrow \quad \sum_{i=1}^n x_i - \sum_{i=1}^n \theta & = & 0 \\
  \Leftrightarrow \quad n \theta & = & \sum_{i=1}^n x_i \\
  \Leftrightarrow \quad \theta & = & \frac{1}{n} \sum_{i=1}^n x_i \;\; = \;\; \bar x
\end{eqnarray*}
Thus, the maximum likelihood estimator of $\theta$ is the arithmetic
mean $\bar x$ of the data. That is, setting $\theta$ to the arithmetic
mean maximizes the likelihood that the data originate from a normal
distribution with that mean (\figref{mlemeanfig}).

\begin{exercise}{mlemean.m}{mlemean.out}
  Draw $n=50$ random numbers from a normal distribution with a mean
  $\ne 0$ and a standard deviation $\ne 1$.

  Plot the likelihood (the product of the probabilities) and the
  log-likelihood (the sum of the logarithms of the probabilities) as
  functions of the mean parameter. Compare the positions of the maxima
  with the arithmetic mean calculated from the data.
  \pagebreak[4]
\end{exercise}
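The exercise above is meant to be solved in MATLAB (\texttt{mlemean.m}). As a language-neutral illustration of the same computation, here is a minimal Python sketch; the sample size, mean, standard deviation, and grid are arbitrary choices:

```python
import math
import random

random.seed(2)

# Draw n = 50 samples from a normal distribution (mean 3, std 2).
n, mu, sigma = 50, 3.0, 2.0
x = [random.gauss(mu, sigma) for _ in range(n)]

def log_likelihood(theta):
    """Sum of the log probability densities for a candidate mean theta."""
    return sum(-math.log(math.sqrt(2.0 * math.pi * sigma**2))
               - (xi - theta)**2 / (2.0 * sigma**2) for xi in x)

# Evaluate the log-likelihood on a grid of candidate means and take the maximum.
thetas = [i * 0.001 for i in range(6000)]
best = max(thetas, key=log_likelihood)

mean = sum(x) / n
print(f"grid maximum: {best:.3f}, arithmetic mean: {mean:.3f}")
```

As derived above, the position of the maximum agrees with the arithmetic mean up to the grid resolution.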


\pagebreak[4]
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Curve fitting as maximum-likelihood estimation}

When fitting a curve, a function $f(x;\theta)$ is adapted to the data
pairs $(x_i|y_i)$ by adjusting the parameters $\theta$. If we assume
that the $y_i$ values are normally distributed around the function
values $f(x_i;\theta)$ with standard deviations $\sigma_i$, the
log-likelihood is
\begin{eqnarray*}
  \log {\cal L}(\theta|(x_1,y_1,\sigma_1), \ldots, (x_n,y_n,\sigma_n))
  & = & \sum_{i=1}^n \log \frac{1}{\sqrt{2\pi \sigma_i^2}}e^{-\frac{(y_i-f(x_i;\theta))^2}{2\sigma_i^2}} \\
  & = & \sum_{i=1}^n - \log \sqrt{2\pi \sigma_i^2} -\frac{(y_i-f(x_i;\theta))^2}{2\sigma_i^2}
\end{eqnarray*}
The only difference to the previous example is that the means of the
normal distributions are now given by the function values
$f(x_i;\theta)$.

The parameter $\theta$ should be chosen such that the log-likelihood
is maximal. The first term of the sum is independent of $\theta$ and
can be ignored when searching for the maximum:
\begin{eqnarray*}
  & = & - \frac{1}{2} \sum_{i=1}^n \left( \frac{y_i-f(x_i;\theta)}{\sigma_i} \right)^2
\end{eqnarray*}
Instead of searching for the maximum we can invert the sign and search
for the minimum. The factor $1/2$ can also be dropped, since it does
not affect the position of the minimum:
\begin{equation}
  \label{chisqmin}
  \theta_{mle} = \text{argmin}_{\theta} \; \sum_{i=1}^n \left( \frac{y_i-f(x_i;\theta)}{\sigma_i} \right)^2 \;\; = \;\; \text{argmin}_{\theta} \; \chi^2
\end{equation}
The sum of squared differences, normalized by the respective standard
deviations, is also called $\chi^2$. The value of $\theta$ that
minimizes the squared differences thus also maximizes the probability
that the data actually originate from the given
function. Minimizing $\chi^2$ is therefore a maximum likelihood
estimation.

This derivation also shows that minimizing the squared differences is
a maximum-likelihood estimation only if the data are normally
distributed around the function. For other distributions, the
log-likelihood \eqnref{loglikelihood} needs to be adapted accordingly
and maximized.

\begin{figure}[t]
  \includegraphics[width=1\textwidth]{mlepropline}
  \titlecaption{\label{mleproplinefig} Maximum likelihood estimation
    of the slope of a line through the origin.}{}
\end{figure}


\subsection{Example: simple proportionality}
As the function we take a line through the origin,
\[ f(x) = \theta x \; , \]
with slope $\theta$. The $\chi^2$-sum then reads
\[ \chi^2 = \sum_{i=1}^n \left( \frac{y_i-\theta x_i}{\sigma_i} \right)^2 \; . \]
To find the minimum we again take the first derivative with respect
to $\theta$ and set it to zero:
\begin{eqnarray}
  \frac{\text{d}}{\text{d}\theta}\chi^2 & = & \frac{\text{d}}{\text{d}\theta} \sum_{i=1}^n \left( \frac{y_i-\theta x_i}{\sigma_i} \right)^2 \nonumber \\
  & = & \sum_{i=1}^n \frac{\text{d}}{\text{d}\theta} \left( \frac{y_i-\theta x_i}{\sigma_i} \right)^2 \nonumber \\
  & = & \sum_{i=1}^n - \frac{2 x_i (y_i-\theta x_i)}{\sigma_i^2} \;\; = \;\; 0 \nonumber \\
  \Leftrightarrow \quad  \theta \sum_{i=1}^n \frac{x_i^2}{\sigma_i^2} & = & \sum_{i=1}^n \frac{x_iy_i}{\sigma_i^2} \nonumber \\
  \Leftrightarrow \quad  \theta & = & \frac{\sum_{i=1}^n \frac{x_iy_i}{\sigma_i^2}}{ \sum_{i=1}^n \frac{x_i^2}{\sigma_i^2}} \label{mleslope}
\end{eqnarray}
With this we obtain an analytical expression for the estimate of the
slope $\theta$ of the regression line (\figref{mleproplinefig}).

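To check \eqnref{mleslope} numerically, here is a small Python sketch on synthetic data (the true slope of 2, the noise level, and the x-values are arbitrary choices; the grid search serves only as a cross-check):

```python
import random

random.seed(1)

# Synthetic data: y = 2 x plus normal noise with standard deviation sigma.
true_slope, sigma = 2.0, 0.5
xs = [0.1 * i for i in range(1, 41)]
ys = [true_slope * x + random.gauss(0.0, sigma) for x in xs]

# Analytical maximum-likelihood slope:
# theta = sum(x*y/sigma^2) / sum(x^2/sigma^2).
theta = (sum(x * y / sigma**2 for x, y in zip(xs, ys))
         / sum(x * x / sigma**2 for x in xs))

# Cross-check against a brute-force minimization of chi^2 on a fine grid.
def chisq(t):
    return sum(((y - t * x) / sigma)**2 for x, y in zip(xs, ys))

grid = [i * 0.0001 for i in range(15000, 25000)]  # slopes 1.5 .. 2.5
theta_grid = min(grid, key=chisq)

print(f"analytical: {theta:.4f}, grid search: {theta_grid:.4f}")
```

Both approaches agree to within the grid resolution, and both recover the true slope up to the noise-induced uncertainty.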

A gradient descent, as used in the previous chapter, is thus not
needed for fitting the slope of a straight line. This holds more
generally for fitting the coefficients of linearly combined basis
functions, as for example the slope $m$ and the y-intercept $b$ of
the linear equation
\[ y = m \cdot x + b \]
or, more generally, the coefficients $a_k$ of a polynomial
\[ y = \sum_{k=0}^N a_k x^k = a_0 + a_1x + a_2x^2 + a_3x^3 + \ldots \]
\matlabfun{polyfit()}.
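As a sketch of why no iterative optimization is needed here, the following Python snippet computes $m$ and $b$ analytically from the normal equations of the least-squares problem (the small data set is made up for illustration, and equal standard deviations are assumed so that the $\sigma_i$ drop out):

```python
# Fitting m and b of y = m*x + b analytically via the normal equations,
# a special case of linearly combined basis functions (here: x and 1).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 7.1, 8.8]  # roughly y = 2x + 1

n = len(xs)
sx = sum(xs)
sy = sum(ys)
sxx = sum(x * x for x in xs)
sxy = sum(x * y for x, y in zip(xs, ys))

# Solve the 2x2 linear system  [sxx sx; sx n] [m; b] = [sxy; sy].
det = sxx * n - sx * sx
m = (sxy * n - sx * sy) / det
b = (sxx * sy - sx * sxy) / det
print(f"m = {m:.3f}, b = {b:.3f}")
```

The same closed-form solution generalizes to polynomials of any order, which is what \matlabfun{polyfit()} computes.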

In contrast, parameters that enter a function non-linearly cannot be
computed analytically from the data. Consider, for example, the rate
$\lambda$ of the exponential decay
\[ y = c \cdot e^{\lambda x} \quad , \quad c, \lambda \in \reZ \; . \]
In such cases we have to resort to numerical methods for optimizing
the cost function, such as the gradient descent
\matlabfun{lsqcurvefit()}.

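A minimal Python sketch of such a numerical fit; for simplicity it assumes $c$ is known and replaces gradient descent with a plain grid search over $\lambda$ (the true rate, the noise level, and the grid are arbitrary choices):

```python
import math
import random

random.seed(0)

# Synthetic decay y = c * exp(lambda * x) with c = 2, lambda = -0.5, plus noise.
c, lam = 2.0, -0.5
xs = [0.2 * i for i in range(30)]
ys = [c * math.exp(lam * x) + random.gauss(0.0, 0.02) for x in xs]

def cost(l):
    """Sum of squared residuals for a candidate rate l, with c assumed known."""
    return sum((y - c * math.exp(l * x))**2 for x, y in zip(xs, ys))

# Numerical optimization: evaluate the cost on a grid of candidate rates.
candidates = [-1.0 + 0.001 * i for i in range(1000)]
lam_fit = min(candidates, key=cost)
print(f"fitted rate: {lam_fit:.3f} (true rate: {lam})")
```

A real implementation would use a gradient-based optimizer (as in \matlabfun{lsqcurvefit()}) instead of a grid, but the principle of minimizing the cost function numerically is the same.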


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Fitting probability distributions}
Finally we consider the case in which we want to fit the parameters
of a probability density function (e.g. the shape parameter of a
\enterm{Gamma-distribution}) to a dataset.

A first guess could be to fit the probability density function by
minimizing the squared difference to a normalized histogram of the
measured data. For several reasons this is not the method of choice:
(i) Probability densities can only be positive, so that, in
particular for small values, the data cannot scatter symmetrically
around the true density as normally distributed data would. (ii) The
values of a histogram are not independent, because the normalized
histogram integrates to one. The two basic assumptions of normally
distributed and independent samples, which make the minimization of
the squared difference \eqnref{chisqmin} a maximum likelihood
estimation, are thus violated. (iii) The histogram strongly depends
on the chosen bin width (\figref{mlepdffig}).

\begin{figure}[t]
  \includegraphics[width=1\textwidth]{mlepdf}
  \titlecaption{\label{mlepdffig} Maximum likelihood estimation of a
    probability density.}{Left: the 100 data points drawn from a
    2nd-order Gamma-distribution. The maximum likelihood fit of the
    probability density function is shown in orange, the true pdf in
    red. Right: the normalized histogram of the data together with
    the true (red) and the fitted (orange) probability density
    functions. The fit was done by minimizing the squared difference
    to the histogram.}
\end{figure}

We have already seen the direct way of fitting a probability density
to data in the example above, where we estimated the mean of a normal
distribution: maximum likelihood! We simply search for the parameters
$\theta$ of the desired probability density function that maximize
the log-likelihood \eqnref{loglikelihood}. In general this is a
non-linear optimization problem that is solved with numerical methods
such as the gradient descent \matlabfun{mle()}.

\begin{exercise}{mlegammafit.m}{mlegammafit.out}
  Generate a sample of gamma-distributed random numbers and use
  maximum likelihood to estimate the parameters of the gamma
  distribution from the data.
  \pagebreak
\end{exercise}
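As an illustration (not the MATLAB solution of the exercise), a Python sketch that fits the shape parameter of a gamma distribution by maximizing the log-likelihood over a grid, with the scale parameter fixed at one for simplicity:

```python
import math
import random

random.seed(4)

# Draw gamma-distributed random numbers (shape 2, scale 1).
shape = 2.0
data = [random.gammavariate(shape, 1.0) for _ in range(500)]

def log_likelihood(k):
    """Log-likelihood of shape k, scale 1:
    log pdf(x) = (k-1) log x - x - log Gamma(k)."""
    return sum((k - 1.0) * math.log(x) - x - math.lgamma(k) for x in data)

# Maximize the log-likelihood over a grid of candidate shape parameters.
ks = [0.5 + 0.01 * i for i in range(400)]
k_mle = max(ks, key=log_likelihood)
print(f"estimated shape: {k_mle:.2f} (true shape: {shape})")
```

A production fit would optimize shape and scale jointly with a numerical optimizer, as \matlabfun{mle()} does; the grid search here only illustrates the principle.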


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\section{Neural coding}
In sensory systems certain aspects of the environment are encoded in
the activity of populations of neurons. One example of such a
population code is the tuning of neurons in the primary visual cortex
(V1) to the orientation of a visual stimulus: different neurons
respond best to different stimulus orientations. Traditionally, such
a tuning is measured by analyzing the neuronal response strength
(e.g. the firing rate) as a function of the stimulus orientation and
is summarized by the so-called \enterm{tuning curve} (German
\determ{Abstimmkurve}; figure~\ref{mlecodingfig}, top).

\begin{figure}[tp]
  \includegraphics[width=1\textwidth]{mlecoding}
  \titlecaption{\label{mlecodingfig} Maximum likelihood estimation of
    a stimulus parameter from neuronal activity.}{Top: Tuning curve
    of an individual neuron as a function of the stimulus orientation
    (a dark bar in front of a white background). The stimulus that
    evokes the strongest activity in this neuron is a bar with
    vertical orientation (arrow, $\phi_i=90$\,\degree). The red area
    indicates the variability $p(r)$ of the neuronal activity $r$
    around the tuning curve. Center: In a population of neurons, each
    neuron may have a different preferred orientation (colors). A
    specific stimulus (the vertical bar) activates the individual
    neurons of the population in a specific way (dots). Bottom: The
    log-likelihood of this activity pattern is maximized close to the
    true stimulus orientation.}
\end{figure}

The brain, however, is confronted with the inverse problem: given a
certain activity pattern in the neuronal population, what was the
stimulus (the orientation of the bar)? In the sense of maximum
likelihood, a possible answer is: it was the stimulus for which the
observed activity pattern is most likely.

Let's stay with the example of the orientation-tuned cells in V1. The
tuning $\Omega_i(\phi)$ of neuron $i$ to its preferred orientation
$\phi_i$ can be well described by a van Mises function (the analog of
the Gaussian function on a cyclic x-axis) (\figref{mlecodingfig}):
\[ \Omega_i(\phi) = c \cdot e^{\cos(2(\phi-\phi_i))} \quad , \quad c
\in \reZ \]
We approximate the neuronal activity by a normal distribution around
the tuning curve with a standard deviation $\sigma=\Omega/4$
proportional to $\Omega$, such that the probability $p_i(r|\phi)$ of
the $i$-th neuron having the activity $r$, given a stimulus of
orientation $\phi$, is
\[ p_i(r|\phi) = \frac{1}{\sqrt{2\pi}\Omega_i(\phi)/4} e^{-\frac{1}{2}\left(\frac{r-\Omega_i(\phi)}{\Omega_i(\phi)/4}\right)^2} \; . \]
The log-likelihood of the stimulus orientation $\phi$ given the
activity pattern $r_1$, $r_2$, \ldots $r_n$ of the population is thus
\[ \log {\cal L}(\phi|r_1, r_2, \ldots r_n) = \sum_{i=1}^n \log p_i(r_i|\phi) \]
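A Python sketch of this decoding scheme under the stated model; the number of neurons, their preferred orientations, the constant $c$, and the true stimulus orientation are arbitrary choices:

```python
import math
import random

random.seed(7)

# Preferred orientations of 12 model neurons, evenly spaced over 0..180 deg.
prefs = [i * 15.0 for i in range(12)]

def tuning(phi, phi_i, c=10.0):
    """Van Mises tuning curve Omega_i(phi) = c * exp(cos(2*(phi - phi_i)))."""
    return c * math.exp(math.cos(2.0 * math.radians(phi - phi_i)))

# Simulate noisy population activity for a true orientation of 60 deg,
# with normal noise of standard deviation Omega/4 around the tuning curve.
phi_true = 60.0
rates = [random.gauss(tuning(phi_true, p), tuning(phi_true, p) / 4.0)
         for p in prefs]

def log_likelihood(phi):
    """Sum over neurons of the log normal density around the tuning curve."""
    ll = 0.0
    for r, p in zip(rates, prefs):
        omega = tuning(phi, p)
        sigma = omega / 4.0
        ll += (-math.log(math.sqrt(2.0 * math.pi) * sigma)
               - 0.5 * ((r - omega) / sigma)**2)
    return ll

# Decode: the orientation that maximizes the log-likelihood.
grid = [i * 0.5 for i in range(360)]  # 0 .. 179.5 deg
phi_mle = max(grid, key=log_likelihood)
print(f"decoded orientation: {phi_mle:.1f} deg (true: {phi_true} deg)")
```

The decoded orientation lands close to the true stimulus orientation, as sketched in the bottom panel of \figref{mlecodingfig}.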

\selectlanguage{english}