[plotting] add exercise

commit 0d1f201bd8 (parent 2fbbf04e98)
@@ -2,7 +2,7 @@
 \usepackage[german]{babel}
 \usepackage{pslatex}
-\usepackage[mediumspace,mediumqspace,Gray]{SIunits} % \ohm, \micro
+\usepackage[mediumspace,mediumqspace]{SIunits} % \ohm, \micro
 \usepackage{xcolor}
 \usepackage{graphicx}
 \usepackage[breaklinks=true,bookmarks=true,bookmarksopen=true,pdfpagemode=UseNone,pdfstartview=FitH,colorlinks=true,citecolor=blue]{hyperref}
@@ -15,9 +15,9 @@
 \else
 \newcommand{\stitle}{}
 \fi
-\header{{\bfseries\large Exercise 12\stitle}}{{\bfseries\large Maximum likelihood}}{{\bfseries\large January 7th, 2019}}
-\firstpagefooter{Prof. Dr. Jan Benda}{Phone: 29 74573}{Email:
-jan.benda@uni-tuebingen.de}
+\header{{\bfseries\large Exercise 7\stitle}}{{\bfseries\large Plotting}}{{\bfseries\large November 20, 2019}}
+\firstpagefooter{Dr. Jan Grewe}{Phone: 29 74588}{Email:
+jan.grewe@uni-tuebingen.de}
 \runningfooter{}{\thepage}{}
 
 \setlength{\baselineskip}{15pt}
@@ -85,129 +85,49 @@ jan.benda@uni-tuebingen.de}
 
 \input{instructions}
 
 
 \begin{questions}
 
 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
-\question \qt{Maximum likelihood of the standard deviation}
-Let's compute the likelihood and the log-likelihood for the estimation
-of the standard deviation.
-\begin{parts}
-\part Draw $n=50$ random numbers from a normal distribution with
-mean $\mu=3$ and standard deviation $\sigma=2$.
-
-\part Plot the likelihood (computed as the product of probabilities)
-and the log-likelihood (sum of the logarithms of the probabilities)
-as a function of the standard deviation. Compare the position of the
-maxima with the standard deviation that you compute directly from
-the data.
-
-\part Increase $n$ to 1000. What happens to the likelihood, what
-happens to the log-likelihood? Why?
-\end{parts}
-\begin{solution}
-\lstinputlisting{mlestd.m}
-\includegraphics[width=1\textwidth]{mlestd}\\
-
-The more data the smaller the product of the probabilities ($\approx
-p^n$ with $0 \le p < 1$) and the smaller the sum of the logarithms
-of the probabilities ($\approx n\log p$, note that $\log p < 0$).
-
-The product eventually gets smaller than the precision of the
-floating point numbers support. Therefore for $n=1000$ the products
-becomes zero. Using the logarithm avoids this numerical problem.
-\end{solution}
+\question \qt{Graphical display of behavioral data.} In this task
+you will use the MATLAB plotting system to display different
+aspects of experimental data.
+
+In the accompanying zip file (``experiment.zip'') you find the
+results of a (hypothetical) behavioral experiment. Animals have
+been trained in a number of ``sessions'' in a two-alternative-forced
+choice task. The results of each session and each subject are stored
+in separate ``.mat'' files that are named according to the pattern
+\verb+Animal_{id}_Session_{id}.mat+. Each of these mat files contains
+three variables, \verb+performance, perf_std, trials, tasks+, which
+list the subject's average performance, the standard deviation, the
+number of completed trials, and the tasks, respectively.
+
+As a first step, familiarize yourself with the data structure: open a
+single file and understand what is stored.
+
+Solve the assignments below in separate functions. Create a script
+that controls the analysis. Try to use as little ``hardcode'' as
+possible.
+
+All plots must be properly labeled; use a font size of 10\,pt for
+axis labels and a font size of 8\,pt for tick labels and
+legends. Single-panel figures should have a paper size of 7.5 by
+7.5\,cm; multi-panel figures may use a width of 17.5\,cm and a height
+of 15\,cm. Save the figures in the pdf format.
+
+\begin{parts}
+\part{} Find out the ids of the used subjects (you will need this
+information later). Use the \verb+dir+ function to get the names of
+the mat files.
+
+\part{} Use the \verb+dir+ function to find out how many sessions
+have been recorded for each subject. Plot the results in a bar plot,
+adhere to the instructions above, and save the figure.
+
+\part{} Each animal has been tested in the same tasks. Create a
+figure that plots each subject's performance as a function of the
+session number. The figure is supposed to display the three tasks in
+three subplots. Within each subplot, plot the average performance and
+the performance error.
+
+\part{} Not every subject solved the same number of trials. Collect,
+for each subject and each task, the total number of trials and plot
+the data in the form of a stacked bar plot. That is, the figure shows
+a stacked bar for each subject.
+\end{parts}
+
+%\begin{solution}
+% \lstinputlisting{mlestd.m}
+% \includegraphics[width=1\textwidth]{mlestd}\\
+% \end{solution}
-
-%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
-\question \qt{Maximum-likelihood estimator of a line through the origin}
-In the lecture we derived the following equation for an
-maximum-likelihood estimate of the slope $\theta$ of a straight line
-through the origin fitted to $n$ pairs of data values $(x_i|y_i)$ with
-standard deviation $\sigma_i$:
-\[\theta = \frac{\sum_{i=1}^n \frac{x_i y_i}{\sigma_i^2}}{ \sum_{i=1}^n
-\frac{x_i^2}{\sigma_i^2}} \]
-\begin{parts}
-\part \label{mleslopefunc} Write a function that takes two vectors
-$x$ and $y$ containing the data pairs and returns the slope,
-computed according to this equation. For simplicity we assume
-$\sigma_i=\sigma$ for all $1 \le i \le n$. How does this simplify
-the equation for the slope?
-\begin{solution}
-\lstinputlisting{mleslope.m}
-\end{solution}
-
-\part Write a script that generates data pairs that scatter around a
-line through the origin with a given slope. Use the function from
-\pref{mleslopefunc} to compute the slope from the generated data.
-Compare the computed slope with the true slope that has been used to
-generate the data. Plot the data togehther with the line from which
-the data were generated and the maximum-likelihood fit.
-\begin{solution}
-\lstinputlisting{mlepropfit.m}
-\includegraphics[width=1\textwidth]{mlepropfit}
-\end{solution}
-
-\part \label{mleslopecomp} Vary the number of data pairs, the slope,
-as well as the variance of the data points around the true
-line. Under which conditions is the maximum-likelihood estimation of
-the slope closer to the true slope?
-
-\part To answer \pref{mleslopecomp} more precisely, generate for
-each condition let's say 1000 data sets and plot a histogram of the
-estimated slopes. How does the histogram, its mean and standard
-deviation relate to the true slope?
-\end{parts}
-\begin{solution}
-\lstinputlisting{mlepropest.m}
-\includegraphics[width=1\textwidth]{mlepropest}\\
-The estimated slopes are centered around the true slope. The
-standard deviation of the estimated slopes gets smaller for larger
-$n$ and less noise in the data.
-\end{solution}
-
-\continue
-%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
-\question \qt{Maximum-likelihood-estimation of a probability-density function}
-Many probability-density functions have parameters that cannot be
-computed directly from the data, like, for example, the mean of
-normally-distributed data. Such parameter need to be estimated by
-means of the maximum-likelihood from the data.
-
-Let us demonstrate this approach by means of data that are drawn from a
-gamma distribution,
-\begin{parts}
-\part Find out which \code{matlab} function computes the
-probability-density function of the gamma distribution.
-
-\part \label{gammaplot} Use this function to plot the
-probability-density function of the gamma distribution for various
-values of the (positive) ``shape'' parameter. Wet set the ``scale''
-parameter to one.
-
-\part Find out which \code{matlab} function generates random numbers
-that are distributed according to a gamma distribution. Generate
-with this function 50 random numbers using one of the values of the
-``shape'' parameter used in \pref{gammaplot}.
-
-\part Compute and plot a properly normalized histogram of these
-random numbers.
-
-\part Find out which \code{matlab} function fit a distribution to a
-vector of random numbers according to the maximum-likelihood method.
-How do you need to use this function in order to fit a gamma
-distribution to the data?
-
-\part Estimate with this function the parameter of the gamma
-distribution used to generate the data.
-
-\part Finally, plot the fitted gamma distribution on top of the
-normalized histogram of the data.
-\end{parts}
-\begin{solution}
-\lstinputlisting{mlepdffit.m}
-\includegraphics[width=1\textwidth]{mlepdffit}
-\end{solution}
-
 \end{questions}
 
 \end{document}
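Not part of the commit: the first two parts of the new exercise boil down to parsing the session file names. The exercise itself asks for MATLAB's `dir`; the sketch below only illustrates the parsing logic in Python (the language of the repository's generator script), with hypothetical example file names:

```python
import re
from collections import Counter

def sessions_per_subject(filenames):
    """Parse names of the form Animal_{id}_Session_{id}.mat and
    return a dict mapping subject id -> number of session files."""
    counts = Counter()
    for name in filenames:
        match = re.match(r"Animal_(\d+)_Session_(\d+)\.mat$", name)
        if match:
            counts[int(match.group(1))] += 1
    return dict(counts)

# hypothetical listing, as returned by e.g. os.listdir("experiment")
files = ["Animal_1_Session_1.mat", "Animal_1_Session_2.mat",
         "Animal_2_Session_1.mat"]
counts = sessions_per_subject(files)  # {1: 2, 2: 1}
```

The subject ids are the keys of the returned dict; the per-subject session counts feed directly into the bar plot of the second part.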
@@ -5,6 +5,11 @@
 Neuroethology lab \hfill --- \hfill Institute for Neurobiology \hfill --- \hfill \includegraphics[width=0.28\textwidth]{UT_WBMW_Black_RGB} \\
 \end{center}
 
+The exercises are meant for self-monitoring and revision of the
+lecture. You should try to solve them on your own. In contrast to
+previous exercises, the solutions cannot be saved in a single file.
+Combine the files into a single zip archive and submit it via ILIAS.
+Name the archive according to the pattern ``plotting\_\{surname\}.zip''.
 % \ifprintanswers%
 % \else
 
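One way to assemble the required archive (not part of the commit; a sketch using Python's standard library, where the folder containing the solution files is an assumption):

```python
import os
import zipfile

def make_submission(surname, folder="."):
    """Pack all .m and .pdf files from `folder` into an archive
    named according to the required pattern plotting_{surname}.zip."""
    archive = "plotting_%s.zip" % surname
    with zipfile.ZipFile(archive, "w") as z:
        for name in sorted(os.listdir(folder)):
            if name.endswith((".m", ".pdf")):
                z.write(os.path.join(folder, name), arcname=name)
    return archive
```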
@@ -1,8 +1,9 @@
 import numpy as np
-import matplotlib.pyplot as plt
 import scipy.io as scio
+import os
 from IPython import embed
 
 
 def boltzmann(x, y_max, slope, inflection):
     """
     The underlying Boltzmann function.
@@ -34,41 +35,51 @@ class Animal(object):
         self.__responsiveness = responsiveness
 
     def simulate(self, session_count=10, trials=20, task_difficulties=[]):
+        """
+        :param task_difficulties: gives a malus on the learning rate, range 0 - 1
+        """
         tasks = 1 if len(task_difficulties) == 0 else len(task_difficulties)
         if len(task_difficulties) == 0:
             task_difficulties = [0]
         avg_perf = np.zeros((session_count, tasks))
         err_perf = np.zeros((session_count, tasks))
-        trials_performed = np.zeros(session_count)
+        trials_performed = np.zeros((session_count, tasks))
         for i in range(session_count):
             for j in range(tasks):
-                base_performance = boltzmann(i, 1.0, self.__learning_rate/20, self.__delay)
-                penalty = base_performance * task_difficulties[j] * 0.5
-                base_perf = 50 + 50 * (base_performance - penalty)
+                learning_rate = self.__learning_rate * (1 - task_difficulties[j])
+                base_performance = boltzmann(i, 1.0, learning_rate, self.__delay) * 0.5 + 0.5
+                noise = np.random.randn(trials) * (self.__volatility * (1 - task_difficulties[j]))
+                performances = np.random.rand(trials) < (base_performance + noise)
                 trials_completed = np.random.rand(trials) < self.__responsiveness
-                performances = np.random.randn(trials) * self.__volatility * 100 + base_perf
-                avg_perf[i, j] = np.mean(performances[trials_completed])
-                err_perf[i, j] = np.std(performances[trials_completed])
-            trials_performed = np.sum(trials_completed)
+                trials_performed[i, j] = np.sum(trials_completed)
+                avg_perf[i, j] = np.sum(performances[trials_completed]) / trials_performed[i, j]
+                err_perf[i, j] = np.sqrt(trials_performed[i, j] * (avg_perf[i, j]/100) * (1 - avg_perf[i, j]))
         return avg_perf, err_perf, trials_performed
 
 
-if __name__ == "__main__":
-    session_count = 30
-    task_difficulties = [0, 0.3, 1.]
+def save_performance(avg_perf, err_perf, trials_completed, tasks, animal_id):
+    result_folder = "experiment"
+    for i in range(avg_perf.shape[0]):
+        performance = avg_perf[i, :]
+        error = err_perf[i, :]
+        trials = trials_completed[i, :]
+        scio.savemat(os.path.join(result_folder, "Animal_%i_Session_%i.mat" % (animal_id, i+1)),
+                     {"performance": performance, "perf_std": error, "trials": trials, "tasks": tasks})
+
+
+if __name__ == "__main__":
+    session_count = [25, 32, 40, 30]
+    task_difficulties = [0, 0.75, 0.95]
     delays = [5, 10, 12, 20]
-    learning_rates = np.array([5, 10, 2, 20])
+    learning_rates = np.array([0.25, 0.5, 1., 1.5])
-    volatilities = np.random.rand(4) * 0.5
+    volatilities = np.random.rand(4) * 0.25
-    responsivness = np.random.rand(4) * 0.5 + 0.5
+    responsivness = np.random.rand(4) * 0.25 + 0.75
 
     for i in range(len(delays)):
         d = delays[i]
         lr = learning_rates[i]
-        v = volatilities[i],
+        v = volatilities[i]
         r = responsivness[i]
         a = Animal(d, lr, v, r)
-        ap, ep, tp = a.simulate(session_count=session_count, task_difficulties=[0, 0.3, 0.6])
-        plt.plot(ap)
-        embed()
+        ap, ep, tp = a.simulate(session_count=session_count[i], task_difficulties=task_difficulties)
+        save_performance(ap, ep, tp, ['a', 'b', 'c'], i+1)
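The body of `boltzmann` lies outside the diff context, so its exact form is an assumption; the sketch below uses the common logistic form and shows how the revised `simulate` maps it onto a two-alternative task, where performance starts at chance level (0.5) and saturates near 1:

```python
import math

def boltzmann(x, y_max, slope, inflection):
    # assumed logistic form; the actual body is not shown in the diff
    return y_max / (1.0 + math.exp(-slope * (x - inflection)))

# simulate() squeezes the sigmoid into [0.5, 1]: chance level at the
# start of training, saturating performance after learning completes
base = [boltzmann(i, 1.0, 0.5, 10) * 0.5 + 0.5 for i in range(40)]
```

With this mapping, the task-difficulty malus on the learning rate flattens the sigmoid, so harder tasks approach saturation more slowly, which is what the generated session data reflect.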