\documentclass[a4paper,12pt,pdftex]{exam}

\newcommand{\ptitle}{Mutual information}

\input{../header.tex}

\firstpagefooter{Supervisor: Jan Benda}{phone: 29 74573}%
{email: jan.benda@uni-tuebingen.de}

\begin{document}

\input{../instructions.tex}

%%%%%%%%%%%%%% Questions %%%%%%%%%%%%%%%%%%%%%%%%%

The mutual information is a measure from information theory that is
used in neuroscience to quantify, for example, how much information a
spike train carries about a sensory stimulus. It quantifies the
dependence of an output $y$ (e.g. a spike train) on some input $x$
(e.g. a sensory stimulus).

The probability of each of the $n$ input values
$x \in \{x_1, x_2, \ldots, x_n\}$ is given by the corresponding
probability distribution $P(x)$. The entropy
\begin{equation}
  \label{entropy}
  H[x] = - \sum_{x} P(x) \log_2 P(x)
\end{equation}
is a measure of how surprising the values of $x$ are on average. For
example, if, of two possible values `1' and `2', the probability of
getting a `1' is close to one ($P(1) \approx 1$), then the
probability of getting a `2' is close to zero ($P(2) \approx 0$). In
this case the entropy, the surprise level, is almost zero, because
both $0 \log 0 = 0$ and $1 \log 1 = 0$: it is not surprising at all
that you almost always get a `1'. The entropy is largest for equally
likely outcomes of $x$. If getting a `1' or a `2' is equally likely,
then you will be most surprised by each new number you get, because
you cannot predict it.

Mutual information measures the information transmitted between an
input and an output. It is computed from the probability distribution
of the input, $P(x)$, the distribution of the output, $P(y)$, and
their joint distribution $P(x,y)$:
\begin{equation}
  \label{mi}
  I[x:y] = \sum_{x}\sum_{y} P(x,y) \log_2\frac{P(x,y)}{P(x)P(y)}
\end{equation}
where the sums go over all possible values of $x$ and $y$.

The mutual information can also be expressed in terms of entropies:
it is the entropy of the outputs $y$ reduced by the entropy of the
outputs given the input:
\begin{equation}
  \label{mientropy}
  I[x:y] = H[y] - H[y|x]
\end{equation}
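As an illustration of eqn.~\eqref{mi} with arbitrarily chosen
numbers, consider two equally likely input values that are
transmitted with some noise, such that the joint probabilities are
$P(1,1) = P(2,2) = 0.4$ and $P(1,2) = P(2,1) = 0.1$. Both marginal
distributions then are flat, $P(x) = P(y) = 0.5$ for each value, and
eqn.~\eqref{mi} evaluates to
\[ I[x:y] = 2 \cdot 0.4 \, \log_2\frac{0.4}{0.25}
          + 2 \cdot 0.1 \, \log_2\frac{0.1}{0.25}
          \approx 0.28 \,\mbox{bits} \; . \]
Because of the noise, each response conveys only a fraction of a bit
about the presented input.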
The following project is meant to explore the concept of mutual
information with the help of a simple example.

\begin{questions}

  \question A subject was presented two possible objects for a very
  brief time ($50$\,ms). The task of the subject was to report which
  of the two objects was shown. In {\tt decisions.mat} you find an
  array that stores which object was presented in each trial and
  which object was reported by the subject.
  \begin{parts}
    \part Plot the raw data (no sums or probabilities) appropriately.
    \part Compute and plot the probability distributions of the
    presented and the reported objects.
    \part Compute a 2-d histogram that shows how often each
    combination of presented and reported object came up.
    \part Normalize the histogram such that it sums to one (i.e. make
    it a probability distribution $P(x,y)$, where $x$ is the
    presented object and $y$ is the reported object).
    \part Use the computed probability distributions to compute the
    mutual information \eqref{mi} that the answers provide about the
    actually presented object.
    \part Use a permutation test to compute the $95\%$ confidence
    interval for the mutual information estimate in the dataset from
    {\tt decisions.mat}. Does the measured mutual information
    indicate significant information transmission?
  \end{parts}

  \question What is the maximally achievable mutual information?
  \begin{parts}
    \part Show this numerically by generating your own datasets that
    naturally should yield maximal information. Consider different
    distributions $P(x)$.
    \part Compare the maximal mutual information with the
    corresponding entropy \eqref{entropy}.
  \end{parts}

  \question What is the minimum possible mutual information? This is
  the mutual information between an input and an output that is
  independent of the input. How is the joint distribution $P(x,y)$
  related to the marginals $P(x)$ and $P(y)$ if $x$ and $y$ are
  independent? What is the value of the logarithm in
  eqn.~\eqref{mi} in this case? So what is the resulting value of the
  mutual information?

\end{questions}

Hint: You may encounter a problem when computing the mutual
information whenever $P(x,y)$ equals zero. For treating this special
case think about (plot it) what the limit of $x \log x$ is for $x$
approaching zero. Use this information to fix the computation of the
mutual information.
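For reference, one way to see this limit is l'H\^{o}pital's rule,
written here for the base-2 logarithm used in eqns.~\eqref{entropy}
and \eqref{mi}:
\[ \lim_{x \to 0^{+}} x \log_2 x
   = \lim_{x \to 0^{+}} \frac{\log_2 x}{1/x}
   = \lim_{x \to 0^{+}} \frac{1/(x \ln 2)}{-1/x^2}
   = \lim_{x \to 0^{+}} \frac{-x}{\ln 2} = 0 \; . \]
This is the same convention $0 \log 0 = 0$ that was already used in
the discussion of the entropy \eqref{entropy} above.

\end{document}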