\documentclass[a4paper,12pt,pdftex]{exam}

\newcommand{\ptitle}{Mutual information}

\input{../header.tex}

\firstpagefooter{Supervisor: Jan Benda}{phone: 29 74573}%
{email: jan.benda@uni-tuebingen.de}

\begin{document}

\input{../instructions.tex}

%%%%%%%%%%%%%% Questions %%%%%%%%%%%%%%%%%%%%%%%%%

The mutual information is a measure from information theory that is
used in neuroscience to quantify, for example, how much information a
spike train carries about a sensory stimulus. It quantifies the
dependence of an output $y$ (e.g. a spike train) on some input $x$
(e.g. a sensory stimulus).

The probability of each of the $n$ input values
$x \in \{x_1, x_2, \ldots, x_n\}$ is given by the corresponding
probability distribution $P(x)$. The entropy
\begin{equation}
  \label{entropy}
  H[x] = - \sum_{x} P(x) \log_2 P(x)
\end{equation}
is a measure of how surprising the values of $x$ are on average. For
example, if, of two possible values `1' and `2', the probability of
getting a `1' is close to one ($P(1) \approx 1$), then the
probability of getting a `2' is close to zero ($P(2) \approx 0$). In
this case the entropy, the surprise level, is almost zero, because
both $0 \log 0 = 0$ and $1 \log 1 = 0$: it is not surprising at all
that you almost always get a `1'. The entropy is largest for equally
likely outcomes of $x$. If getting a `1' or a `2' is equally likely,
then you will be most surprised by each new number you get, because
you cannot predict it.

Mutual information measures the information transmitted between an
input and an output. It is computed from the probability distribution
of the input, $P(x)$, the distribution of the output, $P(y)$, and
their joint distribution $P(x,y)$:
\begin{equation}
  \label{mi}
  I[x:y] = \sum_{x}\sum_{y} P(x,y) \log_2\frac{P(x,y)}{P(x)P(y)}
\end{equation}
where the sums go over all possible values of $x$ and $y$.

The mutual information can also be expressed in terms of entropies:
it is the entropy of the outputs $y$ reduced by the entropy of the
outputs given the input:
\begin{equation}
  \label{mientropy}
  I[x:y] = H[y] - H[y|x]
\end{equation}
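As an illustration of eqn.~\eqref{mi} with arbitrarily chosen
numbers, consider two equally likely input values that are
transmitted with some noise, such that the joint probabilities are
$P(1,1) = P(2,2) = 0.4$ and $P(1,2) = P(2,1) = 0.1$. Both marginal
distributions then are flat, $P(x) = P(y) = 0.5$ for each value, and
eqn.~\eqref{mi} evaluates to
\[ I[x:y] = 2 \cdot 0.4 \, \log_2\frac{0.4}{0.25}
          + 2 \cdot 0.1 \, \log_2\frac{0.1}{0.25}
          \approx 0.28 \,\mbox{bits} \; . \]
Because of the noise, each response conveys only a fraction of a bit
about the presented input.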
The following project is meant to explore the concept of mutual
information with the help of a simple example.

\begin{questions}

  \question A subject was presented two possible objects for a very
  brief time ($50$\,ms). The task of the subject was to report which
  of the two objects was shown. In {\tt decisions.mat} you find an
  array that stores which object was presented in each trial and
  which object was reported by the subject.
  \begin{parts}
    \part Plot the raw data (no sums or probabilities) appropriately.
    \part Compute and plot the probability distributions of the
    presented and the reported objects.
    \part Compute a 2-d histogram that shows how often each
    combination of presented and reported object came up.
    \part Normalize the histogram such that it sums to one (i.e. make
    it a probability distribution $P(x,y)$, where $x$ is the
    presented object and $y$ is the reported object).
    \part Use the computed probability distributions to compute the
    mutual information \eqref{mi} that the answers provide about the
    actually presented object.
    \part Use a permutation test to compute the $95\%$ confidence
    interval for the mutual information estimate in the dataset from
    {\tt decisions.mat}. Does the measured mutual information
    indicate significant information transmission?
  \end{parts}

  \question What is the maximally achievable mutual information?
  \begin{parts}
    \part Show this numerically by generating your own datasets that
    naturally should yield maximal information. Consider different
    distributions $P(x)$.
    \part Compare the maximal mutual information with the
    corresponding entropy \eqref{entropy}.
  \end{parts}

  \question What is the minimum possible mutual information? This is
  the mutual information between an input and an output that is
  independent of the input. How is the joint distribution $P(x,y)$
  related to the marginals $P(x)$ and $P(y)$ if $x$ and $y$ are
  independent? What is the value of the logarithm in
  eqn.~\eqref{mi} in this case? So what is the resulting value of the
  mutual information?

\end{questions}

Hint: You may encounter a problem when computing the mutual
information whenever $P(x,y)$ equals zero. For treating this special
case think about (plot it) what the limit of $x \log x$ is for $x$
approaching zero. Use this information to fix the computation of the
mutual information.
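For reference, one way to see this limit is l'H\^{o}pital's rule,
written here for the base-2 logarithm used in eqns.~\eqref{entropy}
and \eqref{mi}:
\[ \lim_{x \to 0^{+}} x \log_2 x
   = \lim_{x \to 0^{+}} \frac{\log_2 x}{1/x}
   = \lim_{x \to 0^{+}} \frac{1/(x \ln 2)}{-1/x^2}
   = \lim_{x \to 0^{+}} \frac{-x}{\ln 2} = 0 \; . \]
This is the same convention $0 \log 0 = 0$ that was already used in
the discussion of the entropy \eqref{entropy} above.

\end{document}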