diff --git a/main.pdf b/main.pdf index 378c086..0bdda47 100644 Binary files a/main.pdf and b/main.pdf differ diff --git a/main.tex b/main.tex index 667fd18..92d6ea9 100644 --- a/main.tex +++ b/main.tex @@ -105,6 +105,7 @@ \newcommand{\pc}{p(c,\,T)} % Probability density (general interval) \newcommand{\pclp}{p(c,\,\tlp)} % Probability density (lowpass interval) \newcommand{\pci}{p(c_i,\,\tlp)} % Kernel-specific probability density (lowpass interval) +\newcommand{\tstat}{T_{\text{total}}} % Time interval where c(t) is stationary \newcommand{\muf}{\mu_{f_i}} % Average feature value \section{Introduction} @@ -827,11 +828,10 @@ between the resulting $\env(t)$, $\db(t)$, and $\adapt(t)$. It is necessary to use $\filt(t)$ as input for this analysis instead of $\env(t)$, because $\env(t)$ results from a nonlinear transformation and hence cannot be synthesized as an additive mixture of song component $\soc(t)$ and noise -component $\noc(t)$. % <-- Sentence may be methods section material. -However, it is much easier to conceive a mathematical description of the -effects of logarithmic compression and adaptation if $\env(t)$ itself is -assumed to be composed of $\soc(t)$ and $\noc(t)$. In the noiseless -case~(Fig.\,\ref{fig:log-hp}a), $\env(t)$ takes the form of +component $\noc(t)$. However, it is much easier to conceive a mathematical +description of the effects of logarithmic compression and adaptation if +$\env(t)$ itself is assumed to be composed of $\soc(t)$ and $\noc(t)$. In the +noiseless case~(Fig.\,\ref{fig:log-hp}a), $\env(t)$ takes the form of \begin{equation} \env(t)\,=\,\sca\,\cdot\,\soc(t), \qquad \env(t)\,>\,0\enspace\forall\enspace t\,\in\,\mathbb{R} \label{eq:toy_env_pure} @@ -1000,35 +1000,44 @@ corresponding $\tlp$: f(t)\,\approx\,\int_{\Theta}^{+\infty} \pclp\,dc\,=\,P(c\,>\,\Theta,\,\tlp) \label{eq:feat_prop} \end{equation} +% Little bit of patch-work here... 
+% 1) Interpretation of the feature value: In a sense, $f(t)$ can be interpreted as some sort of duty cycle of $c(t)$ with respect to $\Theta$. For example, a feature value of $f(t)=0.4$ means that $c(t)$ exceeds $\Theta$ for approximately 40\,\% of the time within $\tlp$. In the most extreme cases, $\Theta$ lies either above the maximum of $c(t)$ or below the minimum of $c(t)$, which results in a minimum or maximum possible feature value of $f(t)=0$~(Fig.\,\ref{fig:thresh-lp_single}d, left column) or -$f(t)=1$, respectively. Furthermore, if $c(t)$ is stationary --- so that its -statistics do not change substantially over time --- and if $\tlp$ is much -longer than the relevant time scales of $c(t)$, then $\pclp$ is largely -independent of $t$. In this case, $f(t)$ is approximately constant across -$t$~(Fig.\,\ref{fig:stages_feat}c). +$f(t)=1$, respectively. -Importantly, $f(t)$ neither retains information about the timing of individual -threshold crossings nor the precise values of $c(t)$ apart from their relation -to $\Theta$. Different $\sca$ can hence result in similar feature values by -producing similar $T_1$ segments. The most reliable way of exploiting this -invariant property of $f(t)$ is to set $\Theta$ to a value near 0, because -these values are least affected by different scales of $c(t)$. For sufficiently -large $\sca$, $f(t)$ then approaches the same constant $\mu_f$ in both the -noiseless and the noisy case~(Fig.\,\ref{fig:thresh-lp_single}e, saturation -regime). +% 2) Constant feature values across t: +If the time $T_1$ where $c(t)>\Theta$ within $\tlp$ is approximately constant +across $t$ for some time interval $\tstat>\tlp$, then $f(t)$ is approximately +constant across $t\in\tstat$ as well~(Fig.\,\ref{fig:stages_feat}c). This is +fulfilled if $c(t)$ is stationary in the sense that its distribution $\pclp$ +does not change substantially within $\tstat$, which requires that $\tlp$ is +much longer than the relevant time scales of $c(t)$. 
However, stationarity of +$c(t)$ is not a necessary condition for $f(t)$ to be constant because $f(t)$ +depends only on the total $T_1$ --- irrespective of the timing of individual +threshold crossings --- and different $\pclp$ can, in principle, still result +in similar $T_1$. -The saturation level of $f(t)$ is independent of the precise value of $\Theta$, -but the saturation point decreases with -$\Theta$~(Fig.\,\ref{fig:thresh-lp_single}e). Therefore, a threshold value of -$\Theta=0$ would be the optimal choice for achieving intensity invariance at -the lowest possible $\sca$. In stark contrast, the closer $\Theta$ is to 0, the -higher $\mu_f$ in response to the pure noise component $\noc(t)$ and the lower -the resulting SNR of $f(t)$ between noise regime and saturation +% 3) Constant feature values across alpha: +Similarly, $f(t)$ retains no information about the precise values of $c(t)$ +apart from their relation to $\Theta$. Different scales $\sca$ can hence result +in similar values of $f(t)$ as long as $T_1$ remains similar across $\sca$. The +most reliable way of exploiting this invariant property of $f(t)$ is to set +$\Theta$ to a value near 0, because these values are least affected by +different scales of $c(t)$. For sufficiently large $\sca$, $f(t)$ then +approaches the same constant $\mu_f$ in both the noiseless and the noisy +case~(Fig.\,\ref{fig:thresh-lp_single}e, saturation regime). The saturation +level of $f(t)$ is independent of the precise value of $\Theta$, but the +saturation point decreases with $\Theta$~(Fig.\,\ref{fig:thresh-lp_single}e). +Therefore, a threshold value of $\Theta=0$ would be the optimal choice for +achieving intensity invariance at the lowest possible $\sca$. 
In stark +contrast, the closer $\Theta$ is to 0, the higher $\mu_f$ in response to the +pure noise component $\noc(t)$ and the lower the resulting SNR of $f(t)$ +between noise regime and saturation regime~(Fig.\,\ref{fig:thresh-lp_single}b-d, left column, and Fig.\,\ref{fig:thresh-lp_single}e). This trade-off between intensity invariance and SNR has already been observed during the previous analysis on logarithmic @@ -1581,89 +1590,30 @@ constitutes the basis for song recognition. The songs of different species are represented by specific combinations of feature values, which should be as constant as possible for the duration of a song to facilitate recognition. The fundamental requirement for a constant feature $f_i(t)$ is that the time where -kernel response $c_i(t)$ exceeds the threshold value $\thr$ within averaging -interval $\tlp$ is the same for all time points $t$. This is fulfilled if -$c_i(t)$ is stationary within a certain time window and $\tlp$ is much longer -than the relevant time scales of $c_i(t)$, so that the distribution $\pci$ of -$c_i(t)$ and hence the value of $f_i(t)$ remain stable across $t$. - -% Practical cases that allow for approximately constant features: -There are two very different practical cases in which $c_i(t)$ could fulfill -the stationarity requirement. First, $c_i(t)$ is - -can be assumed to be -stationary: Either $c_i(t)$ is entirely unstructured on most time scales, or -$c_i(t)$ is periodic. +kernel response $c_i(t)$ exceeds the threshold value $\thr$ is approximately -First, $c_i(t)$ is entirely unstructured on most time scales, which +Each +feature $f_i(t)$ approximately quantifies the proportion of time where kernel +response $c_i(t)$ exceeds the threshold value $\thr$ within the averaging +interval $\tlp$. The value of $f_i(t)$ at time point $t$ is hence determined by +the distribution $\pci$ of $c_i(t)$ around $t$. 
+Accordingly, if $c_i(t)$ is +stationary within some time interval $T>\tlp$ --- so that $\pci$ does not +change substantially with $t$ --- then the value of $f_i(t)$ is approximately +constant across $t$. - -The structure of noise-evoked $c_i(t)$ is largely random with an -approximately normal $\pci$ with constant mean and variance across $t$. - -Either the structure of $c_i(t)$ is largely random, or - -Noise-evoked $c_i(t)$ are - -Noise-evoked $c_i(t)$ are largely unstructured and follow a roughly -normal $\pci$ with constant mean and variance across $t$. In contrast, -song-evoked $c_i(t)$ are highly - -Noise-evoked $c_i(t)$ fulfill the stationary requirement because their $\pci$ -is approximately a normal distribution with constant mean and variance across -$t$. - -Each feature $f_i(t)$ approximately -quantifies the proportion of time where the respective kernel response $c_i(t)$ -exceeds the threshold value $\thr$ within the averaging interval $\tlp$. The -songs of different species are represented by specific combinations of feature -values, which should preferably be as constant as possible during a song to -fasciliate recognition. The fundamental requirement for constant $f_i(t)$ is -that the time where $c_i(t)>\thr$ within $\tlp$ is the same for all $t$, which -is fulfilled if the distribution $\pci$ of - - -The -value of $f_i(t)$ is hence determined by $\thr$ with respect to the -distribution $\pci$ of $c_i(t)$. - - -$c_i(t)>\thr$. - -The feature set is the final song representation along the model pathway and -constitutes the basis for song recognition. Each feature $f_i(t)$ results from -the thresholding of the respective kernel response $c_i(t)$ by $\nl$ and the -subsequent temporal averaging of binary response $b_i(t)$ by a lowpass filter -with cutoff frequency $\fc$, which specifies the averaging interval $\tlp$. -Feature $f_i(t)$ approximately quantifies the proportion of time during which -$c_i(t)$ exceeds the threshold value $\thr$ within $\tlp$. 
The value of -$f_i(t)$ at time point $t$ is hence determined by $\thr$ with respect to the -distribution $\pci$ of $c_i(t)$ around $t$ and restricted to the interval -$[0,1]$. - -% Theoretical constraints for constant features: -The songs of different species are represented by specific combinations of -values across the feature set, which should preferably be constant for the -duration of a song to fasciliate recognition. The fundamental requirement for -constant $f_i(t)$ is that the time where $c_i(t)>\thr$ within $\tlp$ is the -same for all $t$. - -This is fulfilled if $c_i(t)$ is stationary across $t$ and -$\tlp$ is much longer than the relevant time scales of $c_i(t)$, so that $\pci$ -is independent of $t$. - - -, which is fulfilled if $\pci$ is stable across $t$. - -The most -straightforward way to achieve a stable $\pci$ is that $c_i(t)$ is stationary -and $\tlp$ is sufficiently long to average over the stationary distribution of - -The most -straightforward way to achieve a stable $\pci$ is that $c_i(t)$ is periodic and -$\tlp$ is sufficiently long to average over multiple cycles of $c_i(t)$. +If the time $T_1$ where $c(t)>\Theta$ within $\tlp$ is approximately constant +across $t$ for some time interval $\tstat>\tlp$, then $f(t)$ is approximately +constant across $t\in\tstat$ as well~(Fig.\,\ref{fig:stages_feat}c). This is +fulfilled if $c(t)$ is stationary in the sense that its distribution $\pclp$ +does not change substantially within $\tstat$, which requires that $\tlp$ is +much longer than the relevant time scales of $c(t)$. However, stationarity of +$c(t)$ is not a necessary condition for $f(t)$ to be constant because $f(t)$ +depends only on the total $T_1$ --- irrespective of the timing of individual +threshold crossings --- and different $\pclp$ can, in principle, still result +in similar $T_1$. Most song-evoked $c_i(t)$ are indeed highly repetitive, albeit not perfectly periodic, which is largely an inherited property of the song itself.