Syncing to home.

j-hartling
2026-06-08 14:20:35 +02:00
parent 8ee35b3a27
commit 48dba2bc01
2 changed files with 55 additions and 105 deletions

BIN
main.pdf

Binary file not shown.

160
main.tex

@@ -105,6 +105,7 @@
 \newcommand{\pc}{p(c,\,T)} % Probability density (general interval)
 \newcommand{\pclp}{p(c,\,\tlp)} % Probability density (lowpass interval)
 \newcommand{\pci}{p(c_i,\,\tlp)} % Kernel-specific probability density (lowpass interval)
+\newcommand{\tstat}{T_{\text{total}}} % Time interval where c(t) is stationary
 \newcommand{\muf}{\mu_{f_i}} % Average feature value
 \section{Introduction}
@@ -827,11 +828,10 @@ between the resulting $\env(t)$, $\db(t)$, and $\adapt(t)$. It is necessary to
 use $\filt(t)$ as input for this analysis instead of $\env(t)$, because
 $\env(t)$ results from a nonlinear transformation and hence cannot be
 synthesized as an additive mixture of song component $\soc(t)$ and noise
-component $\noc(t)$. % <-- Sentence may be methods section material.
-However, it is much easier to conceive a mathematical description of the
-effects of logarithmic compression and adaptation if $\env(t)$ itself is
-assumed to be composed of $\soc(t)$ and $\noc(t)$. In the noiseless
-case~(Fig.\,\ref{fig:log-hp}a), $\env(t)$ takes the form of
+component $\noc(t)$. However, it is much easier to conceive a mathematical
+description of the effects of logarithmic compression and adaptation if
+$\env(t)$ itself is assumed to be composed of $\soc(t)$ and $\noc(t)$. In the
+noiseless case~(Fig.\,\ref{fig:log-hp}a), $\env(t)$ takes the form of
 \begin{equation}
 \env(t)\,=\,\sca\,\cdot\,\soc(t), \qquad \env(t)\,>\,0\enspace\forall\enspace t\,\in\,\mathbb{R}
 \label{eq:toy_env_pure}
@@ -1000,35 +1000,44 @@ corresponding $\tlp$:
 f(t)\,\approx\,\int_{\Theta}^{+\infty} \pclp\,dc\,=\,P(c\,>\,\Theta,\,\tlp)
 \label{eq:feat_prop}
 \end{equation}
+% Little bit of patch-work here...
+% 1) Interpretation of the feature value:
 In a sense, $f(t)$ can be interpreted as some sort of duty cycle of $c(t)$ with
 respect to $\Theta$. For example, a feature value of $f(t)=0.4$ means that
 $c(t)$ exceeds $\Theta$ for approximately 40\,\% of the time within $\tlp$. In
 the most extreme cases, $\Theta$ lays either above the maximum of $c(t)$ or
 below the minimum of $c(t)$, which results in a minimum or maximum possible
 feature value of $f(t)=0$~(Fig.\,\ref{fig:thresh-lp_single}d, left column) or
-$f(t)=1$, respectively. Furthermore, if $c(t)$ is stationary --- so that its
-statistics do not change substantially over time --- and if $\tlp$ is much
-longer than the relevant time scales of $c(t)$, then $\pclp$ is largely
-independent of $t$. In this case, $f(t)$ is approximately constant across
-$t$~(Fig.\,\ref{fig:stages_feat}c).
+$f(t)=1$, respectively.

-Importantly, $f(t)$ neither retains information about the timing of individual
-threshold crossings nor the precise values of $c(t)$ apart from their relation
-to $\Theta$. Different $\sca$ can hence result in similar feature values by
-producing similar $T_1$ segments. The most reliable way of exploiting this
-invariant property of $f(t)$ is to set $\Theta$ to a value near 0, because
-these values are least affected by different scales of $c(t)$. For sufficiently
-large $\sca$, $f(t)$ then approaches the same constant $\mu_f$ in both the
-noiseless and the noisy case~(Fig.\,\ref{fig:thresh-lp_single}e, saturation
-regime).
+% 2) Constant feature values across t:
+If the time $T_1$ where $c(t)>\Theta$ within $\tlp$ is approximately constant
+across $t$ for some time interval $\tstat>\tlp$, then $f(t)$ is approximately
+constant across $t\in\tstat$ as well~(Fig.\,\ref{fig:stages_feat}c). This is
+fulfilled if $c(t)$ is stationary in the sense that its distribution $\pclp$
+does not change substantially within $\tstat$, which requires that $\tlp$ is
+much longer than the relevant time scales of $c(t)$. However, stationarity of
+$c(t)$ is not a necessary condition for $f(t)$ to be constant because $f(t)$
+depends only on the total $T_1$ --- irrespective of the timing of individual
+threshold crossings --- and different $\pclp$ can, in principle, still result
+in similar $T_1$.

-The saturation level of $f(t)$ is independent of the precise value of $\Theta$,
-but the saturation point decreases with
-$\Theta$~(Fig.\,\ref{fig:thresh-lp_single}e). Therefore, a threshold value of
-$\Theta=0$ would be the optimal choice for achieving intensity invariance at
-the lowest possible $\sca$. In stark contrast, the closer $\Theta$ is to 0, the
-higher $\mu_f$ in response to the pure noise component $\noc(t)$ and the lower
-the resulting SNR of $f(t)$ between noise regime and saturation
+% 3) Constant feature values across alpha:
+Similarly, $f(t)$ retains no information about the precise values of $c(t)$
+apart from their relation to $\Theta$. Different scales $\sca$ can hence result
+in similar values of $f(t)$ as long as $T_1$ remains similar across $\sca$. The
+most reliable way of exploiting this invariant property of $f(t)$ is to set
+$\Theta$ to a value near 0, because these values are least affected by
+different scales of $c(t)$. For sufficiently large $\sca$, $f(t)$ then
+approaches the same constant $\mu_f$ in both the noiseless and the noisy
+case~(Fig.\,\ref{fig:thresh-lp_single}e, saturation regime). The saturation
+level of $f(t)$ is independent of the precise value of $\Theta$, but the
+saturation point decreases with $\Theta$~(Fig.\,\ref{fig:thresh-lp_single}e).
+Therefore, a threshold value of $\Theta=0$ would be the optimal choice for
+achieving intensity invariance at the lowest possible $\sca$. In stark
+contrast, the closer $\Theta$ is to 0, the higher $\mu_f$ in response to the
+pure noise component $\noc(t)$ and the lower the resulting SNR of $f(t)$
+between noise regime and saturation
 regime~(Fig.\,\ref{fig:thresh-lp_single}b-d, left column, and
 Fig.\,\ref{fig:thresh-lp_single}e). This trade-off between intensity invariance
 and SNR has already been observed during the previous analysis on logarithmic
@@ -1581,89 +1590,30 @@ constitutes the basis for song recognition. The songs of different species are
 represented by specific combinations of feature values, which should be as
 constant as possible for the duration of a song to fasciliate recognition. The
 fundamental requirement for a constant feature $f_i(t)$ is that the time where
-kernel response $c_i(t)$ exceeds the threshold value $\thr$ within averaging
-interval $\tlp$ is the same for all time points $t$. This is fulfilled if
-$c_i(t)$ is stationary within a certain time window and $\tlp$ is much longer
-than the relevant time scales of $c_i(t)$, so that the distribution $\pci$ of
-$c_i(t)$ and hence the value of $f_i(t)$ remain stable across $t$.
-% Practical cases that allow for approximately constant features:
-There are two very different practical cases in which $c_i(t)$ could fulfill
-the stationarity requirement. First, $c_i(t)$ is
-can be assumed to be
-stationary: Either $c_i(t)$ is entirely unstructured on most time scales, or
-$c_i(t)$ is periodic.
-First, $c_i(t)$ is entirely unstructured on most time scales, which Each
-feature $f_i(t)$ approximately quantifies the proportion of time where kernel
-response $c_i(t)$ exceeds the threshold value $\thr$ within the averaging
-interval $\tlp$. The value of $f_i(t)$ at time point $t$ is hence determined by
-the distribution $\pci$ of $c_i(t)$ around $t$.
-Accordingly, if $c_i(t)$ is
-stationary within some time interval $T>\tlp$ --- so that $\pci$ does not
-change substantially with $t$ --- then the value of $f_i(t)$ is approximately
-constant across $t$.
-The structure of noise-evoked $c_i(t)$ is largely random with an
-approximately normal $\pci$ with constant mean and variance across $t$.
-Either the structure of $c_i(t)$ is largely random, or
-Noise-evoked $c_i(t)$ are
-Noise-evoked $c_i(t)$ are largely unstructured and follow a roughly
-normal $\pci$ with constant mean and variance across $t$. In contrast,
-song-evoked $c_i(t)$ are highly
-Noise-evoked $c_i(t)$ fulfill the stationary requirement because their $\pci$
-is approximately a normal distribution with constant mean and variance across
-$t$.
-Each feature $f_i(t)$ approximately
-quantifies the proportion of time where the respective kernel response $c_i(t)$
-exceeds the threshold value $\thr$ within the averaging interval $\tlp$. The
-songs of different species are represented by specific combinations of feature
-values, which should preferably be as constant as possible during a song to
-fasciliate recognition. The fundamental requirement for constant $f_i(t)$ is
-that the time where $c_i(t)>\thr$ within $\tlp$ is the same for all $t$, which
-is fulfilled if the distribution $\pci$ of
-The
-value of $f_i(t)$ is hence determined by $\thr$ with respect to the
-distribution $\pci$ of $c_i(t)$.
-$c_i(t)>\thr$.
-The feature set is the final song representation along the model pathway and
-constitutes the basis for song recognition. Each feature $f_i(t)$ results from
-the thresholding of the respective kernel response $c_i(t)$ by $\nl$ and the
-subsequent temporal averaging of binary response $b_i(t)$ by a lowpass filter
-with cutoff frequency $\fc$, which specifies the averaging interval $\tlp$.
-Feature $f_i(t)$ approximately quantifies the proportion of time during which
-$c_i(t)$ exceeds the threshold value $\thr$ within $\tlp$. The value of
-$f_i(t)$ at time point $t$ is hence determined by $\thr$ with respect to the
-distribution $\pci$ of $c_i(t)$ around $t$ and restricted to the interval
-$[0,1]$.
-% Theoretical constraints for constant features:
-The songs of different species are represented by specific combinations of
-values across the feature set, which should preferably be constant for the
-duration of a song to fasciliate recognition. The fundamental requirement for
-constant $f_i(t)$ is that the time where $c_i(t)>\thr$ within $\tlp$ is the
-same for all $t$.
-This is fulfilled if $c_i(t)$ is stationary across $t$ and
-$\tlp$ is much longer than the relevant time scales of $c_i(t)$, so that $\pci$
-is independent of $t$.
-, which is fulfilled if $\pci$ is stable across $t$.
-The most
-straightforward way to achieve a stable $\pci$ is that $c_i(t)$ is stationary
-and $\tlp$ is sufficiently long to average over the stationary distribution of
-The most
-straightforward way to achieve a stable $\pci$ is that $c_i(t)$ is periodic and
-$\tlp$ is sufficiently long to average over multiple cycles of $c_i(t)$.
+kernel response $c_i(t)$ exceeds the threshold value $\thr$ is approximately
+If the time $T_1$ where $c(t)>\Theta$ within $\tlp$ is approximately constant
+across $t$ for some time interval $\tstat>\tlp$, then $f(t)$ is approximately
+constant across $t\in\tstat$ as well~(Fig.\,\ref{fig:stages_feat}c). This is
+fulfilled if $c(t)$ is stationary in the sense that its distribution $\pclp$
+does not change substantially within $\tstat$, which requires that $\tlp$ is
+much longer than the relevant time scales of $c(t)$. However, stationarity of
+$c(t)$ is not a necessary condition for $f(t)$ to be constant because $f(t)$
+depends only on the total $T_1$ --- irrespective of the timing of individual
+threshold crossings --- and different $\pclp$ can, in principle, still result
+in similar $T_1$.
 Most song-evoked $c_i(t)$ are indeed highly repetitive, albeit not perfectly
 periodic, which is largely an inherited property of the song itself.