Good progress with cleaning up "IntInv vs SNR". Gonna merge with "IntInv vs IntInv" tomorrow.

This commit is contained in:
j-hartling
2026-06-16 18:23:14 +02:00
parent 21e6ab4d64
commit 395cfb98ab
2 changed files with 35 additions and 38 deletions

BIN
main.pdf

Binary file not shown.

@@ -1719,18 +1719,7 @@ degree of temporal integration~(Section\,\ref{sec:constant_feat}).
\subsection{Intensity invariance versus SNR along the model pathway}
% % Establishing the principle trade-off (should maybe come later?):
% The output of a transformation is considered to be intensity-invariant if its
% intensity measure saturates for sufficiently large scales $\sca$, which in turn
% caps the output SNR to a constant value across these $\sca$. Otherwise, the
% output SNR will increase monotonically with $\sca$. The trade-off between
% intensity invariance and SNR refers to the principle that a transformation can
% either improve intensity invariance or maintain SNR --- it cannot do both at
% the same time. This principle is most likely not specific to the two mechanisms
% along the model pathway but rather a general property of transformations that
% equalize between different input intensities.
% Building a sufficient SNR "buffer":
% Building a sufficiently large SNR "buffer":
A stridulating grasshopper generates a song with a specific initial intensity,
which is steadily attenuated as the song propagates through the
environment~(\bcite{michelsen1978sound}). A listening grasshopper receives a
@@ -1742,12 +1731,12 @@ filtering of $\raw(t)$ into $\filt(t)$ likely improves the SNR by attenuating
frequencies outside the relevant range of grasshopper songs. The SNR is further
improved by the rectification and lowpass filtering of $\filt(t)$ into
$\env(t)$. The lower the cutoff frequency $\fc$ of the lowpass filter, the
higher the SNR of $\env(t)$ at a given $\sca$, although $\fc$ must also be
sufficiently high to preserve the amplitude dynamics of the song pattern.
Overall, the first processing steps along the pathway are not designed to
achieve intensity invariance but rather to improve the SNR of the song
representation beyond the initial SNR of $\raw(t)$.
higher the SNR of $\env(t)$ for a given $\sca$, although $\fc$ must also be
sufficiently high to preserve the amplitude dynamics of the song pattern. The
first processing steps along the pathway are hence designed to improve the SNR
of the song representation beyond the initial SNR of $\raw(t)$.
% Dependence of log-HP intensity invariance on sufficient SNR (+implications):
The first mechanism of intensity invariance consists of logarithmic compression
and adaptation of $\env(t)$ into $\adapt(t)$. In the absence of $\noc(t)$,
$\adapt(t)$ is a perfectly intensity-invariant representation of $\soc(t)$. In
@@ -1757,28 +1746,36 @@ $\raw(t)$ to $\env(t)$ thus serve to improve the intensity invariance of
$\adapt(t)$ by shifting the saturation point towards lower $\sca$. However,
this effect is limited --- if the SNR of $\raw(t)$ at the receiver's position
does not allow for a sufficiently high SNR of $\env(t)$, $\adapt(t)$ will not
be intensity-invariant. The initial song intensity that the sender can achieve
therefore determines the distance at which $\adapt(t)$ is intensity-invariant
to the receiver.
be intensity-invariant. In this case, the receiver is presumably less likely to
recognize $\raw(t)$ as a conspecific song. The limitation of the intensity
invariance of $\adapt(t)$ by the SNR of $\raw(t)$ might hence at least in part
be responsible for the limited maximum distance at which song recognition is
possible~(\bcite{lang2000acoustic}) and the selection towards song patterns
that are robust to noise masking~(\bcite{einhaupl2011attractiveness}).
Assuming that intensity invariance of $\adapt(t)$ is required for reliable song
recognition,
This might be a reason why robustness to noise masking is an
attractive property of male calling songs~(\bcite{einhaupl2011attractiveness}).
The saturation level of $\adapt$,
unlike its saturation point, is independent of the SNR of $\env(t)$ because the
influence of $\noc(t)$ is negligible for sufficiently large $\sca$. The output
SNR of $\adapt(t)$ saturates at a comparably low value of around 10. This might
in part be a consequence of the logarithm, which compresses different higher
intensities but also amplifies lower intensities, including the noise floor.
Both the saturation level and the saturation point of $\adapt(t)$ vary between
different species and individual songs. These differences are likely rooted in
the way in which logarithmic compression acts on the specific distribution of
$\env(t)$, which is determined by $\fc$ as well as the temporal structure and
frequency spectrum of the rectified $\filt(t)$.
% Trading SNR for log-HP intensity invariance (+variability, +general principle):
The SNR of each song representation prior to $\adapt(t)$ increases
monotonically with $\sca$~(excluding $0<\sca\ll1$, noise regime). These
representations maintain and improve the initial SNR of $\raw(t)$ and hence
never achieve intensity invariance. In contrast, the SNR of the
intensity-invariant $\adapt(t)$ never exceeds its saturation level even for
arbitrarily high $\sca$. The saturation level of $\adapt(t)$ varies across
species and songs. This variability is likely rooted in the way in which
logarithmic compression acts on the specific distribution of $\env(t)$, which
depends on the $\fc$ of the lowpass filter as well as the temporal structure
and frequency spectrum of the rectified $\filt(t)$. Overall, $\adapt(t)$ has
never been observed to exceed an SNR of around~10 across all songs. The low SNR
of $\adapt(t)$ partially results from the amplification of smaller values of
$\env(t)$ by the logarithm, which raises the noise floor of $\adapt(t)$. Still,
the reduction in SNR is substantial (the SNR of preceding song
representations is orders of magnitude higher) but is likely a
necessary price to pay for the intensity invariance of $\adapt(t)$. After all,
a transformation cannot compress a range of different input intensities into a
constant output intensity without sacrificing some of the corresponding input
SNR. Accordingly, the trade-off between intensity invariance and SNR is not
expected to be specific to the particular mechanisms along the pathway but
presumably applies to any transformation that achieves or improves intensity
invariance.
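% Illustrative sketch only: $s_e(t)$ and $n_e(t)$ are placeholder symbols that
% are not defined elsewhere in the text, and the additive form is an assumption.
The trade-off can be illustrated with a minimal sketch, assuming that the
envelope is approximately the sum of a scaled song contribution and a fixed
noise contribution, $\env(t)\approx\sca\,s_e(t)+n_e(t)$ with
$s_e(t),n_e(t)>0$. Here, $s_e(t)$ and $n_e(t)$ are placeholder symbols
introduced for illustration only. Under this assumption, logarithmic
compression gives
\[
\log\env(t)\approx\log\sca+\log\left(s_e(t)+\frac{n_e(t)}{\sca}\right).
\]
The constant offset $\log\sca$ is removed by adaptation. For small $\sca$, the
sum reduces to approximately $\log n_e(t)$, so that the output is dominated by
noise. For large $\sca$, the second term converges to $\log s_e(t)$, which no
longer depends on $\sca$. In this sketch, any intensity measure of the adapted
output therefore levels off at large $\sca$, and so does any SNR that is
defined relative to the fixed noise-only output.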
Thresholding and temporal averaging render feature $f_i(t)$
intensity-invariant for sufficiently large $\sca$. The trade-off between