Progress with cleaning up "IntInv vs SNR".

j-hartling
2026-06-15 18:20:33 +02:00
parent 67690b97f7
commit 21e6ab4d64
2 changed files with 83 additions and 64 deletions

BIN
main.pdf

Binary file not shown.

147
main.tex

@@ -1595,7 +1595,7 @@ does not change substantially within $\tstat$.
 % Constraints on the song structure:
 % Also: Constant model features vs. actual grasshopper (calling) songs:
-% (Also: Third revision and this section still doesn't sound good)
+% (Also: Third revision and still far from done and good)
 Grasshoppers sing by pulling the stridulatory file on the hindlegs across a
 resonating vein on the forewings~(\bcite{helversen1977stridulatory};
 \bcite{stumpner1994song}; \bcite{helversen1997recognition}). Different
@@ -1660,6 +1660,7 @@ as soon as $f_i(t)$ is within tolerance or wait for $f_i(t)$ to stabilize for
 additional certainty.
 \subsection{Invariant processing in the grasshopper auditory system}
+\label{sec:general_inv}
 % Invariance in the general (systemic) sense:
 The notion of invariance is fundamental for sensory processing systems.
@@ -1710,45 +1711,64 @@ time scale-selectivity is reflected by the cutoff frequency $\fc$ of the
 highpass filter that underlies the adaptation of $\adapt(t)$: Most $\fc$ except
 the lowest ones are effective in removing the local offset of $\db(t)$ and
 render $\adapt(t)$ intensity-invariant, but only sufficiently low $\fc$
-preserve the relevant amplitude dynamics of the song pattern. Intensity
-invariance by thresholding and temporal averaging also has a relevant time
-scale, which is determined by the averaging interval $\tlp$. However, this time
-scale is not constrained by the need to preserve the temporal structure of the
-song pattern but to provide a suitable degree of temporal integration across
-the song pattern~(Section\,\ref{sec:constant_feat}).
+preserve the relevant amplitude dynamics of the song pattern. The time scale of
+intensity invariance by thresholding and temporal averaging is determined by
+the averaging interval $\tlp$. However, unlike $\fc$, $\tlp$ is not constrained
+by the need to preserve the song pattern but rather to provide a suitable
+degree of temporal integration~(Section\,\ref{sec:constant_feat}).
-\subsection{Intensity invariance versus SNR}
+\subsection{Intensity invariance versus SNR along the model pathway}
-Each processing step along the model pathway is a transformation between input
-representation and output representation. The intensity of the input is
-characterized by scale $\sca$. The intensity of the output is characterized by
-an appropriate intensity measure. If the transformation renders the output more
-intensity-invariant, then the intensity measure will saturate for sufficiently
-large $\sca$, which caps the output SNR to a constant value across these
-$\sca$. Otherwise, the intensity measure and hence the output SNR will increase
-monotonically with $\sca$. The trade-off between intensity invariance and SNR
-refers to the principle that a transformation can either improve intensity
-invariance or maintain SNR --- it cannot do both at the same time. This
-principle is presumably not specific to the two mechanisms along the model
-pathway but rather a general property of transformations that equalize between
-different input intensities.
+% % Establishing the principle trade-off (should maybe come later?):
+% The output of a transformation is considered to be intensity-invariant if its
+% intensity measure saturates for sufficiently large scales $\sca$, which in turn
+% caps the output SNR to a constant value across these $\sca$. Otherwise, the
+% output SNR will increase monotonically with $\sca$. The trade-off between
+% intensity invariance and SNR refers to the principle that a transformation can
+% either improve intensity invariance or maintain SNR --- it cannot do both at
+% the same time. This principle is most likely not specific to the two mechanisms
+% along the model pathway but rather a general property of transformations that
+% equalize between different input intensities.
-Logarithmic compression and adaptation by highpass filtering is capable of
-equalizing a wide range of $\sca$. In the absence of noise component $\noc(t)$,
-output $\adapt(t)$ is a perfectly intensity-invariant representation of song
-component $\soc(t)$ across all $\sca>0$. However, the presence of $\noc(t)$
-limits the effectiveness of this mechanism to sufficiently large $\sca$. This
-means that intensity invariance and SNR interact at the input level, as well.
-Specifically, the saturation point of $\adapt(t)$ is determined by the input
-SNR of $\env(t)$, which in turn depends on the initial SNR of the sound signal
-$\raw(t)$. This initial SNR is presumably improved by the bandpass filtering of
-$\raw(t)$ into $\filt(t)$ at the tympanal membrane, which attenuates
-frequencies outside the relevant range of grasshopper songs. The SNR is then
-further improved by the rectification and lowpass filtering of $\filt(t)$ into
-$\env(t)$. This improvement depends on the cutoff frequency $\fc$ of the
-lowpass filter --- the lower $\fc$, the higher the SNR of $\env(t)$ at a given
-$\sca$. However, $\fc$ must not be too low to avoid the attenuation of relevant
-amplitude dynamics of the song pattern. The saturation level of $\adapt$,
+% Building a sufficient SNR "buffer":
+A stridulating grasshopper generates a song with a specific initial intensity,
+which is steadily attenuated as the song propagates through the
+environment~(\bcite{michelsen1978sound}). A listening grasshopper receives a
+sound signal $\raw(t)$, which is a mixture of the song component $\soc(t)$ with
+scale $\sca$ and the environmental noise component $\noc(t)$. The greater the
+distance between sender and receiver, the smaller $\sca$ and hence the lower
+the SNR of $\raw(t)$ at the position of the receiver. The tympanal bandpass
+filtering of $\raw(t)$ into $\filt(t)$ likely improves the SNR by attenuating
+frequencies outside the relevant range of grasshopper songs. The SNR is further
+improved by the rectification and lowpass filtering of $\filt(t)$ into
+$\env(t)$. The lower the cutoff frequency $\fc$ of the lowpass filter, the
+higher the SNR of $\env(t)$ at a given $\sca$, although $\fc$ must also be
+sufficiently high to preserve the amplitude dynamics of the song pattern.
+Overall, the first processing steps along the pathway are not designed to
+achieve intensity invariance but rather to improve the SNR of the song
+representation beyond the initial SNR of $\raw(t)$.
+The first mechanism of intensity invariance consists of logarithmic compression
+and adaptation of $\env(t)$ into $\adapt(t)$. In the absence of $\noc(t)$,
+$\adapt(t)$ is a perfectly intensity-invariant representation of $\soc(t)$. In
+the presence of $\noc(t)$, $\adapt(t)$ is intensity-invariant only for a
+sufficiently high SNR of $\env(t)$. The preceeding SNR improvements from
+$\raw(t)$ to $\env(t)$ thus serve to improve the intensity invariance of
+$\adapt(t)$ by shifting the saturation point towards lower $\sca$. However,
+this effect is limited --- if the SNR of $\raw(t)$ at the receiver's position
+does not allow for a sufficiently high SNR of $\env(t)$, $\adapt(t)$ will not
+be intensity-invariant. The initial song intensity that the sender can achieve
+therefore determines the distance at which $\adapt(t)$ is intensity-invariant
+to the receiver.
+Assuming that intensity invariance of $\adapt(t)$ is required for reliable song
+recognition,
+This might be a reason why robustness to noise masking is an
+attractive property of male calling songs~(\bcite{einhaupl2011attractiveness}).
+The saturation level of $\adapt$,
 unlike its saturation point, is independent of the SNR of $\env(t)$ because the
 influence of $\noc(t)$ is negligible for sufficiently large $\sca$. The output
 SNR of $\adapt(t)$ saturates at a comparably low value of around 10. This might
@@ -1798,35 +1818,34 @@ the saturation level of $f_i(t)$ will be determined by the second mechanism.
 The saturation points of $f_i(t)$ across the set are distributed over a much
 wider range than those of the preceeding kernel responses $c_i(t)$, which
 suggests that the interaction between the two mechanisms is specific to
-individual kernels $k_i(t)$. A number of $f_i(t)$ achieve a lower saturation
-point than the respective $c_i(t)$, while some $f_i(t)$ exhibit similar or only
-marginally lower saturation points. This raises the question whether two
-consecutive mechanisms of intensity invariance are actually beneficial for the
-overall system.
+individual kernels. A number of $f_i(t)$ achieve a lower saturation point than
+the respective $c_i(t)$, whereas some $f_i(t)$ exhibit similar or only
+marginally lower saturation points. In these cases, the question arises to what
+extent two consecutive mechanisms of intensity invariance are actually
+beneficial for the overall system.
-Various grasshopper species, especially those with longer songs like \textit{C.
-mollis}, \textit{G. rufus}, or \textit{O. rufipes}, tend to stridulate softly
-at first and then continuously increase the amplitude of their song over time.
-This slow "ramping" amplitude modulation makes the overall song less periodic
-despite its temporal regularity. The "ramping" appears more pronounced in
-$\env(t)$ compared to $\adapt(t)$, which suggests that the logarithmic
-compression and adaptation during the preprocessing stage might be at least
-partially beneficial for mitigating the effect of this amplitude modulation on
-later representations. However, the adaptation of $\adapt(t)$ can only act on
-certain time scales --- depending on the cutoff frequency of the underlying
-highpass filter --- and is hence not able to compensate for "ramping" across
-the entire duration of a song.
+From a computational perspective, the answer could be that logarithmic
+compression and adaptation is a necessary preprocessing step towards robust
+$f_i(t)$ because it works towards a more consistent distribution $\pci$ of
+$c_i(t)$. If $\pci$ is consistent between different songs of the same species,
+a static threshold value $\thr$ is sufficient to generate a consistent
+species-specific feature representation. If $\pci$ is consistent over the
+course of a song, $f_i(t)$ is constant throughout the song, which extends the
+time window for reliable recognition~(Section\,\ref{sec:constant_feat}).
-From a purely functional perspective, the answer could be that logarithmic
-compression and adaptation is a necessary preprocessing step towards a robust
-feature representation, even if thresholding and temporal averaging alone would
-be sufficient to render $f_i(t)$ intensity-invariant. This preprocessing likely
-improves the temporal regularity of the song pattern in $\adapt(t)$ and
-$c_i(t)$, which is required for constant $f_i(t)$ across the duration of a
-song~(Section\,\ref{sec:constant_feat}). It also ensures consistency between
-the distribution $\pci$ of $c_i(t)$ across songs of different intensity, which
-is essential for the generation of consistent species-specific $f_i(t)$ under a
-static $\thr$. From a physiological perspective, the answer is likely that
+First, the preprocessing results in a more consistent
+distribution $\pci$ of $c_i(t)$ between songs of different intensity and in
+turn allows for the generation of consistent $f_i(t)$ under a static threshold
+value $\thr$. Second, this preprocessing improves the temporal regularity of
+the song pattern by mitigating the slow "ramping" amplitude modulation that is
+common to many grasshopper songs.
+This preprocessing likely improves the temporal regularity of the song pattern
+in $\adapt(t)$ and $c_i(t)$, which is required for constant $f_i(t)$ across the
+duration of a song~(Section\,\ref{sec:constant_feat}).
+From a physiological perspective, the answer is likely that
 neurons possess only a limited firing rate for encoding stimulus intensities
 that can range over several orders of magnitude. Sigmoidal tuning curves over
 logarithmically compressed stimulus intensities are a common property of