Busy integrating more sources into discussion.

j-hartling
2026-06-02 18:21:25 +02:00
parent dea5923dd7
commit d6ed3f1664
9 changed files with 205 additions and 103 deletions

253
main.tex

@@ -167,8 +167,8 @@ Grasshopper songs are amplitude-modulated broad-band acoustic signals. They
consist of a series of noisy syllables and relatively quiet pauses, which form
a characteristic repetitive pattern~(\bcite{helversen1977stridulatory};
\bcite{stumpner1994song}). Song recognition depends on certain structural
parameters of this pattern --- such as the duration of syllables and
pauses~(\bcite{helversen1972gesang}), the slope of pulse
parameters of this pattern --- such as the ratio of syllable duration to pause
duration~(\bcite{helversen1972gesang}), the slope of pulse
onsets~(\bcite{helversen1993absolute}), and the accentuation of syllable onsets
relative to the preceding pause~(\bcite{balakrishnan2001song};
\bcite{helversen2004acoustic}) --- which are sufficiently conveyed by the
@@ -1000,26 +1000,23 @@ corresponding averaging interval $\tlp$:
f(t)\,\approx\,\int_{\Theta}^{+\infty} \pclp\,dc\,=\,P(c\,>\,\Theta,\,\tlp)
\label{eq:feat_prop}
\end{equation}
In a sense, $f(t)$ can be interpreted as some sort of duty cycle with respect
to $\Theta$. For example, a feature value of $f(t)=0.4$ means that $c(t)$
exceeds $\Theta$ for approximately 40\,\% of the time within $\tlp$ around $t$.
In the most extreme cases, $\Theta$ lies either above the maximum of $c(t)$ or
below the minimum of $c(t)$, which results in a minimum or maximum possible
feature value of $f(t)=0$~(Fig.\,\ref{fig:thresh-lp_single}d, left column) or
$f(t)=1$, respectively.
In a sense, $f(t)$ can be interpreted as some sort of duty cycle of $c(t)$ with
respect to $\Theta$. For example, a feature value of $f(t)=0.4$ means that
$c(t)$ exceeds $\Theta$ for approximately 40\,\% of the time within $\tlp$
around $t$. In the most extreme cases, $\Theta$ lies either above the maximum
of $c(t)$ or below the minimum of $c(t)$, which results in a minimum or maximum
possible feature value of $f(t)=0$~(Fig.\,\ref{fig:thresh-lp_single}d, left
column) or $f(t)=1$, respectively.
Importantly, $f(t)$ neither retains information about the timing of individual
threshold crossings nor the precise values of $c(t)$ apart from their relation
to $\Theta$. Accordingly, for a given $\Theta$, different $\sca$ can still
result in similar $T_1$ segments (and hence similar feature values) depending
on the magnitude of the derivative of $c(t)$ in temporal proximity to time
points at which $c(t)$ crosses $\Theta$: The steeper the slope of $c(t)$, the
less $T_1$ changes with variations in $\sca$. The most reliable way of
exploiting this invariant property of $f(t)$ is to set $\Theta$ to a value near
0, because these values are least affected by different scales of $c(t)$. For
sufficiently large $\sca$, $f(t)$ then approaches the same constant $\mu_f$ in
both the noiseless and the noisy case~(Fig.\,\ref{fig:thresh-lp_single}e,
saturation regime).
to $\Theta$. Different $\sca$ can hence result in similar feature values by
producing similar $T_1$ segments. The most reliable way of exploiting this
invariant property of $f(t)$ is to set $\Theta$ to a value near 0, because
these values are least affected by different scales of $c(t)$. For sufficiently
large $\sca$, $f(t)$ then approaches the same constant $\mu_f$ in both the
noiseless and the noisy case~(Fig.\,\ref{fig:thresh-lp_single}e, saturation
regime).
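
The duty-cycle reading of $f(t)$ and its scale invariance for $\Theta$ near 0 can be illustrated with a minimal Python sketch. It is not the model implementation: the sinusoidal $c(t)$, the sampling rate, the boxcar average standing in for the lowpass filter, and the helper name `feature` are all assumptions made for illustration.

```python
import numpy as np

def feature(c, theta, win):
    """Fraction of each sliding window of `win` samples in which c exceeds theta."""
    above = (c > theta).astype(float)      # thresholding: 1 where c > theta, else 0
    kernel = np.ones(win) / win            # boxcar average, a stand-in for the lowpass filter
    return np.convolve(above, kernel, mode="valid")

fs = 1000                                  # sampling rate in Hz (assumed)
t = np.arange(0, 2.0, 1 / fs)
c = np.sin(2 * np.pi * 10 * t)             # toy kernel response with a 100 ms period
win = 500                                  # averaging interval of 0.5 s, i.e. five cycles

f_zero = feature(c, 0.0, win)              # threshold at 0
f_zero_scaled = feature(5 * c, 0.0, win)   # same threshold, five-fold scale
f_high = feature(c, 0.5, win)              # threshold well above 0
f_high_scaled = feature(5 * c, 0.5, win)   # same threshold, five-fold scale

print(f_zero.mean(), f_zero_scaled.mean())
print(f_high.mean(), f_high_scaled.mean())
```

With the threshold at 0, scaling the toy response leaves the feature unchanged at about one half, because multiplying by a positive factor does not move the zero crossings. With the threshold at 0.5, the same scaling shifts the feature from roughly one third to just under one half.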
The saturation level of $f(t)$ is independent of the precise value of $\Theta$,
but the saturation point decreases with
@@ -1526,52 +1523,55 @@ natural song variation.
\newpage
\section{Discussion}
% Recap of main findings:
In the current study, we have established a physiologically inspired functional
model of the grasshopper song recognition pathway. The model pathway covers the
entire auditory processing stream, from the sound reception at the tympanal
membrane over peripheral receptor neurons and local interneurons up to the
generation of a high-dimensional feature representation at the level of the
ascending neurons and beyond in the SEG. Using this model pathway, we have
model of the grasshopper song recognition pathway, from sound reception at
the tympanal membrane via peripheral receptor neurons and local interneurons
to the generation of a high-dimensional feature representation at the level of
the ascending neurons and beyond in the SEG. Using this model pathway, we have
identified two computational key mechanisms for the emergence of
intensity-invariant song representations. Each mechanism comprises a nonlinear
transformation and a subsequent linear transformation. The first mechanism
consists of logarithmic compression and adaptation, which takes place at the
level of the receptor neurons and local interneurons. The second mechanism
consists of thresholding and temporal averaging, which takes place either at
the level of the ascending neurons or further downstream in the SEG. Systematic
investigation of both mechanisms revealed a persistent trade-off between the
intensity invariance and the SNR of the song representations along the pathway.
In the following, we discuss the capabilities and limitations of our model
approach as well as the implications of our findings for the design of the
grasshopper auditory system, the evolution of species-specific grasshopper
songs, and the ethological relevance of intensity invariance in a natural
acoustic environment.
consists of logarithmic compression and adaptation by highpass filtering, which
takes place at the level of the receptor neurons and local interneurons. The
second mechanism consists of thresholding and temporal averaging by lowpass
filtering, which takes place either at the level of the ascending neurons or
further downstream in the SEG. Systematic investigation of both mechanisms
revealed a persistent trade-off between the intensity invariance and the SNR of
the song representations along the pathway. In the following, we briefly
reflect on the potential of functional modelling for research on sensory
systems. We then discuss the implications of our findings for the evolutionary
design of both the auditory system and the species-specific songs of
grasshoppers as well as the ethological relevance of intensity invariance in a
natural acoustic environment.
\subsection{Leveraging functional modelling to investigate sensory systems}
% Functional modelling is cool but bound to freeload on previous research:
Our understanding of sensory processing systems is based on the distributed
accumulation of anatomical, physiological, and ethological evidence. Functional
modelling provides a powerful tool to integrate the available fragments into a
coherent whole. It facilitates systematic, reproducible investigations of
relevant parameters such as scale $\sca$ or threshold value $\thr$. Moreover,
it allows us to address questions of broader scope by generalizing from concrete
relevant parameters, such as scale $\sca$ or threshold value $\thr$. It also
allows us to address questions of broader scope by generalizing from concrete
evidence. For instance, the interaction between the two mechanisms of intensity
invariance is most readily assessed if both mechanisms can be treated as consecutive
stages along the pathway --- where the output of the first stage relates
directly to the input of the second stage --- rather than separate entities.
The model pathway also provides a general basis for comparing song
Moreover, the model pathway provides a general basis for comparing song
representations across different species without the need for species-specific
models. However, the potential of functional modelling for research on sensory
systems depends entirely on the amount of available knowledge about the system.
The grasshopper song recognition pathway is a comparatively simple and very
well-understood system and is therefore a particularly suitable candidate for
functional modelling. Other sensory systems that are either more complex or
have not been subject to decades of study will likely not be suitable for this
approach yet.
models. However, the potential of functional modelling for research on a
sensory system depends entirely on the amount of available knowledge about the
system and its specific stimuli. The grasshopper song recognition pathway is a
comparatively simple, extensively researched and hence well-understood sensory
system and is therefore a particularly suitable candidate for functional
modelling. Other sensory systems that are either more complex or have not been
subject to decades of study will likely not be suitable for this approach yet.
\subsection{Feature representation, temporal averaging, and song design}
\subsection{Song design, temporal averaging, and feature representation}
\label{sec:constant_feat}
% Recap of feature theory and relevant parameters:
The feature set is the final song representation along the model pathway and
constitutes the basis for song recognition. Each feature $f_i(t)$ results from
the thresholding of the respective kernel response $c_i(t)$ by $\nl$ and the
@@ -1582,24 +1582,73 @@ the threshold value $\thr$ within the averaging interval $\tlp$ specified by
$\fc$. The value of $f_i(t)$ is hence determined by $\thr$ with respect to the
distribution $\pci$ of $c_i(t)$ and is restricted to the interval $[0,1]$.
% Feature representation and the constraint of repetitive song structure:
Different species-specific songs are represented by different combinations of
feature values, which should preferably be constant for the duration of a song
to enable reliable recognition. The fundamental requirement for a constant
$f_i(t)$ is that the time where $c_i(t)>\thr$ during $\tlp$ is the same for all
$t$, which is fulfilled if $\pci$ is stable across $t$. The most
straightforward way to achieve a stable $\pci$ is that $c_i(t)$ is periodic and
$\tlp$ is sufficiently long to average over multiple cycles of $c_i(t)$.
Song-evoked $c_i(t)$ are indeed approximately periodic, which is largely an
to facilitate recognition. The fundamental requirement for constant $f_i(t)$ is
that the time where $c_i(t)>\thr$ during $\tlp$ is the same for all $t$, which
is fulfilled if $\pci$ is stable across $t$. The most straightforward way to
achieve a stable $\pci$ is that $c_i(t)$ is periodic and $\tlp$ is sufficiently
long to average over multiple cycles of $c_i(t)$. Most song-evoked $c_i(t)$ are
indeed highly repetitive, albeit not perfectly periodic, which is largely an
inherited property of the song itself. Most grasshopper songs are produced by
stridulation, which refers to the pulling of the serrated stridulatory file on
the hindlegs across a resonating vein on the
forewings~(\bcite{helversen1977stridulatory}; \bcite{stumpner1994song};
\bcite{helversen1997recognition}). Every ``tooth'' that strikes the vein
generates a brief sound pulse; multiple pulses make up a syllable; and the
repetition of syllables and pauses results in a pattern with a high degree of
temporal regularity. Accordingly, a robust feature representation in the sense
of constant $f_i(t)$ is tightly linked to the mechanism of sound production and
the temporal structure of the generated song.
\bcite{helversen1997recognition}). Every ``peg'' that strikes the vein generates
a brief sound pulse; multiple pulses make up a syllable; and the repetition of
syllables and pauses results in a pattern with a high degree of temporal
regularity. A repetitive motor pattern during stridulation hence lays the basis
for constant $f_i(t)$.
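
Why the averaging interval $\tlp$ has to span several cycles of a repetitive $c_i(t)$ can be sketched in Python as follows. The example rests on stated assumptions only: a perfectly periodic sinusoid stands in for $c_i(t)$, a boxcar average stands in for the lowpass filter, and `feature` is a hypothetical helper rather than part of the model code.

```python
import numpy as np

def feature(c, theta, win):
    """Fraction of each sliding window of `win` samples in which c exceeds theta."""
    above = (c > theta).astype(float)      # thresholding
    kernel = np.ones(win) / win            # boxcar average as a stand-in lowpass
    return np.convolve(above, kernel, mode="valid")

fs = 1000                                  # sampling rate in Hz (assumed)
t = np.arange(0, 3.0, 1 / fs)
# Toy periodic response with a 100 ms cycle; the phase offset keeps samples
# away from the exact zero crossings.
c = np.sin(2 * np.pi * 10 * t + 0.3)

f_short = feature(c, 0.0, 30)              # 30 ms window: shorter than one cycle
f_long = feature(c, 0.0, 500)              # 500 ms window: five full cycles

spread_short = f_short.max() - f_short.min()   # how much the feature fluctuates over time
spread_long = f_long.max() - f_long.min()
print(spread_short, spread_long)
```

In this toy case the short window lets the feature swing across its full range from 0 to 1, whereas the window covering five cycles yields an essentially constant value near 0.5.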
The second requirement for constant $f_i(t)$ is a suitable averaging interval
$\tlp$. The minimum $\tlp$ should encompass at least a few cycles of $c_i(t)$
to ensure a stable $\pci$. The maximum $\tlp$ should not exceed the duration of
the song to avoid the inclusion of noise. The duration of species-specific
grasshopper songs can range from a few hundred milliseconds~(e.\,g.
\textit{Stethophyma grossum}) to well over a minute~(e.\,g. \textit{C.
mollis}), so that the optimal $\tlp$ likely differs between species. The longer
$\tlp$, the longer $f_i(t)$ takes to stabilize after the onset of the song,
which narrows the time window for reliable recognition.
\\What about \bcite{ronacher1998song}??
\\ $\rightarrow$ Answer might be \bcite{clemens2021sex}
If the basis for constant $f_i(t)$ is already laid
The basis for constant $f_i(t)$ is hence already
The basis for a robust feature representation in the sense
of constant $f_i(t)$ is hence already laid during the song production.
If the feature representation relies on a repetitive song pattern, one would
expect that grasshopper songs are evolutionary constrained to include such a
pattern.
If constant $f_i(t)$ rely on a repetitive song pattern and are beneficial for
reliable song recognition, one would expect that repetitiveness is a common
design principle of species-specific grasshopper songs.
, and if constant
$f_i(t)$ are required for reliable song recognition, then one would expect that
grasshopper songs are evolutionarily constrained to have such a repetitive
structure.
This is true for many species-specific calling songs but less for
courtship songs, which tend to have a more complex structure~()
If constant $f_i(t)$ rely on a repetitive song pattern and are beneficial for
song recognition, then one would expect that grasshopper songs are
evolutionarily constrained to have such a repetitive temporal structure.
From an evolutionary perspective, one would then expect that grasshopper songs
are evolutionarily constrained to have a repetitive temporal structure in order
to elicit a robust feature representation.
Various grasshopper species, especially those with longer songs like \textit{C.
mollis}, \textit{G. rufus}, or \textit{O. rufipes}, tend to stridulate softly
@@ -1627,22 +1676,11 @@ song could result in similar $\pci$ despite their different temporal structure,
which would allow for consistent $f_i(t)$ across the entire song. However, it
appears more likely that only one part of the song encodes species identity,
while the other part serves a different purpose such as fitness
advertisement~(SOURCE?).
advertisement~(\bcite{stumpner1992recognition}).
Finally, the question remains how the choice of an appropriate averaging
interval $\tlp$ depends on the duration and temporal structure of a song. The
minimum $\tlp$ should encompass at least a few cycles of $c_i(t)$ to ensure a
stable $\pci$ and hence a constant $f_i(t)$. The maximum $\tlp$ should not
exceed the duration of a song to avoid the inclusion of behaviorally irrelevant
information. The longer $\tlp$, the longer $f_i(t)$ takes to stabilize after
the onset and before the offset of a song, which narrows the time window for
reliable recognition. The duration of species-specific grasshopper songs can
range from a few hundred milliseconds (e.\,g. \textit{Stethophyma grossum}) to
well over a minute (e.\,g. \textit{C. mollis}), so that the optimal $\tlp$ is
likely to differ between species.
\subsection{Sensory invariances in the grasshopper auditory system}
\subsection{Invariant processing in the grasshopper auditory system}
% Invariance in the general (systemic) sense:
The notion of invariance is fundamental for sensory processing systems.
Invariance, in the general sense, can be described as the property of a
transformation to maintain variation across certain meaningful input parameters
@@ -1651,46 +1689,52 @@ boils down to a selective input-output decorrelation that allows the system to
represent only those aspects of the stimulus that are behaviorally relevant to
the organism.
% "Easy" case - Throw away parameters that are not relevant:
The grasshopper auditory system has to deal with a number of sources of
non-informative song variation. For instance, the temporal structure of the
song pattern warps with temperature~(\bcite{skovmand1983song}). This also
affects certain structural parameters that are essential for song recognition,
mainly the duration of syllables and pauses. The auditory system can compensate
for this variation by reading out relative temporal relationships rather than
absolute time intervals~(\bcite{creutzig2009timescale};
\bcite{creutzig2010timescale}). The ratio of syllable duration to pause
duration is relatively constant across temperatures and has been shown to be
suitable for song recognition~(\bcite{helversen1972gesang}), so that there is
likely no need to retain any information about the absolute duration of
syllables and pauses.
song pattern warps with temperature~(\bcite{skovmand1983song}). The auditory
system can compensate for this time warping by reading out relative temporal
relationships, such as the ratio of syllable duration to pause duration, rather
than the absolute time intervals~(\bcite{creutzig2009timescale};
\bcite{creutzig2010timescale}). This allows for reliable song recognition
across different temperatures~(\bcite{helversen1972gesang}). Accordingly, the
auditory system likely does not retain any information about the precise
duration of syllables and pauses.
% Hard case - When a parameter is both relevant and irrelevant across functions:
The situation is more complex for variations in song intensity. Song intensity
at the receiver's position depends mostly on the distance to the sender and is
hence not a reliable cue to infer species identity. The auditory system should
therefore be invariant to intensity variations to recognize conspecific songs
therefore not a reliable cue to infer species identity. The auditory system
must hence be invariant to intensity variations to recognize conspecific songs
regardless of sender distance. However, song intensity --- specifically, the
interaural intensity difference --- is also required for directional hearing,
which is essential for phonotaxis~(\bcite{helversen1988interaural}). Conflicts
between song recognition and directional hearing are avoided in the auditory
system by distributing both functions across two parallel
interaural intensity difference --- is also a relevant cue for directional
hearing, which is essential for phonotaxis~(\bcite{helversen1988interaural}).
Interference between song recognition and directional hearing is avoided in the
auditory system by distributing both functions across two parallel
pathways~(\bcite{helversen1984parallel}; \bcite{ronacher1986routes}). This is
the main reason why our model pathway is focused entirely on song recognition
and has no capacity for directional hearing, no matter how relevant it may be
to the grasshopper.
and has no capacity for directional hearing, even though it is crucial to the
grasshopper's behavior.
Furthermore, ``invariance to variations in song intensity'' does not do justice
to the full extent of the problem. Intensity is a function of song amplitude
within a certain time frame. It can refer to the individual syllables and
pauses of the song pattern as well as the entire song --- the former is
relevant for song recognition, while the latter is not. Intensity invariance in
the current context can therefore be described as time scale-selective
sensitivity to the faster amplitude dynamics of the song pattern and
simultaneous insensitivity to slower, more sustained amplitude dynamics. In the
model pathway, this time scale selectivity is reflected by the cutoff frequency
$\fc$ of the highpass filter that underlies the adaptation of $\adapt(t)$: Most
$\fc$ are effective in removing the local offset of $\db(t)$ and render
$\adapt(t)$ intensity-invariant, but only sufficiently low $\fc$ will leave the
relevant amplitude dynamics of the song pattern intact.
% Hard case+ - When a parameter is both relevant and irrelevant within a function:
Song intensity is a function of the song amplitudes within a certain time
frame. ``Invariance to variations in song intensity'' is hence entirely a matter
of time scales. It can refer to intensity variations across different
songs~(longer time scales) or intensity variations across the syllables within
a song~(shorter time scales), but also to the intensity difference that
differentiates a syllable from a pause~(very short time scales). The time scale
of intensity invariance must therefore be sufficiently long to leave the
syllables and pauses of the song pattern intact. In the model pathway, this
time scale-selectivity is reflected by the cutoff frequency $\fc$ of the
highpass filter that underlies the adaptation of $\adapt(t)$: Most $\fc$ except
the lowest ones are effective in removing the local offset of $\db(t)$ and
render $\adapt(t)$ intensity-invariant, but only sufficiently low $\fc$
preserve the relevant amplitude dynamics of the song pattern. Intensity
invariance by thresholding and temporal averaging also has a relevant time
scale, which is determined by the averaging interval $\tlp$. However, this time
scale is not constrained by the need to preserve the temporal structure of the
song pattern but by the need to provide a suitable degree of temporal integration across
the song pattern~(Section\,\ref{sec:constant_feat}).
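
The time scale argument for the first mechanism can be sketched in Python under simplifying assumptions. Logarithmic compression turns a multiplicative change in amplitude into an additive offset, which a highpass filter then removes. Here the highpass is approximated by subtracting a centred running mean, with a long window corresponding loosely to a low $\fc$. The toy envelope, the window lengths, and the helper name `adapt` are illustrative choices, not the model's actual filter.

```python
import numpy as np

def adapt(envelope, win):
    """Log-compress the envelope, then subtract a centred running mean."""
    db = 20 * np.log10(envelope)                     # logarithmic compression (dB)
    kernel = np.ones(win) / win
    offset = np.convolve(db, kernel, mode="valid")   # slowly varying local offset
    start = (win - 1) // 2                           # align samples with window centres (win odd)
    return db[start:start + offset.size] - offset    # crude highpass: remove the offset

fs = 1000                                            # sampling rate in Hz (assumed)
t = np.arange(0, 3.0, 1 / fs)
envelope = 1.0 + 0.9 * np.sin(2 * np.pi * 10 * t)    # toy syllable-pause pattern, always > 0

slow = adapt(envelope, 501)            # long window, loosely a low cutoff frequency
slow_loud = adapt(10 * envelope, 501)  # same pattern at ten times the amplitude
fast = adapt(envelope, 11)             # short window, loosely a high cutoff frequency

range_slow = slow.max() - slow.min()   # modulation depth that survives adaptation, in dB
range_fast = fast.max() - fast.min()
print(np.allclose(slow, slow_loud), range_slow, range_fast)
```

In this sketch the ten-fold louder envelope produces the same adapted signal, since the 20 dB offset is removed. The long window keeps most of the roughly 25 dB modulation of the pattern, while the short window also removes most of the pattern itself.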
\subsection{Intensity invariance versus SNR}
@@ -1795,6 +1839,7 @@ logarithmically compressed stimulus intensities are a common property of
sensory neurons across various modalities~(SOURCE?), and neurons of the
grasshopper auditory system are no exception~(\bcite{suga1960peripheral};
\bcite{gollisch2002energy}).
\\$\rightarrow$ \bcite{ronacher2004neuronal}
\subsection{Implications for behavior in a natural acoustic environment}
@@ -1820,6 +1865,8 @@ all nearby individuals. Importantly, the limitation of intensity invariance by
SNR likely applies to all grasshoppers regardless of species, so that the
behavioral strategies could be shared among the species that coexist in a given
habitat.
\\ \bcite{kramer2018robustness}
\\ \bcite{einhaupl2011attractiveness}
% Because the presumed restriction of song recognition
% by means of the noise floor applies to all grasshoppers in a certain area,