Busy integrating more sources into discussion.

2026-06-02 18:21:25 +02:00
parent dea5923dd7
commit d6ed3f1664
9 changed files with 205 additions and 103 deletions
--- a/main.tex
+++ b/main.tex
@@ -167,8 +167,8 @@ Grasshopper songs are amplitude-modulated broad-band acoustic signals. They
 consist of a series of noisy syllables and relatively quiet pauses, which form
 a characteristic repetitive pattern~(\bcite{helversen1977stridulatory};
 \bcite{stumpner1994song}). Song recognition depends on certain structural
-parameters of this pattern --- such as the duration of syllables and
-pauses~(\bcite{helversen1972gesang}), the slope of pulse
+parameters of this pattern --- such as the ratio of syllable duration to pause
+duration~(\bcite{helversen1972gesang}), the slope of pulse
 onsets~(\bcite{helversen1993absolute}), and the accentuation of syllable onsets
 relative to the preceeding pause~(\bcite{balakrishnan2001song};
 \bcite{helversen2004acoustic}) --- which are sufficiently conveyed by the
@@ -1000,26 +1000,23 @@ corresponding averaging interval $\tlp$:
    f(t)\,\approx\,\int_{\Theta}^{+\infty} \pclp\,dc\,=\,P(c\,>\,\Theta,\,\tlp)
    \label{eq:feat_prop}
 \end{equation}
-In a sense, $f(t)$ can be interpreted as some sort of duty cycle with respect
-to $\Theta$. For example, a feature value of $f(t)=0.4$ means that $c(t)$
-exceeds $\Theta$ for approximately 40\,\% of the time within $\tlp$ around $t$.
-In the most extreme cases, $\Theta$ lays either above the maximum of $c(t)$ or
-below the minimum of $c(t)$, which results in a minimum or maximum possible
-feature value of $f(t)=0$~(Fig.\,\ref{fig:thresh-lp_single}d, left column) or
-$f(t)=1$, respectively.
+In a sense, $f(t)$ can be interpreted as some sort of duty cycle of $c(t)$ with
+respect to $\Theta$. For example, a feature value of $f(t)=0.4$ means that
+$c(t)$ exceeds $\Theta$ for approximately 40\,\% of the time within $\tlp$
+around $t$. In the most extreme cases, $\Theta$ lies either above the maximum
+of $c(t)$ or below the minimum of $c(t)$, which results in a minimum or maximum
+possible feature value of $f(t)=0$~(Fig.\,\ref{fig:thresh-lp_single}d, left
+column) or $f(t)=1$, respectively.

 Importantly, $f(t)$ neither retains information about the timing of individual
 threshold crossings nor the precise values of $c(t)$ apart from their relation
-to $\Theta$. Accordingly, for a given $\Theta$, different $\sca$ can still
-result in similar $T_1$ segments (and hence similar feature values) depending
-on the magnitude of the derivative of $c(t)$ in temporal proximity to time
-points at which $c(t)$ crosses $\Theta$: The steeper the slope of $c(t)$, the
-less $T_1$ changes with variations in $\sca$. The most reliable way of
-exploiting this invariant porperty of $f(t)$ is to set $\Theta$ to a value near
-0, because these values are least affected by different scales of $c(t)$. For
-sufficiently large $\sca$, $f(t)$ then approaches the same constant $\mu_f$ in
-both the noiseless and the noisy case~(Fig.\,\ref{fig:thresh-lp_single}e,
-saturation regime).
+to $\Theta$. Different $\sca$ can hence result in similar feature values by
+producing similar $T_1$ segments. The most reliable way of exploiting this
+invariant property of $f(t)$ is to set $\Theta$ to a value near 0, because
+these values are least affected by different scales of $c(t)$. For sufficiently
+large $\sca$, $f(t)$ then approaches the same constant $\mu_f$ in both the
+noiseless and the noisy case~(Fig.\,\ref{fig:thresh-lp_single}e, saturation
+regime).

 The saturation level of $f(t)$ is independent of the precise value of $\Theta$,
 but the saturation point decreases with
@@ -1526,52 +1523,55 @@ natural song variation.
 \newpage
 \section{Discussion}

+% Recap of main findings:
 In the current study, we have established a physiologically inspired functional
-model of the grasshopper song recognition pathway. The model pathway covers the
-entire auditory processing stream, from the sound reception at the tympanal
-membrane over peripheral receptor neurons and local interneurons up to the
-generation of a high-dimensional feature representation at the level of the
-ascending neurons and beyond in the SEG. Using this model pathway, we have
+model of the grasshopper song recognition pathway, from the sound reception at
+the tympanal membrane over peripheral receptor neurons and local interneurons
+to the generation of a high-dimensional feature representation at the level of
+the ascending neurons and beyond in the SEG. Using this model pathway, we have
 identified two computational key mechanisms for the emergence of
 intensity-invariant song representations. Each mechanism comprises a nonlinear
 transformation and a subsequent linear transformation. The first mechanism
-consists of logarithmic compression and adaptation, which takes place at the
-level of the receptor neurons and local interneurons. The second mechanism
-consists of thresholding and temporal averaging, which takes place either at
-the level of the ascending neurons or further downstream in the SEG. Systematic
-investigation of both mechanisms revealed a persistent trade-off between the
-intensity invariance and the SNR of the song representations along the pathway.
-In the following, we discuss the capabilities and limitations of our model
-approach as well as the implications of our findings for the design of the
-grasshopper auditory system, the evolution of species-specific grasshopper
-songs, and the ethological relevance of intensity invariance in a natural
-acoustic environment.
+consists of logarithmic compression and adaptation by highpass filtering, which
+takes place at the level of the receptor neurons and local interneurons. The
+second mechanism consists of thresholding and temporal averaging by lowpass
+filtering, which takes place either at the level of the ascending neurons or
+further downstream in the SEG. Systematic investigation of both mechanisms
+revealed a persistent trade-off between the intensity invariance and the SNR of
+the song representations along the pathway. In the following, we briefly
+reflect on the potential of functional modelling for research on sensory
+systems. We then discuss the implications of our findings for the evolutionary
+design of both the auditory system and the species-specific songs of
+grasshoppers as well as the ethological relevance of intensity invariance in a
+natural acoustic environment.

 \subsection{Leveraging functional modelling to investigate sensory systems}

+% Functional modelling is cool but bound to freeload on previous research:
 Our understanding of sensory processing systems is based on the distributed
 accumulation of anatomical, physiological, and ethological evidence. Functional
 modelling provides a powerful tool to integrate the available fragments into a
 coherent whole. It fasciliates systematic, reproducible investigations of
-relevant parameters such as scale $\sca$ or threshold value $\thr$. Moreover,
-it allows to address questions of broader scope by generalizing from concrete
+relevant parameters, such as scale $\sca$ or threshold value $\thr$. It also
+allows us to address questions of broader scope by generalizing from concrete
 evidence. For instance, the interaction between the two mechanisms of intensity
 invariance is most assessible if both mechanisms can be treated as consecutive
 stages along the pathway --- where the output of the first stage relates
 directly to the input of the second stage --- rather than separate entities.
-The model pathway also provides a general basis for comparing song
+Moreover, the model pathway provides a general basis for comparing song
 representations across different species without the need for species-specific
-models. However, the potential of functional modelling for research on sensory
-systems depends entirely on the amount of available knowledge about the system.
-The grasshopper song recognition pathway is a comparably simple and very
-well-understood system and is therefore a particularly suitable candidate for
-functional modelling. Other sensory systems that are either more complex or
-have not been subject to decades of study will likely not be suitable for this
-approach yet.
+models. However, the potential of functional modelling for research on a
+sensory system depends entirely on the amount of available knowledge about the
+system and its specific stimuli. The grasshopper song recognition pathway is a
+comparatively simple, extensively researched, and hence well-understood sensory
+system and is therefore a particularly suitable candidate for functional
+modelling. Other sensory systems that are either more complex or have not been
+subject to decades of study will likely not be suitable for this approach yet.

-\subsection{Feature representation, temporal averaging, and song design}
+\subsection{Song design, temporal averaging, and feature representation}
 \label{sec:constant_feat}

+% Recap of feature theory and relevant parameters:
 The feature set is the final song representation along the model pathway and
 constitutes the basis for song recognition. Each feature $f_i(t)$ results from
 the thresholding of the respective kernel response $c_i(t)$ by $\nl$ and the
@@ -1582,24 +1582,73 @@ the threshold value $\thr$ within the averaging interval $\tlp$ specified by
 $\fc$. The value of $f_i(t)$ is hence determined by $\thr$ with respect to the
 distribution $\pci$ of $c_i(t)$ and is restricted to the interval $[0,1]$.

+% Feature representation and the constraint of repetitive song structure:
 Different species-specific songs are represented by different combinations of
 feature values, which should preferably be constant for the duration of a song
-to enable reliable recognition. The fundamental requirement for a constant
-$f_i(t)$ is that the time where $c_i(t)>\thr$ during $\tlp$ is the same for all
-$t$, which is fulfilled if $\pci$ is stable across $t$. The most
-straightforward way to achieve a stable $\pci$ is that $c_i(t)$ is periodic and
-$\tlp$ is sufficiently long to average over multiple cycles of $c_i(t)$.
-Song-evoked $c_i(t)$ are indeed approximately periodic, which is largely an
+to facilitate recognition. The fundamental requirement for constant $f_i(t)$ is
+that the time where $c_i(t)>\thr$ during $\tlp$ is the same for all $t$, which
+is fulfilled if $\pci$ is stable across $t$. The most straightforward way to
+achieve a stable $\pci$ is that $c_i(t)$ is periodic and $\tlp$ is sufficiently
+long to average over multiple cycles of $c_i(t)$. Most song-evoked $c_i(t)$ are
+indeed highly repetitive, albeit not perfectly periodic, which is largely an
 inherited property of the song itself. Most grasshopper songs are produced by
 stridulation, which refers to the pulling of the serrated stridulatory file on
 the hindlegs across a resonating vein on the
 forewings~(\bcite{helversen1977stridulatory}; \bcite{stumpner1994song};
-\bcite{helversen1997recognition}). Every "tooth" that strikes the vein
-generates a brief sound pulse; multiple pulses make up a syllable; and the
-repetition of syllables and pauses results in a pattern with a high degree of
-temporal regularity. Accordingly, a robust feature representation in the sense
-of constant $f_i(t)$ is tightly linked to the mechanism of sound production and
-the temporal structure of the generated song.
+\bcite{helversen1997recognition}). Every ``peg'' that strikes the vein generates
+a brief sound pulse; multiple pulses make up a syllable; and the repetition of
+syllables and pauses results in a pattern with a high degree of temporal
+regularity. A repetitive motor pattern during stridulation hence lays the basis
+for constant $f_i(t)$.
+
+The second requirement for constant $f_i(t)$ is a suitable averaging interval
+$\tlp$. The minimum $\tlp$ should encompass at least a few cycles of $c_i(t)$
+to ensure a stable $\pci$. The maximum $\tlp$ should not exceed the duration of
+the song to avoid the inclusion of noise. The duration of species-specific
+grasshopper songs can range from a few hundred milliseconds~(e.\,g.
+\textit{Stethophyma grossum}) to well over a minute~(e.\,g. \textit{C.
+mollis}), so that the optimal $\tlp$ likely differs between species. The longer
+$\tlp$, the longer $f_i(t)$ takes to stabilize after the onset of the song,
+which narrows the time window for reliable recognition.
+\\What about \bcite{ronacher1998song}??
+\\ $\rightarrow$ Answer might be \bcite{clemens2021sex}
+
+
+
+
+If the basis for constant $f_i(t)$ is already laid 
+
+The basis for constant $f_i(t)$ is hence already 
+
+The basis for a robust feature representation in the sense
+of constant $f_i(t)$ is hence already laid during the song production.
+
+If the feature representation relies on a repetitive song pattern, one would
+expect that grasshopper songs are evolutionarily constrained to include such a
+pattern.
+
+If constant $f_i(t)$ rely on a repetitive song pattern and are beneficial for
+reliable song recognition, one would expect that repetitiveness is a common
+design principle of species-specific grasshopper songs.
+
+, and if constant
+$f_i(t)$ are required for reliable song recognition, then one would expect that
+
+
+grasshopper songs are evolutionarily constrained to have such a repetitive
+structure.
+
+This is true for many species-specific calling songs but less for
+courtship songs, which tend to have a more complex structure~()
+
+If constant $f_i(t)$ rely on a repetitive song pattern and are beneficial for
+song recognition, then one would expect that grasshopper songs are
+evolutionarily constrained to have such a repetitive temporal structure.
+
+From an evolutionary perspective, one would then expect that grasshopper songs
+are evolutionarily constrained to have a repetitive temporal structure in order
+to elicit a robust feature representation.
+

 Various grasshopper species, especially those with longer songs like \textit{C.
 mollis}, \textit{G. rufus}, or \textit{O. rufipes}, tend to stridulate softly
@@ -1627,22 +1676,11 @@ song could result in similar $\pci$ despite their different temporal structure,
 which would allow for consistent $f_i(t)$ across the entire song. However, it
 appears more likely that only one part of the song encodes species identity,
 while the other part serves a different purpose such as fitness
-advertisement~(SOURCE?).
+advertisement~(\bcite{stumpner1992recognition}).

-Finally, the question remains how the choice of an appropriate averaging
-interval $\tlp$ depends on the duration and temporal structure of a song. The
-minimum $\tlp$ should encompass at least a few cycles of $c_i(t)$ to ensure a
-stable $\pci$ and hence a constant $f_i(t)$. The maximum $\tlp$ should not
-exceed the duration of a song to avoid the inclusion of behaviorally irrelevant
-information. The longer $\tlp$, the longer $f_i(t)$ takes to stabilize after
-the onset and before the offset of a song, which narrows the time window for
-reliable recognition. The duration of species-specific grasshopper songs can
-range from a few hundred milliseconds (e\,.g \textit{Stethophyma grossum}) to
-well over a minute (e\,.g. \textit{C. mollis}), so that the optimal $\tlp$ is
-likely to differ between species.
-
-\subsection{Sensory invariances in the grasshopper auditory system}
+\subsection{Invariant processing in the grasshopper auditory system}

+% Invariance in the general (systemic) sense:
 The notion of invariance is fundamental for sensory processing systems.
 Invariance, in the general sense, can be described as the property of a
 transformation to maintain variation across certain meaningful input parameters
@@ -1651,46 +1689,52 @@ boils down to a selective input-output decorrelation that allows the system to
 represent only those aspects of the stimulus that are behaviorally relevant to
 the organism.

+% "Easy" case - Throw away parameters that are not relevant:
 The grasshopper auditory system has to deal with a number of sources of
 non-informative song variation. For instance, the temporal structure of the
-song pattern warps with temperature~(\bcite{skovmand1983song}). This also
-affects certain structural parameters that are essential for song recognition,
-mainly the duration of syllables and pauses. The auditory system can compensate
-for this variation by reading out relative temporal relationships rather than
-absolute time intervals~(\bcite{creutzig2009timescale};
-\bcite{creutzig2010timescale}). The ratio of syllable duration to pause
-duration is relatively constant across temperatures and has been shown to be
-suitable for song recognition~(\bcite{helversen1972gesang}), so that there is
-likely no need to retain any information about the absolute duration of
-syllables and pauses.
+song pattern warps with temperature~(\bcite{skovmand1983song}). The auditory
+system can compensate for this time warping by reading out relative temporal
+relationships, such as the ratio of syllable duration to pause duration, rather
+than the absolute time intervals~(\bcite{creutzig2009timescale};
+\bcite{creutzig2010timescale}). This allows for reliable song recognition
+across different temperatures~(\bcite{helversen1972gesang}). Accordingly, the
+auditory system likely does not retain any information about the precise
+duration of syllables and pauses.

+% Hard case - When a parameter is both relevant and irrelevant across functions:
 The situation is more complex for variations in song intensity. Song intensity
 at the receiver's position depends mostly on the distance to the sender and is
-hence not a reliable cue to infer species identity. The auditory system should
-therefore be invariant to intensity variations to recognize conspecific songs
+therefore not a reliable cue to infer species identity. The auditory system
+must hence be invariant to intensity variations to recognize conspecific songs
 regardless of sender distance. However, song intensity --- specifically, the
-interaural intensity difference --- is also required for directional hearing,
-which is essential for phonotaxis~(\bcite{helversen1988interaural}). Conflicts
-between song recognition and directional hearing are avoided in the auditory
-system by distributing both functions across two parallel
+interaural intensity difference --- is also a relevant cue for directional
+hearing, which is essential for phonotaxis~(\bcite{helversen1988interaural}).
+Interference between song recognition and directional hearing is avoided in the
+auditory system by distributing both functions across two parallel
 pathways~(\bcite{helversen1984parallel}; \bcite{ronacher1986routes}). This is
 the main reason why our model pathway is focused entirely on song recognition
-and has no capacity for directional hearing, no matter how relevant it may be
-to the grasshopper.
+and has no capacity for directional hearing, even though it is crucial to the
+grasshopper's behavior.

-Furthermore, "invariance to variations in song intensity" does not do justice
-to the full extent of the problem. Intensity is a function of song amplitude
-within a certain time frame. It can refer to the individual syllables and
-pauses of the song pattern as well as the entire song --- the former is
-relevant for song recognition, while the latter is not. Intensity invariance in
-the current context can therefore be described as time scale-selective
-sensitivity to the faster amplitude dynamics of the song pattern and
-simultaneous insensitivity to slower, more sustained amplitude dynamics. In the
-model pathway, this time scale selectivity is reflected by the cutoff frequency
-$\fc$ of the highpass filter that underlies the adaptation of $\adapt(t)$: Most
-$\fc$ are effective in removing the local offset of $\db(t)$ and render
-$\adapt(t)$ intensity-invariant, but only sufficiently low $\fc$ will leave the
-relevant amplitude dynamics of the song pattern intact.
+% Hard case+ - When a parameter is both relevant and irrelevant within a function:
+Song intensity is a function of the song amplitudes within a certain time
+frame. ``Invariance to variations in song intensity'' is hence entirely a matter
+of time scales. It can refer to intensity variations across different
+songs~(longer time scales) or intensity variations across the syllables within
+a song~(shorter time scales), but also to the intensity difference that
+differentiates a syllable from a pause~(very short time scales). The time scale
+of intensity invariance must therefore be sufficiently long to leave the
+syllables and pauses of the song pattern intact. In the model pathway, this
+time-scale selectivity is reflected by the cutoff frequency $\fc$ of the
+highpass filter that underlies the adaptation of $\adapt(t)$: Most $\fc$ except
+the lowest ones are effective in removing the local offset of $\db(t)$ and
+render $\adapt(t)$ intensity-invariant, but only sufficiently low $\fc$
+preserve the relevant amplitude dynamics of the song pattern. Intensity
+invariance by thresholding and temporal averaging also has a relevant time
+scale, which is determined by the averaging interval $\tlp$. However, this time
+scale is constrained not by the need to preserve the temporal structure of the
+song pattern but by the need to provide a suitable degree of temporal
+integration across the song pattern~(Section\,\ref{sec:constant_feat}).

 \subsection{Intensity invariance versus SNR}

@@ -1795,6 +1839,7 @@ logarithmically compressed stimulus intensities are a common property of
 sensory neurons across various modalities~(SOURCE?), and neurons of the
 grasshopper auditory system are no exception~(\bcite{suga1960peripheral};
 \bcite{gollisch2002energy}).
+\\$\rightarrow$ \bcite{ronacher2004neuronal}

 \subsection{Implications for behavior in a natural acoustic environment}

@@ -1820,6 +1865,8 @@ all nearby individuals. Importantly, the limitation of intensity invariance by
 SNR likely applies to all grasshoppers regardless of species, so that the
 behavioral strategies could be shared among the species that coexist in a given
 habitat.
+\\ \bcite{kramer2018robustness}
+\\ \bcite{einhaupl2011attractiveness}

 % Because the presumed restriction of song recognition
 % by means of the noise floor applies to all grasshoppers in a certain area,