Kind of done drafting the discussion. Needs polishing.
240
main.tex
@@ -258,14 +258,15 @@ substrate for conspecific song recognition and response
initiation~(\bcite{ronacher1986routes}; \bcite{bauer1987separate};
\bcite{bhavsar2017brain}).

Functionally, the ascending neurons are the most diverse of the three neuronal
populations. Around 15 to 20 ascending neurons have been identified in the
grasshopper auditory system~(\bcite{stumpner1991auditory}). Individual
ascending neurons possess highly specific response properties that contrast
with the rather homogeneous response properties of the preceding receptor
neurons and local interneurons~(\bcite{clemens2011efficient}), which indicates
a transition from a uniform population-wide processing stream into several
parallel branches. Accordingly, the model pathway is divided into two distinct
Around 15 to 20 ascending neurons have been identified in the grasshopper
auditory system~(\bcite{stumpner1991auditory}), whose functional
characteristics are conserved even between species that are not closely
related~(\bcite{neuhofer2008evolutionarily}). The population of ascending
neurons possesses a diverse range of response properties that contrasts with
the rather homogeneous responses of receptor neurons and local
interneurons~(\bcite{clemens2011efficient}), which suggests a transition from a
uniform population-wide processing stream into several parallel branches.
Accordingly, the model pathway is divided into two distinct
stages~(Fig.\,\ref{fig:pathway}d): The preprocessing stage incorporates the
processing steps at the levels of the tympanal membrane, the receptor neurons,
and the local interneurons; and operates on one-dimensional signal
@@ -275,6 +276,26 @@ downstream towards the SEG; and operates on high-dimensional signal
representations~(Fig.\,\ref{fig:stages_feat}). The details of each
physiological processing step and its functional approximation are described in
the following sections.

Around 15 to 20 ascending neurons have been identified in the grasshopper
auditory system~(\bcite{stumpner1991auditory}), whose functional
characteristics are conserved even between species that are not closely
related~(\bcite{neuhofer2008evolutionarily}). The population of ascending
neurons possesses a diverse range of response properties that contrasts with
the rather homogeneous responses of receptor neurons and local
interneurons~(\bcite{clemens2011efficient}), which suggests a transition from a
uniform population-wide processing stream into several parallel branches.
Accordingly, the model pathway is divided into two distinct
stages~(Fig.\,\ref{fig:pathway}d): The preprocessing stage incorporates the
processing steps at the levels of the tympanal membrane, the receptor neurons,
and the local interneurons; and operates on one-dimensional signal
representations~(Fig.\,\ref{fig:stages_pre}). The feature extraction stage
corresponds to the processing within the ascending neurons and further
downstream towards the SEG; and operates on high-dimensional signal
representations~(Fig.\,\ref{fig:stages_feat}). The details of each
physiological processing step and its functional approximation are described in
the following sections.

\begin{figure}[!ht]
\centering
\includegraphics[width=\textwidth]{figures/fig_auditory_pathway.pdf}
@@ -1549,6 +1570,7 @@ have not been subject to decades of study will likely not be suitable for this
approach yet.

\subsection{Feature representation, temporal averaging, and song design}
\label{sec:constant_feat}

The feature set is the final song representation along the model pathway and
constitutes the basis for song recognition. Each feature $f_i(t)$ results from
@@ -1703,12 +1725,15 @@ lowpass filter --- the lower $\fc$, the higher the SNR of $\env(t)$ at a given
$\sca$. However, $\fc$ must not be too low to avoid the attenuation of relevant
amplitude dynamics of the song pattern. The saturation level of $\adapt(t)$,
unlike its saturation point, is independent of the SNR of $\env(t)$ because the
influence of $\noc(t)$ is negligible for sufficiently large $\sca$. Both the
saturation level and the saturation point of $\adapt(t)$ vary between different
species and specific songs. These differences are likely rooted in the way in
which logarithmic compression acts on the specific distribution of $\env(t)$,
which is determined by $\fc$ and the structure and frequency spectrum of the
rectified $\filt(t)$.
influence of $\noc(t)$ is negligible for sufficiently large $\sca$. The output
SNR of $\adapt(t)$ saturates at a comparatively low value of around 10. This might
in part be a consequence of the logarithm, which compresses different higher
intensities but also amplifies lower intensities, including the noise floor.
Both the saturation level and the saturation point of $\adapt(t)$ vary between
different species and individual songs. These differences are likely rooted in
the way in which logarithmic compression acts on the specific distribution of
$\env(t)$, which is determined by $\fc$ as well as the temporal structure and
frequency spectrum of the rectified $\filt(t)$.
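
As a rough illustration of why logarithmic compression followed by adaptation
can decouple $\adapt(t)$ from $\sca$, consider the idealized case in which
$\noc(t)$ is negligible, so that $\env(t)$ is approximately a scaled copy of
the envelope of the unscaled song component. The symbol $e_{\mathrm{s}}(t)$ for
that unscaled envelope is introduced here for illustration only and is not part
of the model notation; the base of the logarithm is left unspecified.
\begin{equation}
  \log \env(t) \;\approx\; \log\bigl(\sca \, e_{\mathrm{s}}(t)\bigr)
  \;=\; \log \sca + \log e_{\mathrm{s}}(t)
\end{equation}
Under this assumption, a change in $\sca$ only shifts the compressed envelope
by the constant $\log \sca$, which a subsequent adaptation step that removes
slowly varying offsets would be expected to cancel. Once $\noc(t)$ is no longer
negligible, this additive decomposition does not hold, which is consistent with
the dependence of the saturation point on the SNR of $\env(t)$.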

Thresholding and temporal averaging render feature $f_i(t)$
intensity-invariant for sufficiently large $\sca$. The trade-off between
@@ -1720,71 +1745,103 @@ invariance by the previous mechanism is neglected. The SNR of $f_i(t)$ is
therefore determined solely by the pure-noise response of $f_i(t)$. The
distribution $\pci$ of the pure-noise kernel response $c_i(t)$ is largely a
normal distribution with mean $\mu\approx0$ for all kernels $k_i(t)$. The value
of the pure-noise $f_i(t)$ is hence 0.5 for $\thr=0$ and decreases for larger
of the pure-noise $f_i(t)$ is hence 0.5 for $\thr=0$ and decreases for higher
$\thr$. If $\thr$ is set above the maximum of $c_i(t)$, the pure-noise feature
value is 0, which results in an ``unlimited'' SNR of $f_i(t)$ at the cost of a
higher saturation point. In this case, any non-zero feature value that is
sustained for a sufficient duration could serve as an indicator for the presence
of $\soc(t)$ in addition to $\noc(t)$. This requires a fine evolutionary tuning
of $\thr$ to the properties of both the species-specific song and the natural
value is 0, which results in an ``unlimited'' SNR of $f_i(t)$. In this case, any
non-zero feature value that is sustained for a sufficient duration could serve
as an indicator for the presence of $\soc(t)$, although at the cost of a higher
saturation point. The maximum of the pure-noise $c_i(t)$ is assumed to be very
small due to the various SNR improvements along the pathway, so that the
required increase in $\thr$ and hence the saturation point of $f_i(t)$ is not
expected to be substantial. However, exploiting the capacity of $f_i(t)$ for
arbitrarily high SNR would certainly require a fine evolutionary tuning of
$\thr$ to the properties of both the species-specific song and the natural
noise in a certain habitat.
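
The dependence of the pure-noise feature value on $\thr$ can be made explicit
under the idealized assumption that the pure-noise $c_i(t)$ is exactly normally
distributed with mean zero and standard deviation $\sigma_i$, and that temporal
averaging of the thresholded response yields the fraction of time for which
$c_i(t)$ exceeds $\thr$. The symbol $\sigma_i$ and the standard normal
cumulative distribution function $\Phi$ are introduced here for illustration
only.
\begin{equation}
  f_i \;\approx\; P\bigl(c_i(t) > \thr\bigr)
  \;=\; 1 - \Phi\!\left(\frac{\thr}{\sigma_i}\right)
\end{equation}
This expression gives 0.5 for $\thr=0$ and decreases monotonically towards 0
for increasing $\thr$. For a Gaussian with unbounded tails, the value would
never reach exactly 0; in a finite realization of $\noc(t)$, however, $c_i(t)$
has a finite maximum, above which the pure-noise feature value vanishes.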

It seems reasonable to assume that $\thr$ is one of the parameters along the
pathway

Physiologically, it is presumably easier to
manipulate $\thr$

It seems reasonable that $\thr$ is easier to
manipulate in ev

Furthermore, $\thr$ is presumably a parameter along
the pathway that

$\thr$

Furthermore, $\thr$ might be one of the parameters
along the pathway

% However, the parameters that determine the SNR of $\adapt(t)$ are much less
% understood and likely relate to properties of the signal, whereas the SNR of
% $f(t)$ depends on the choice of $\Theta$ and can be more directly manipulated
% by the system.

\newpage
\textbf{Thresh-LP: Implication for intensity invariance:}\\

- Suggests a relatively simple rule for optimal choice of threshold value $\thr$:\\
$\rightarrow$ Find amplitude $c_i$ that maximizes absolute derivative of $c_i(t)$ over time\\
$\rightarrow$ Optimal with respect to intensity invariance of $f_i(t)$, not necessarily for
other criteria such as song-noise separation or diversity between features

- Nonlinear operations can be used to detach representations from graded physical
stimulus (to facilitate categorical behavioral decision-making?):\\
1) Capture sufficiently precise amplitude information: $\env(t)$, $\adapt(t)$\\
$\rightarrow$ Closely following the AM of the acoustic stimulus\\
2) Quantify relevant stimulus properties on a graded scale: $c_i(t)$\\
$\rightarrow$ More decorrelated representation, compared to prior stages\\
3) Nonlinearity: Distinguish between ``relevant vs irrelevant'' values: $b_i(t)$\\
$\rightarrow$ Trading a graded scale for two or more categorical states\\
4) Represent stimulus properties under relevance constraint: $f_i(t)$\\
$\rightarrow$ Graded again but highly decorrelated from the acoustic stimulus\\
5) Categorical behavioral decision-making requires further nonlinearities\\
$\rightarrow$ Parameters of a behavioral response may be graded (e.g. approach speed),
initiation of one behavior over another is categorical (e.g. approach/stay)

\subsection{Intensity invariance versus intensity invariance}

Two consecutive mechanisms of intensity invariance do not necessarily add up to
a stronger overall intensity invariance. If the first mechanism results in a
lower saturation point than the second mechanism by itself, the saturation
point of feature $f_i(t)$ will be determined solely by the first mechanism. In
this case, the saturation level of $f_i(t)$ will conform to the intensity that
$f_i(t)$ can reach for the given saturation point rather than the intrinsic
saturation level of $f_i(t)$. Conversely, if the second mechanism results in a
lower saturation point than the first mechanism, both the saturation point and
the saturation level of $f_i(t)$ will be determined by the second mechanism.
The saturation points of $f_i(t)$ across the set are distributed over a much
wider range than those of the preceding kernel responses $c_i(t)$, which
suggests that the interaction between the two mechanisms is specific to
individual kernels $k_i(t)$. A number of $f_i(t)$ achieve a lower saturation
point than the respective $c_i(t)$, while some $f_i(t)$ exhibit similar or only
marginally lower saturation points. This raises the question whether two
consecutive mechanisms of intensity invariance are actually beneficial for the
overall system.

From a purely functional perspective, the answer could be that logarithmic
compression and adaptation is a necessary preprocessing step towards a robust
feature representation, even if thresholding and temporal averaging alone would
be sufficient to render $f_i(t)$ intensity-invariant. This preprocessing likely
improves the temporal regularity of the song pattern in $\adapt(t)$ and
$c_i(t)$, which is required for constant $f_i(t)$ across the duration of a
song~(Section\,\ref{sec:constant_feat}). It also ensures consistency between
the distribution $\pci$ of $c_i(t)$ across songs of different intensity, which
is essential for the generation of consistent species-specific $f_i(t)$ under a
static $\thr$. From a physiological perspective, the answer is likely that
neurons possess only a limited range of firing rates for encoding stimulus
intensities that can range over several orders of magnitude. Sigmoidal tuning
curves over logarithmically compressed stimulus intensities are a common
property of sensory neurons across various modalities~(SOURCE?), and neurons of
the grasshopper auditory system are no exception~(\bcite{suga1960peripheral};
\bcite{gollisch2002energy}).

\subsection{Implications for behavior in a natural acoustic environment}

% RIPPED FROM INTRODUCTION:
Most grasshoppers live in environments that are communally inhabited by
numerous individuals from multiple species. Their acoustic environment is
characterized by noise from various sources --- abiotic ones like wind and
water, but also the songs of both hetero- and conspecifics. This limits the SNR
that each individual can achieve for its own song, and hence the effectiveness
of the intensity-invariant processing in the auditory system. Producing higher
song intensities is not a viable solution to this problem, because these also
contribute to the overall noise floor. A possible behavioral solution could be
to produce songs in a ``turn-taking'' manner to avoid the temporal superposition
of multiple songs into overly intense signals. This would also prevent the
mutual distortion of the respective song pattern. Another solution could be to
spatially separate from other nearby grasshoppers to spread the potential noise
sources over a larger area. However, according to our analysis based on field
recordings as well as previous work on the topic~(\bcite{lang2000acoustic}),
reliable song recognition is limited to little more than 1\,m from the sender,
so that a grasshopper also cannot afford to stay too far away from its
conspecifics. A better solution may hence be to collectively produce songs at
lower-than-possible intensities, which would reduce the overall noise floor for
all nearby individuals. Importantly, the limitation of intensity invariance by
SNR likely applies to all grasshoppers regardless of species, so that the
behavioral strategies could be shared among the species that coexist in a given
habitat.

% Because the presumed restriction of song recognition
% by means of the noise floor applies to all grasshoppers in a certain area,
% these strategies may not be specific to some of the species at this location.
% Instead, they must be shared by all grasshopper species that coexist within a
% portion of a given habitat, which would provide an important implication for
% the evolution of grasshopper songs in communities of multiple species.

%%% RELICS OF INTRODUCTION %%%
% - Nonlinear operations can be used to detach representations from graded physical
% stimulus (to facilitate categorical behavioral decision-making?):\\
% 1) Capture sufficiently precise amplitude information: $\env(t)$, $\adapt(t)$\\
% $\rightarrow$ Closely following the AM of the acoustic stimulus\\
% 2) Quantify relevant stimulus properties on a graded scale: $c_i(t)$\\
% $\rightarrow$ More decorrelated representation, compared to prior stages\\
% 3) Nonlinearity: Distinguish between "relevant vs irrelevant" values: $b_i(t)$\\
% $\rightarrow$ Trading a graded scale for two or more categorical states\\
% 4) Represent stimulus properties under relevance constraint: $f_i(t)$\\
% $\rightarrow$ Graded again but highly decorrelated from the acoustic stimulus\\
% 5) Categorical behavioral decision-making requires further nonlinearities\\
% $\rightarrow$ Parameters of a behavioral response may be graded (e.g. approach speed),
% initiation of one behavior over another is categorical (e.g. approach/stay)

% Multi-species, multi-individual communally inhabited environments\\
% - Temporal overlap: Simultaneous singing across individuals/species common\\
@@ -1802,53 +1859,12 @@ initiation of one behavior over another is categorical (e.g. approach/stay)
% recognize the ones produced by conspecifics, and make appropriate behavioral
% decisions based on context (sender identity, song type, mate/rival quality)

% How can the auditory system of grasshoppers meet these challenges?\\
% - What are the minimum functional processing steps required?\\
% - Which known neuronal mechanisms can implement these steps?\\
% - Which and how many stages along the auditory pathway contribute?\\
% $\rightarrow$ What are the limitations of the system as a whole?

% How can a human observer conceive a grasshopper's auditory percepts?\\
% - How to investigate the workings of the auditory pathway as a whole?\\
% - How to systematically test effects and interactions of processing parameters?\\
% - How to integrate the available knowledge on anatomy, physiology, ethology?\\
% $\rightarrow$ Abstract, simplify, formalize $\rightarrow$ Functional model framework

\textbf{Differences between the model pathway and the previous framework:}
In the first step, a bank of parallel linear-nonlinear feature detectors is
applied to the input signal. Each feature detector consists of a convolutional
filter and a subsequent sigmoidal nonlinearity. The outputs of these feature
detectors are temporally averaged to obtain a single feature value per
detector, which is then assigned a specific weight. The linear combination of
weighted feature values results in a single preference value that serves as a
predictor for the behavioral response of the animal to the presented input
signal. Our model pathway adopts the general structure of the existing
framework but modifies it in several key aspects. The convolutional filters,
which have previously been fitted to behavioral data for each individual
species~(\bcite{clemens2013computational}), are replaced by a larger, generic
set of unfitted Gabor basis functions in order to cover a wide range of
possible song features across different species. Gabor functions approximate
the general structure of the filters used in the existing framework as well as
the filter functions found in various auditory neurons~(\bcite{rokem2006spike};
\bcite{clemens2011efficient}; \bcite{clemens2012nonlinear}). The fitted
sigmoidal nonlinearities in the existing framework consistently exhibited very
steep slopes and are therefore replaced by shifted Heaviside step functions,
which results in a binarization of the feature detector outputs. Another, more
substantial modification is that the feature detector outputs are temporally
averaged in a way that does not condense them into single feature values but
retains their time-varying structure. This is in line with the fact that songs
are not discrete units but part of a continuous acoustic stream that the
auditory system has to process in real time. Moreover, a time-varying feature
representation only stabilizes after a certain delay following the onset of a
song, which emphasizes the temporal dynamics of evidence accumulation towards a
final categorical decision. The most notable difference between our model
pathway and the existing framework, however, lies in the addition of a
physiologically inspired preprocessing stage, whose starting point corresponds
to the initial reception of airborne sound waves. This allows the model to
operate on unmodified recordings of natural grasshopper songs instead of
condensed pulse train approximations, which widens its scope towards more
realistic, ecologically relevant scenarios.
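
For orientation, the feature extraction stage described above can be summarized
in compact form. The following is a schematic sketch under stated assumptions
rather than the exact model definition: it assumes that the kernels operate on
$\adapt(t)$, uses $H$ for the Heaviside step function and $h(t)$ for a lowpass
kernel that implements the temporal averaging, and writes a Gabor kernel with
width $\sigma_i$, frequency $\nu_i$, and phase $\phi_i$. The symbols $H$, $h$,
$\sigma_i$, $\nu_i$, and $\phi_i$ are introduced here for illustration only.
\begin{eqnarray}
  k_i(t) &=& \exp\!\left(-\frac{t^2}{2\sigma_i^2}\right)
             \cos\!\left(2\pi\nu_i t + \phi_i\right) \\
  c_i(t) &=& (k_i \ast \adapt)(t) \\
  b_i(t) &=& H\bigl(c_i(t) - \thr\bigr) \\
  f_i(t) &=& (h \ast b_i)(t)
\end{eqnarray}
In the previous framework, the last step would instead be an average over the
whole stimulus, followed by a weighted sum of the resulting feature values into
a single preference value.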

\newpage
\section{Appendix}