Polished most of yesterday's work.

j-hartling
2026-01-23 16:18:56 +01:00
parent 51417ab634
commit 3b757ac9e1
12 changed files with 317 additions and 242 deletions

main.tex

@@ -191,19 +191,19 @@ degree of intensity invariance.
Invariance to non-informative song variations is crucial for reliable song
recognition; however, it is not sufficient on its own. In order to recognize a
conspecific song as such, the auditory system needs to extract sufficiently
informative features of the song pattern and then integrate the gathered
information into a final categorical percept. Previous authors have proposed a
functional model framework that describes this process --- feature extraction,
evidence accumulation, and categorical decision making --- in both
crickets~(\bcite{clemens2013computational}, \bcite{hennig2014time}) and
grasshoppers~(\bcite{clemens2013feature}, review on
both:~\bcite{ronacher2015computational}). Their framework provides a
comprehensible and biologically plausible account of the computational
mechanisms required for species-specific song recognition, which has served as
the inspiration for the development of the model pathway we propose here. The
existing framework relies on pulse trains as input signals, which were designed
to capture the essential structural properties of natural song
envelopes~(\bcite{clemens2013feature}). In the first step, a bank of parallel
linear-nonlinear feature detectors is applied to the input signal. Each feature
detector consists of a convolutional filter and a subsequent sigmoidal
@@ -215,77 +215,41 @@ animal to the presented input signal. Our model pathway adopts the general
structure of the existing framework but modifies it in several key aspects. The
convolutional filters, which have previously been fitted to behavioral data for
each individual species~(\bcite{clemens2013computational}), are replaced by a
larger, generic set of unfitted Gabor basis functions in order to cover a wide
range of possible song features across different species. Gabor functions
approximate the general structure of the filters used in the existing framework
as well as the filter functions found in various auditory
neurons~(\bcite{rokem2006spike}, \bcite{clemens2011efficient},
\bcite{clemens2012nonlinear}). The fitted sigmoidal nonlinearities in the
existing framework consistently exhibited very steep slopes and are therefore
replaced by shifted Heaviside step-functions, which results in a binarization
of the feature detector outputs. Another, more substantial modification is that
the feature detector outputs are temporally averaged in a way that does not
condense them into single feature values but retains their time-varying
structure. This is in line with the fact that songs are not discrete units but
part of a continuous acoustic stream that the auditory system has to process in
real time. Moreover, a time-varying feature representation only stabilizes
after a certain delay following the onset of a song, which emphasizes the
temporal dynamics of evidence accumulation towards a final categorical
decision. The most notable difference between our model pathway and the
existing framework, however, lies in the addition of a physiologically inspired
preprocessing stage, whose starting point corresponds to the initial
reception of airborne sound waves. This allows the model to operate on
unmodified recordings of natural grasshopper songs instead of condensed pulse
train approximations, which widens its scope towards more realistic,
ecologically relevant scenarios. For instance, we were able to investigate the
contribution of different processing stages to the emergence of
intensity-invariant song representations based on actual field recordings of
songs at different distances from the sender.
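The modified feature-detection step can be summarized compactly. Writing
$e(t)$ for the preprocessed envelope signal, each detector $i$ applies a Gabor
kernel followed by a shifted Heaviside step function; the concrete
parameterization below is an illustrative sketch in our own notation, not a
formulation taken from the cited works:
\begin{equation*}
  g_i(t) = \exp\!\left(-\frac{(t - t_{0,i})^2}{2\sigma_i^2}\right)
           \cos\!\bigl(2\pi f_i (t - t_{0,i}) + \varphi_i\bigr),
  \qquad
  y_i(t) = \Theta\!\bigl((g_i * e)(t) - \theta_i\bigr),
\end{equation*}
where $\Theta$ denotes the Heaviside step function and $\theta_i$ the detector
threshold. A moving average over a window of length $\tau$ then yields the
time-varying feature representation
\begin{equation*}
  r_i(t) = \frac{1}{\tau} \int_{t-\tau}^{t} y_i(t')\,\mathrm{d}t',
\end{equation*}
which stabilizes only after roughly $\tau$ has elapsed following song onset.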
In the following, we outline the structure of the proposed model of the
grasshopper auditory pathway, from the initial sound reception at the tympanal
membrane up to the generation of a high-dimensional, time-varying feature
representation that is suitable for species-specific song recognition. We
provide a side-by-side account of the known physiological processing steps and
their functional approximation by basic mathematical operations. We then
elaborate on two key mechanisms that drive the emergence of intensity-invariant
song representations within the auditory pathway.