Grinding through methods (WIP).

j-hartling
2026-02-10 16:24:47 +01:00
parent 1c4701f98c
commit 015a3032c1
13 changed files with 632 additions and 524 deletions

main.tex

@@ -10,6 +10,8 @@
\usepackage{parskip}
\usepackage{amsmath}
\usepackage{amssymb}
\usepackage{subcaption}
\usepackage[labelfont=bf, textfont=small]{caption}
\usepackage[separate-uncertainty=true, locale=DE]{siunitx}
\sisetup{output-exponent-marker=\ensuremath{\mathrm{e}}}
% \usepackage[capitalize]{cleveref}
@@ -60,9 +62,16 @@
\newcommand{\adapt}{\raw_{\text{adapt}}} % Adapted signal
% Math shorthands - Kernel parameters:
\newcommand{\ks}{\sigma_i} % Gabor kernel width
\newcommand{\kf}{f_i} % Gabor kernel frequency
\newcommand{\kp}{\phi_i} % Gabor kernel phase
\newcommand{\kw}{\sigma} % Unspecific Gabor kernel width
\newcommand{\kf}{\omega} % Unspecific Gabor kernel frequency
\newcommand{\kp}{\phi} % Unspecific Gabor kernel phase
\newcommand{\kn}{n} % Unspecific Gabor kernel lobe number
\newcommand{\ks}{s} % Unspecific Gabor kernel sign
\newcommand{\kwi}{\kw_i} % Specific Gabor kernel width
\newcommand{\kfi}{\kf_i} % Specific Gabor kernel frequency
\newcommand{\kpi}{\kp_i} % Specific Gabor kernel phase
\newcommand{\kni}{\kn_i} % Specific Gabor kernel lobe number
\newcommand{\ksi}{\ks_i} % Specific Gabor kernel sign
% Math shorthands - Threshold nonlinearity:
\newcommand{\thr}{\Theta_i} % Step function threshold value
@@ -142,9 +151,9 @@ parameters of this pattern, such as the duration of syllables and
pauses~(\bcite{helversen1972gesang}), the slope of pulse
onsets~(\bcite{helversen1993absolute}), and the accentuation of syllable onsets
relative to the preceding pause~(\bcite{balakrishnan2001song};
\bcite{helversen2004acoustic}). The amplitude modulation, or envelope, of the
song is sufficient for recognition~(\bcite{helversen1997recognition}). However,
the essential recognition cues can vary considerably with external physical
\bcite{helversen2004acoustic}). The amplitude modulation of the song is
sufficient for recognition~(\bcite{helversen1997recognition}). However, the
essential recognition cues can vary considerably with external physical
factors, which requires the auditory system to be invariant to such variations
in order to reliably recognize songs under different conditions. For instance,
the temporal structure of grasshopper songs warps with
@@ -173,12 +182,12 @@ general:~\bcite{benda2021neural}). In the grasshopper auditory system, a number
of neuron types along the processing chain exhibit spike-frequency adaptation
in response to sustained stimulus
intensities~(\bcite{romer1976informationsverarbeitung};
\bcite{gollisch2002energy}; \bcite{hildebrandt2009origin};
\bcite{clemens2010intensity}) and thus likely contribute to the emergence of
intensity-invariant song representations. This means that intensity invariance
is not the result of a single processing step but rather a gradual process, in
which different neuronal populations contribute to varying
degrees~(\bcite{clemens2010intensity}) and by different
\bcite{gollisch2004input}; \bcite{hildebrandt2009origin};
\bcite{clemens2010intensity}; \bcite{fisch2012channel}) and thus likely
contribute to the emergence of intensity-invariant song representations. This
means that intensity invariance is not the result of a single processing step
but rather a gradual process, in which different neuronal populations
contribute to varying degrees~(\bcite{clemens2010intensity}) and by different
mechanisms~(\bcite{hildebrandt2009origin}). Approximating this process within a
functional model framework thus requires a considerable amount of
simplification. In this work, we demonstrate that even a small number of basic
@@ -287,137 +296,168 @@ The essence of constructing a functional model of a given system is to gain a
sufficient understanding of the system's essential structural components and
their presumed functional roles; and to then build a formal framework of
manageable complexity around these two aspects. Anatomically, the organization
of the grasshopper song recognition pathway can be outlined as a hierarchical
feed-forward network of three consecutive neuronal
populations~(Fig.\,\ref{fig:pathway}a-c): Peripheral auditory receptor neurons,
whose axons enter the ventral nerve cord at the level of the metathoracic
ganglion; local interneurons that remain exclusively within the thoracic region
of the ventral nerve cord; and ascending neurons projecting from the thoracic
region towards the supraesophageal ganglion~(\bcite{rehbein1974structure};
\bcite{rehbein1976auditory}; \bcite{eichendorf1980projections}). The input to
the network originates at the membrane of the tympanal organ, which acts as the
primary sound receiver and is coupled to the dendritic endings of the receptor
neurons~(\bcite{gray1960fine}). The outputs from the network converge somewhere
in the supraesophageal ganglion, which is presumed to harbor the neuronal
substrate for conspecific song recognition and response
of the grasshopper song recognition pathway can be outlined as a feed-forward
network of three consecutive neuronal
populations~(Fig.\,\mbox{\ref{fig:pathway}a-c}): Peripheral auditory receptor
neurons, whose axons enter the ventral nerve cord at the level of the
metathoracic ganglion; local interneurons that remain exclusively within the
thoracic region of the ventral nerve cord; and ascending neurons projecting
from the thoracic region towards the supraesophageal
ganglion~(\bcite{rehbein1974structure}; \bcite{rehbein1976auditory};
\bcite{eichendorf1980projections}). The input to the network originates at the
tympanal membrane, which acts as the acoustic receiver and is coupled to the
dendritic endings of the receptor neurons~(\bcite{gray1960fine}). The outputs
from the network converge in the supraesophageal ganglion, which is presumed to
harbor the neuronal substrate for conspecific song recognition and response
initiation~(\bcite{ronacher1986routes}; \bcite{bauer1987separate};
\bcite{bhavsar2017brain}). Functionally, the ascending neurons are the most
diverse of the three populations along the pathway. Individual ascending
neurons possess highly specific response properties that contrast with the
homogeneous responses of the preceding receptor neurons and local
interneurons~(\bcite{clemens2011efficient}), indicating a transition from a
uniform population-wide processing stream into several parallel branches. Based
on these anatomical and physiological considerations, the overall structure of
the model pathway is divided into two distinct
rather homogeneous response properties of the preceding receptor neurons and
local interneurons~(\bcite{clemens2011efficient}), indicating a transition from
a uniform population-wide processing stream into several parallel branches.
Based on these anatomical and physiological considerations, the overall
structure of the model pathway is divided into two distinct
stages~(Fig.\,\ref{fig:pathway}d). The preprocessing stage incorporates the
known physiological processing steps at the levels of the tympanal membrane,
the receptor neurons, and the local interneurons; and operates on
one-dimensional signal representations. The feature extraction stage
corresponds to the processing within the ascending neurons and further
downstream towards the supraesophageal ganglion; and operates on
higher-dimensional signal representations. The details of each processing step
within the two stages are outlined in the following sections.
high-dimensional signal representations. The details of each physiological
processing step and its functional approximation within the two stages are
outlined in the following sections.
\begin{figure}[!ht]
\centering
\def\svgwidth{\textwidth}
\import{figures/}{fig_auditory_pathway.pdf_tex}
\caption{\textbf{Schematic organisation of the song recognition pathway in
grasshoppers compared to the structure of the model pathway.}
\textbf{a}:~Course of the pathway in the grasshopper, from
the tympanal membrane over receptor neurons (1st order),
local interneurons (2nd order) of the metathoracic
ganglion, and ascending neurons (3rd order) further
towards the central brain.
\textbf{b}:~Connections between the three neuronal
populations within the metathoracic ganglion.
grasshoppers compared to the structure of the functional
model pathway.}
\textbf{a}:~Simplified course of the pathway in the
grasshopper, from the tympanal membrane over receptor
neurons, local interneurons, and ascending neurons further
towards the supraesophageal ganglion.
\textbf{b}:~Schematic of synaptic connections between
the three neuronal populations within the metathoracic
ganglion.
\textbf{c}:~Network representation of neuronal connectivity.
\textbf{d}:~Flow diagram of the different signal
representations (boxes) and transformations (arrows) along
the model pathway. The pathway consists of a
population-wide preprocessing stream followed by several
parallel feature extraction streams.
representations and transformations along the model
pathway. All representations are time-varying. 1st half:
Preprocessing stage (one-dimensional). 2nd half: Feature
extraction stage (high-dimensional).
}
\label{fig:pathway}
\end{figure}
\subsection{Population-driven signal preprocessing}
Grasshoppers receive airborne sound waves by a tympanal organ at each side of
the thorax~(Fig.\,\ref{fig:pathway}a). The tympanal membrane acts as a
mechanical resonance filter: Vibrations that fall within specific frequency
bands are focused on different membrane areas, while others are
attenuated~(\bcite{michelsen1971frequency}; \bcite{windmill2008time};
\bcite{malkin2014energy}). This processing step can be approximated by an
initial bandpass filter
Grasshoppers receive airborne sound waves by a tympanal organ at either side of
the body. The tympanal membrane acts as a mechanical resonance filter for
sound-induced vibrations~(\bcite{windmill2008time}; \bcite{malkin2014energy}).
Vibrations that fall within specific frequency bands are focused on different
membrane areas, while others are attenuated. This processing step can be
approximated by an initial bandpass filter
\begin{equation}
\filt(t)\,=\,\raw(t)\,*\,\bp, \qquad \fc\,=\,5\,\text{kHz},\,30\,\text{kHz}
\label{eq:bandpass}
\end{equation}
applied to the acoustic input signal $\raw(t)$. The auditory receptor neurons
connect directly to the tympanal membrane~(Fig.\,\ref{fig:pathway}a). Besides
performing the mechano-electrical transduction, the receptor population is
substrate to several known processing steps. First, the receptors extract the
signal envelope~(\bcite{machens2001discrimination}), which likely involves a
rectifying nonlinearity~(\bcite{machens2001representation}). This can be
modelled as full-wave rectification followed by lowpass filtering
transduce the vibrations of the tympanal membrane into sequences of action
potentials. Thereby, they encode the amplitude modulation, or envelope, of the
signal~(\bcite{machens2001discrimination}), which likely involves a rectifying
nonlinearity~(\bcite{machens2001representation}). This can be modelled as
full-wave rectification followed by lowpass filtering
\begin{equation}
\env(t)\,=\,|\filt(t)|\,*\,\lp, \qquad \fc\,=\,500\,\text{Hz}
\label{eq:env}
\end{equation}
of the tympanal signal $\filt(t)$. Furthermore, the receptors exhibit a
sigmoidal response curve over logarithmically compressed intensity
levels~(\bcite{suga1960peripheral}; \bcite{gollisch2002energy}). In the model,
logarithmic compression is achieved by conversion to decibel scale
levels~(\bcite{suga1960peripheral}; \bcite{gollisch2002energy}). In the model
pathway, logarithmic compression is achieved by conversion to decibel scale
\begin{equation}
\db(t)\,=\,10\,\cdot\,\dec \frac{\env(t)}{\dbref}, \qquad \dbref\,=\,\max[\env(t)]
\label{eq:log}
\end{equation}
relative to the maximum intensity $\dbref$ of the signal envelope $\env(t)$.
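As a worked example of Eq.\,\ref{eq:log}, the following values sketch how the compression maps envelope amplitudes to decibel levels. The block assumes that $\dec$ denotes the base-10 logarithm, which is not visible in this excerpt. Because the reference $\dbref$ is the envelope maximum, the compressed signal never exceeds $0\,\text{dB}$:

```latex
\begin{align*}
    \env(t)\,=\,\dbref:       &\quad 10\,\cdot\,\log_{10}(1)\,=\,0\,\text{dB}\\
    \env(t)\,=\,0.5\,\dbref:  &\quad 10\,\cdot\,\log_{10}(0.5)\,\approx\,-3.0\,\text{dB}\\
    \env(t)\,=\,0.1\,\dbref:  &\quad 10\,\cdot\,\log_{10}(0.1)\,=\,-10\,\text{dB}
\end{align*}
```

Each tenfold reduction of the envelope relative to its maximum thus lowers the compressed signal by a further $10\,\text{dB}$.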
Next, the axons of the receptor neurons project into the metathoracic ganglion,
where they synapse onto local interneurons~(Fig.\,\ref{fig:pathway}b). Both the
local interneurons~(\bcite{hildebrandt2009origin};
\bcite{clemens2010intensity}) and, to a lesser extent, the receptors
themselves~(\bcite{fisch2012channel}) display spike-frequency adaptation in
response to sustained stimulus intensity levels. This mechanism allows for the
robust encoding of faster amplitude modulations against a slowly changing
overall baseline intensity. Functionally, this processing step resembles a
highpass filter
Both the receptor neurons~(\bcite{romer1976informationsverarbeitung};
\bcite{gollisch2004input}; \bcite{fisch2012channel}) and, on a larger scale,
the subsequent local interneurons~(\bcite{hildebrandt2009origin};
\bcite{clemens2010intensity}) adapt their firing rates in response to sustained
stimulus intensity levels, which allows for the robust encoding of faster
amplitude modulations against a slowly changing overall baseline intensity.
Functionally, the adaptation mechanism resembles a highpass filter
\begin{equation}
\adapt(t)\,=\,\db(t)\,*\,\hp, \qquad \fc\,=\,10\,\text{Hz}
\label{eq:highpass}
\end{equation}
over the logarithmically scaled envelope $\db(t)$. The projections of the local
interneurons remain within the metathoracic ganglion and synapse onto a small
number of ascending neurons~(Fig.\,\ref{fig:pathway}b), which marks the
transition between the preprocessing stream and the parallel processing stream
of the model pathway.
over the logarithmically scaled envelope $\db(t)$. This processing step
concludes the preprocessing stage of the model pathway. The resulting
intensity-adapted envelope $\adapt(t)$ is then passed on from the local
interneurons to the ascending neurons, where it serves as the basis for the
following feature extraction stage.
\subsection{Feature extraction by individual neurons}
The small population of ascending neurons
\textbf{Stage-specific processing steps and functional approximations:}
Template matching by individual ANs\\
- Filter base (STA approximations): Set of Gabor kernels\\
- Gabor parameters: $\ks, \kp, \kf$ $\rightarrow$ Determines kernel sign and lobe number
%
\begin{equation}
k_i(t,\,\ks,\,\kf,\,\kp)\,=\,e^{-\frac{t^{2}}{2{\ks}^{2}}}\,\cdot\,\sin(2\pi\kf\,\cdot\,t\,+\,\phi_i)
\label{eq:gabor}
\end{equation}
%
$\rightarrow$ Separate convolution with each member of the kernel set
%
The ascending neurons extract and encode a number of different features of the
preprocessed signal. As a population, they hence represent the signal in a
higher-dimensional space than the preceding receptor neurons and local
interneurons. Each ascending neuron is assumed to scan the signal for a
specific template pattern, which can be thought of as a kernel of a particular
structure and on a particular time scale. This process, known as template
matching, can be modelled as a convolution
\begin{equation}
c_i(t)\,=\,\adapt(t)\,*\,k_i(t)
= \infint \adapt(\tau)\,\cdot\,k_i(t\,-\,\tau)\,d\tau
\label{eq:conv}
\end{equation}
%
of the intensity-adapted envelope $\adapt(t)$ with a kernel $k_i(t)$ per
ascending neuron. We used Gabor kernels as basis functions for creating
different preferred patterns. An arbitrary one-dimensional, real Gabor kernel
is generated by multiplication of a Gaussian envelope and a sinusoidal carrier
\begin{equation}
k_i(t,\,\kwi,\,\kfi,\,\kpi)\,=\,e^{-\frac{t^{2}}{2{\kwi}^{2}}}\,\cdot\,\sin(\kfi\,t\,+\,\kpi), \qquad \kfi\,=\,2\pi f_{\text{sin}}
\label{eq:gabor}
\end{equation}
with Gaussian standard deviation or kernel width $\kwi$, carrier frequency
$\kfi$, and carrier phase $\kpi$. Different combinations of $\kwi$, $\kfi$, and
$\kpi$ result in Gabor kernels with different lobe number $\kni$ and sign
$\ksi$. If the function space is constrained to only include mirror- or
point-symmetric Gabor kernels, phase $\kp$ is determined by lobe number $\kn$
and sign $\ks$ as
\begin{equation}
\kp(\kn,\,\ks)\,=\,\frac{\pi}{2}\,\cdot\,(1\,-\,\text{mod}[\kn,\,2]\,+\,\ks)
\end{equation}
which results in the specific phase values shown in
Table\,\mbox{\ref{tab:gabor_phases}}.
\FloatBarrier
\begin{table}[!ht]
\centering
\captionsetup{width=.55\textwidth}
\caption{Carrier phase $\kp$ of mirror- and point-symmetric Gabor kernels
for each combination of sign $\ks$ and parity of lobe number $\kn$.}
\begin{tabular}{|ccc|}
\hline
sign $\ks$ & odd $\kn$ & even $\kn$\\
\hline
$+1$ & $+\pi\,/\,2$ & $\pi$\\
$-1$ & $-\pi\,/\,2$ & $0$\\
\hline
\end{tabular}
\label{tab:gabor_phases}
\end{table}
\FloatBarrier
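As a brief consistency check, the four phase values can be substituted into the sinusoidal carrier of Eq.\,\ref{eq:gabor}. This is a sketch that assumes phases are measured in radians. It uses only the standard identities $\sin(x\,\pm\,\pi/2)\,=\,\pm\cos(x)$ and $\sin(x\,+\,\pi)\,=\,-\sin(x)$:

```latex
\begin{align*}
    \kp\,=\,+\pi/2: &\quad \sin(\kf\,t\,+\,\pi/2)\,=\,+\cos(\kf\,t) && \text{(mirror-symmetric)}\\
    \kp\,=\,-\pi/2: &\quad \sin(\kf\,t\,-\,\pi/2)\,=\,-\cos(\kf\,t) && \text{(mirror-symmetric)}\\
    \kp\,=\,0:      &\quad \sin(\kf\,t)                             && \text{(point-symmetric)}\\
    \kp\,=\,\pi:    &\quad \sin(\kf\,t\,+\,\pi)\,=\,-\sin(\kf\,t)   && \text{(point-symmetric)}
\end{align*}
```

Since the Gaussian envelope is itself mirror-symmetric around $t\,=\,0$, the kernel inherits the symmetry of its carrier. For the two cosine cases, the sign of the phase sets the sign of the central lobe at $t\,=\,0$.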
In order to create a Gabor kernel with a specific lobe number $\kn$ and kernel
width $\kw$, frequency $\kf$ has to be set to
\begin{equation}
\kf(\kn,\,\kw)\,=\,\frac{\kn}{2\pi\,\kw}
\end{equation}
\textbf{Stage-specific processing steps and functional approximations:}
Thresholding nonlinearity in ascending neurons (or further downstream)\\
- Binarization of AN response traces into ``relevant'' vs.\ ``irrelevant''\\
$\rightarrow$ Shifted Heaviside step-function $\nl$ (or steep sigmoid threshold?)