Grinding through methods (WIP).

2026-02-10 16:24:47 +01:00
parent 1c4701f98c
commit 015a3032c1
13 changed files with 632 additions and 524 deletions
--- a/main.tex
+++ b/main.tex
@@ -10,6 +10,8 @@
 \usepackage{parskip}
 \usepackage{amsmath}
 \usepackage{amssymb}
+\usepackage{subcaption} 
+\usepackage[labelfont=bf, textfont=small]{caption}
 \usepackage[separate-uncertainty=true, locale=DE]{siunitx}
 \sisetup{output-exponent-marker=\ensuremath{\mathrm{e}}}
 % \usepackage[capitalize]{cleveref}
@@ -60,9 +62,16 @@
 \newcommand{\adapt}{\raw_{\text{adapt}}} % Adapted signal

 % Math shorthands - Kernel parameters:
-\newcommand{\ks}{\sigma_i} % Gabor kernel width
-\newcommand{\kf}{f_i} % Gabor kernel frequency
-\newcommand{\kp}{\phi_i} % Gabor kernel phase
+\newcommand{\kw}{\sigma} % Unspecific Gabor kernel width
+\newcommand{\kf}{\omega} % Unspecific Gabor kernel frequency
+\newcommand{\kp}{\phi} % Unspecific Gabor kernel phase
+\newcommand{\kn}{n} % Unspecific Gabor kernel lobe number
+\newcommand{\ks}{s} % Unspecific Gabor kernel sign
+\newcommand{\kwi}{\kw_i} % Specific Gabor kernel width
+\newcommand{\kfi}{\kf_i} % Specific Gabor kernel frequency
+\newcommand{\kpi}{\kp_i} % Specific Gabor kernel phase
+\newcommand{\kni}{\kn_i} % Specific Gabor kernel lobe number
+\newcommand{\ksi}{\ks_i} % Specific Gabor kernel sign

 % Math shorthands - Threshold nonlinearity:
 \newcommand{\thr}{\Theta_i} % Step function threshold value
@@ -142,9 +151,9 @@ parameters of this pattern, such as the duration of syllables and
 pauses~(\bcite{helversen1972gesang}), the slope of pulse
 onsets~(\bcite{helversen1993absolute}), and the accentuation of syllable onsets
 relative to the preceding pause~(\bcite{balakrishnan2001song};
-\bcite{helversen2004acoustic}). The amplitude modulation, or envelope, of the
-song is sufficient for recognition~(\bcite{helversen1997recognition}). However,
-the essential recognition cues can vary considerably with external physical
+\bcite{helversen2004acoustic}). The amplitude modulation of the song is
+sufficient for recognition~(\bcite{helversen1997recognition}). However, the
+essential recognition cues can vary considerably with external physical
 factors, which requires the auditory system to be invariant to such variations
 in order to reliably recognize songs under different conditions. For instance,
 the temporal structure of grasshopper songs warps with
@@ -173,12 +182,12 @@ general:~\bcite{benda2021neural}). In the grasshopper auditory system, a number
 of neuron types along the processing chain exhibit spike-frequency adaptation
 in response to sustained stimulus
 intensities~(\bcite{romer1976informationsverarbeitung};
-\bcite{gollisch2002energy}; \bcite{hildebrandt2009origin};
-\bcite{clemens2010intensity}) and thus likely contribute to the emergence of
-intensity-invariant song representations. This means that intensity invariance
-is not the result of a single processing step but rather a gradual process, in
-which different neuronal populations contribute to varying
-degrees~(\bcite{clemens2010intensity}) and by different
+\bcite{gollisch2004input}; \bcite{hildebrandt2009origin};
+\bcite{clemens2010intensity}; \bcite{fisch2012channel}) and thus likely
+contribute to the emergence of intensity-invariant song representations. This
+means that intensity invariance is not the result of a single processing step
+but rather a gradual process, in which different neuronal populations
+contribute to varying degrees~(\bcite{clemens2010intensity}) and by different
 mechanisms~(\bcite{hildebrandt2009origin}). Approximating this process within a
 functional model framework thus requires a considerable amount of
 simplification. In this work, we demonstrate that even a small number of basic
@@ -287,137 +296,168 @@ The essence of constructing a functional model of a given system is to gain a
 sufficient understanding of the system's essential structural components and
 their presumed functional roles; and to then build a formal framework of
 manageable complexity around these two aspects. Anatomically, the organization
-of the grasshopper song recognition pathway can be outlined as a hierarchical
-feed-forward network of three consecutive neuronal
-populations~(Fig.\,\ref{fig:pathway}a-c): Peripheral auditory receptor neurons,
-whose axons enter the ventral nerve cord at the level of the metathoracic
-ganglion; local interneurons that remain exclusively within the thoracic region
-of the ventral nerve cord; and ascending neurons projecting from the thoracic
-region towards the supraesophageal ganglion~(\bcite{rehbein1974structure};
-\bcite{rehbein1976auditory}; \bcite{eichendorf1980projections}). The input to
-the network originates at the membrane of the tympanal organ, which acts as the
-primary sound receiver and is coupled to the dendritic endings of the receptor
-neurons~(\bcite{gray1960fine}). The outputs from the network converge somewhere
-in the supraesophageal ganglion, which is presumed to harbor the neuronal
-substrate for conspecific song recognition and response
+of the grasshopper song recognition pathway can be outlined as a feed-forward
+network of three consecutive neuronal
+populations~(Fig.\,\mbox{\ref{fig:pathway}a-c}): Peripheral auditory receptor
+neurons, whose axons enter the ventral nerve cord at the level of the
+metathoracic ganglion; local interneurons that remain exclusively within the
+thoracic region of the ventral nerve cord; and ascending neurons projecting
+from the thoracic region towards the supraesophageal
+ganglion~(\bcite{rehbein1974structure}; \bcite{rehbein1976auditory};
+\bcite{eichendorf1980projections}). The input to the network originates at the
+tympanal membrane, which acts as the acoustic receiver and is coupled to the
+dendritic endings of the receptor neurons~(\bcite{gray1960fine}). The outputs
+from the network converge in the supraesophageal ganglion, which is presumed to
+harbor the neuronal substrate for conspecific song recognition and response
 initiation~(\bcite{ronacher1986routes}; \bcite{bauer1987separate};
 \bcite{bhavsar2017brain}). Functionally, the ascending neurons are the most
 diverse of the three populations along the pathway. Individual ascending
 neurons possess highly specific response properties that contrast with the
-homogeneous responses of the preceding receptor neurons and local
-interneurons~(\bcite{clemens2011efficient}), indicating a transition from a
-uniform population-wide processing stream into several parallel branches. Based
-on these anatomical and physiological considerations, the overall structure of
-the model pathway is divided into two distinct
+rather homogeneous response properties of the preceding receptor neurons and
+local interneurons~(\bcite{clemens2011efficient}), indicating a transition from
+a uniform population-wide processing stream into several parallel branches.
+Based on these anatomical and physiological considerations, the overall
+structure of the model pathway is divided into two distinct
 stages~(Fig.\,\ref{fig:pathway}d). The preprocessing stage incorporates the
 known physiological processing steps at the levels of the tympanal membrane,
 the receptor neurons, and the local interneurons; and operates on
 one-dimensional signal representations. The feature extraction stage
 corresponds to the processing within the ascending neurons and further
 downstream towards the supraesophageal ganglion; and operates on
-higher-dimensional signal representations. The details of each processing step
-within the two stages are outlined in the following sections.
+high-dimensional signal representations. The details of each physiological
+processing step and its functional approximation within the two stages are
+outlined in the following sections.

 \begin{figure}[!ht]
    \centering
    \def\svgwidth{\textwidth}
    \import{figures/}{fig_auditory_pathway.pdf_tex}
    \caption{\textbf{Schematic organisation of the song recognition pathway in
-                     grasshoppers compared to the structure of the model pathway.}
-                     \textbf{a}:~Course of the pathway in the grasshopper, from
-                     the tympanal membrane over receptor neurons (1st order),
-                     local interneurons (2nd order) of the metathoracic
-                     ganglion, and ascending neurons (3rd order) further
-                     towards the central brain.
-                     \textbf{b}:~Connections between the three neuronal
-                     populations within the metathoracic ganglion.
+                     grasshoppers compared to the structure of the functional
+                     model pathway.}
+                     \textbf{a}:~Simplified course of the pathway in the
+                     grasshopper, from the tympanal membrane via receptor
+                     neurons, local interneurons, and ascending neurons
+                     towards the supraesophageal ganglion.
+                     \textbf{b}:~Schematic of synaptic connections between
+                     the three neuronal populations within the metathoracic
+                     ganglion.
                     \textbf{c}:~Network representation of neuronal connectivity.
                     \textbf{d}:~Flow diagram of the different signal
-                     representations (boxes) and transformations (arrows) along
-                     the model pathway. The pathway consists of a
-                     population-wide preprocessing stream followed by several
-                     parallel feature extraction streams.
+                     representations and transformations along the model
+                     pathway. All representations are time-varying. 1st half:
+                     Preprocessing stage (one-dimensional). 2nd half: Feature
+                     extraction stage (high-dimensional).
                     }
    \label{fig:pathway}
 \end{figure}

 \subsection{Population-driven signal preprocessing}

-Grasshoppers receive airborne sound waves by a tympanal organ at each side of
-the thorax~(Fig.\,\ref{fig:pathway}a). The tympanal membrane acts as a
-mechanical resonance filter: Vibrations that fall within specific frequency
-bands are focused on different membrane areas, while others are
-attenuated~(\bcite{michelsen1971frequency}; \bcite{windmill2008time};
-\bcite{malkin2014energy}). This processing step can be approximated by an
-initial bandpass filter
+Grasshoppers receive airborne sound waves via a tympanal organ on either side of
+the body. The tympanal membrane acts as a mechanical resonance filter for
+sound-induced vibrations~(\bcite{windmill2008time}; \bcite{malkin2014energy}).
+Vibrations that fall within specific frequency bands are focused on different
+membrane areas, while others are attenuated. This processing step can be
+approximated by an initial bandpass filter
 \begin{equation}
    \filt(t)\,=\,\raw(t)\,*\,\bp, \qquad \fc\,=\,5\,\text{kHz},\,30\,\text{kHz}
    \label{eq:bandpass}
 \end{equation}
 applied to the acoustic input signal $\raw(t)$. The auditory receptor neurons
-connect directly to the tympanal membrane~(Fig.\,\ref{fig:pathway}a). Besides
-performing the mechano-electrical transduction, the receptor population is
-substrate to several known processing steps. First, the receptors extract the
-signal envelope~(\bcite{machens2001discrimination}), which likely involves a
-rectifying nonlinearity~(\bcite{machens2001representation}). This can be
-modelled as full-wave rectification followed by lowpass filtering
+transduce the vibrations of the tympanal membrane into sequences of action
+potentials. In doing so, they encode the amplitude modulation, or envelope, of the
+signal~(\bcite{machens2001discrimination}), which likely involves a rectifying
+nonlinearity~(\bcite{machens2001representation}). This can be modelled as
+full-wave rectification followed by lowpass filtering
 \begin{equation}
    \env(t)\,=\,|\filt(t)|\,*\,\lp, \qquad \fc\,=\,500\,\text{Hz}
    \label{eq:env}
 \end{equation}
 of the tympanal signal $\filt(t)$. Furthermore, the receptors exhibit a
 sigmoidal response curve over logarithmically compressed intensity
-levels~(\bcite{suga1960peripheral}; \bcite{gollisch2002energy}). In the model,
-logarithmic compression is achieved by conversion to decibel scale
+levels~(\bcite{suga1960peripheral}; \bcite{gollisch2002energy}). In the model
+pathway, logarithmic compression is achieved by conversion to decibel scale
 \begin{equation}
    \db(t)\,=\,10\,\cdot\,\dec \frac{\env(t)}{\dbref}, \qquad \dbref\,=\,\max[\env(t)]
    \label{eq:log}
 \end{equation}
 relative to the maximum intensity $\dbref$ of the signal envelope $\env(t)$.
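+For instance, assuming that $\dec$ denotes the decadic logarithm, the envelope
+maximum itself maps to
+\begin{equation*}
+    10\,\cdot\,\dec \frac{\dbref}{\dbref}\,=\,0\,\text{dB},
+\end{equation*}
+so that $\db(t)\,\leq\,0\,\text{dB}$ at all times, while an envelope value of
+half the maximum maps to $10\,\cdot\,\dec 0.5\,\approx\,-3\,\text{dB}$.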
-Next, the axons of the receptor neurons project into the metathoracic ganglion,
-where they synapse onto local interneurons~(Fig.\,\ref{fig:pathway}b). Both the
-local interneurons~(\bcite{hildebrandt2009origin};
-\bcite{clemens2010intensity}) and, to a lesser extent, the receptors
-themselves~(\bcite{fisch2012channel}) display spike-frequency adaptation in
-response to sustained stimulus intensity levels. This mechanism allows for the
-robust encoding of faster amplitude modulations against a slowly changing
-overall baseline intensity. Functionally, this processing step resembles a
-highpass filter
+Both the receptor neurons~(\bcite{romer1976informationsverarbeitung};
+\bcite{gollisch2004input}; \bcite{fisch2012channel}) and, to a greater extent,
+the subsequent local interneurons~(\bcite{hildebrandt2009origin};
+\bcite{clemens2010intensity}) adapt their firing rates in response to sustained
+stimulus intensity levels, which allows for the robust encoding of faster
+amplitude modulations against a slowly changing overall baseline intensity.
+Functionally, the adaptation mechanism resembles a highpass filter
 \begin{equation}
    \adapt(t)\,=\,\db(t)\,*\,\hp, \qquad \fc\,=\,10\,\text{Hz}
    \label{eq:highpass}
 \end{equation}
-over the logarithmically scaled envelope $\db(t)$. The projections of the local
-interneurons remain within the metathoracic ganglion and synapse onto a small
-number of ascending neurons~(Fig.\,\ref{fig:pathway}b), which marks the
-transition between the preprocessing stream and the parallel processing stream
-of the model pathway.
+over the logarithmically scaled envelope $\db(t)$. This processing step
+concludes the preprocessing stage of the model pathway. The resulting
+intensity-adapted envelope $\adapt(t)$ is then passed on from the local
+interneurons to the ascending neurons, where it serves as the basis for the
+following feature extraction stage.

 \subsection{Feature extraction by individual neurons}

-The small population of ascending neurons
-
-
-
-\textbf{Stage-specific processing steps and functional approximations:}
-
-Template matching by individual ANs\\
- Filter base (STA approximations): Set of Gabor kernels\\
- Gabor parameters: $\ks, \kp, \kf$ $\rightarrow$ Determines kernel sign and lobe number
-%
-\begin{equation}
-    k_i(t,\,\ks,\,\kf,\,\kp)\,=\,e^{-\frac{t^{2}}{2{\ks}^{2}}}\,\cdot\,\sin(2\pi\kf\,\cdot\,t\,+\,\phi_i)
-    \label{eq:gabor}
-\end{equation}
-%
-$\rightarrow$ Separate convolution with each member of the kernel set
-%
+The ascending neurons extract and encode a number of different features of the
+preprocessed signal. As a population, they hence represent the signal in a
+higher-dimensional space than the preceding receptor neurons and local
+interneurons. Each ascending neuron is assumed to scan the signal for a
+specific template pattern, which can be thought of as a kernel of a particular
+structure and on a particular time scale. This process, known as template
+matching, can be modelled as a convolution
 \begin{equation}
    c_i(t)\,=\,\adapt(t)\,*\,k_i(t)
    = \infint \adapt(\tau)\,\cdot\,k_i(t\,-\,\tau)\,d\tau
    \label{eq:conv}
 \end{equation}
-%
+of the intensity-adapted envelope $\adapt(t)$ with a kernel $k_i(t)$ per
+ascending neuron. We used Gabor kernels as basis functions for creating
+different preferred patterns. An arbitrary one-dimensional, real Gabor kernel
+is generated by multiplication of a Gaussian envelope and a sinusoidal carrier
+\begin{equation}
+    k_i(t,\,\kwi,\,\kfi,\,\kpi)\,=\,e^{-\frac{t^{2}}{2{\kwi}^{2}}}\,\cdot\,\sin(\kfi\,t\,+\,\kpi), \qquad \kfi\,=\,2\pi f_{\text{sin}}
+    \label{eq:gabor}
+\end{equation}
+with Gaussian standard deviation or kernel width $\kwi$, carrier frequency
+$\kfi$, and carrier phase $\kpi$. Different combinations of $\kwi$, $\kfi$, and
+$\kpi$ result in Gabor kernels with different lobe numbers $\kni$ and signs
+$\ksi$. If the function space is constrained to only include mirror- or
+point-symmetric Gabor kernels, phase $\kp$ is determined by lobe number $\kn$
+and sign $\ks$ as
+\begin{equation}
+    \kp(\kn,\,\ks)\,=\,\frac{\pi}{2}\,\cdot\,(1\,-\,\text{mod}[\kn,\,2]\,+\,\ks)
+\end{equation}
+which results in the specific phase values shown in
+Table\,\mbox{\ref{tab:gabor_phases}}.
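+These values can be checked against the symmetry of the sinusoidal carrier,
+since
+\begin{equation*}
+    \sin\left(\kf\,t\,\pm\,\frac{\pi}{2}\right)\,=\,\pm\cos(\kf\,t), \qquad \sin(\kf\,t\,+\,\pi)\,=\,-\sin(\kf\,t).
+\end{equation*}
+Multiplied with the mirror-symmetric Gaussian envelope, a phase of
+$\pm\pi\,/\,2$ hence yields a mirror-symmetric kernel with a central lobe of
+the corresponding sign, whereas a phase of $0$ or $\pi$ yields a
+point-symmetric kernel without a central lobe.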
+
+\FloatBarrier
+\begin{table}[!ht]
+    \centering
+    \captionsetup{width=.55\textwidth}
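+    \caption{Carrier phase $\kp$ of mirror- and point-symmetric Gabor kernels for each combination of sign $\ks$ and lobe number parity.}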
+    \begin{tabular}{|ccc|}
+        \hline
+        sign $\ks$ & odd $\kn$ & even $\kn$\\
+        \hline
+        $+1$ & $+\pi\,/\,2$ & $\pi$\\
+        $-1$ & $-\pi\,/\,2$ & $0$\\
+        \hline
+    \end{tabular}
+    \label{tab:gabor_phases}
+\end{table}
+\FloatBarrier
+In order to create a Gabor kernel with a specific lobe number $\kn$ and kernel
+width $\kw$, frequency $\kf$ has to be set to
+\begin{equation}
+    \kf(\kn,\,\kw)\,=\,\frac{\kn}{2\pi\,\kw}
+\end{equation}
+
+\textbf{Stage-specific processing steps and functional approximations:}
+
 Thresholding nonlinearity in ascending neurons (or further downstream)\\
 - Binarization of AN response traces into ``relevant'' vs. ``irrelevant''\\
 $\rightarrow$ Shifted Heaviside step-function $\nl$ (or steep sigmoid threshold?)