Syncing to home.

2025-12-09 15:51:27 +01:00
parent 61a8817a39
commit 8732881c78
12 changed files with 279 additions and 131 deletions
--- a/main.tex
+++ b/main.tex
@@ -1,6 +1,7 @@
 \documentclass[a4paper, 12pt]{article}

 \usepackage[left=2.5cm,right=2.5cm,top=2cm,bottom=2cm,includeheadfoot]{geometry}
+\usepackage[onehalfspacing]{setspace}
 \usepackage{graphicx}
 \usepackage{svg}
 \usepackage{import}
@@ -11,11 +12,16 @@
 \usepackage{amssymb}
 \usepackage[separate-uncertainty=true, locale=DE]{siunitx}
 \sisetup{output-exponent-marker=\ensuremath{\mathrm{e}}}
+% \usepackage[capitalize]{cleveref}
+% \crefname{figure}{Fig.}{Figs.}
+% \crefname{equation}{Eq.}{Eqs.}
+% \creflabelformat{equation}{#2#1#3}
 \usepackage[
    backend=biber,
    style=authoryear,
-    mincitenames=1,
-    maxcitenames=2
+    pluralothers=true,
+    maxcitenames=1,
+    mincitenames=1
    ]{biblatex}
 \addbibresource{cite.bib}

@@ -26,11 +32,26 @@
 \begin{document}
 \maketitle{}

+% Text references and citations:
+\newcommand{\bcite}[1]{\mbox{\cite{#1}}}
+% \newcommand{\fref}[1]{\mbox{\cref{#1}}}
+% \newcommand{\fref}[1]{\mbox{Fig.\,\ref{#1}}}
+% \newcommand{\eref}[1]{\mbox{\cref{#1}}}
+% \newcommand{\eref}[1]{\mbox{Eq.\,\ref{#1}}}
+
+% Math shorthands - Standard symbols:
+\newcommand{\dec}{\log_{10}} % Logarithm base 10
+\newcommand{\infint}{\int_{-\infty}^{+\infty}} % Integral over the entire real line
+
+% Math shorthands - Spectral filtering:
 \newcommand{\bp}{h_{\text{BP}}(t)} % Bandpass filter function
 \newcommand{\lp}{h_{\text{LP}}(t)} % Lowpass filter function
 \newcommand{\hp}{h_{\text{HP}}(t)} % Highpass filter function
 \newcommand{\fc}{f_{\text{cut}}} % Filter cutoff frequency
+\newcommand{\tlp}{T_{\text{LP}}} % Lowpass filter averaging interval
+\newcommand{\thp}{T_{\text{HP}}} % Highpass filter adaptation interval

+% Math shorthands - Early representations:
 \newcommand{\raw}{x} % Placeholder input signal
 \newcommand{\filt}{\raw_{\text{filt}}} % Bandpass-filtered signal
 \newcommand{\env}{\raw_{\text{env}}} % Signal envelope
@@ -38,18 +59,18 @@
 \newcommand{\dbref}{\raw_{\text{ref}}} % Decibel reference intensity
 \newcommand{\adapt}{\raw_{\text{adapt}}} % Adapted signal

-\newcommand{\dec}{\log_{10}} % Logarithm base 10
-\newcommand{\svar}{\sigma_{\text{s}}^{2}} % Song signal variance
-\newcommand{\nvar}{\sigma_{\eta}^{2}} % Noise signal variance
-\newcommand{\infint}{\int_{-\infty}^{+\infty}} % Indefinite integral
+% Math shorthands - Kernel parameters:
+\newcommand{\ks}{\sigma_i} % Gabor kernel width
+\newcommand{\kf}{f_i} % Gabor kernel frequency
+\newcommand{\kp}{\phi_i} % Gabor kernel phase
+
+% Math shorthands - Threshold nonlinearity:
 \newcommand{\thr}{\Theta_i} % Step function threshold value
 \newcommand{\nl}{H(c_i\,-\,\thr)} % Shifted Heaviside step function

-\newcommand{\bi}{b_{i,\Theta}} % Single threshold-constrained binary response
-\newcommand{\feat}{f_{i,\Theta}} % Single threshold-constrained feature
-
-\newcommand{\thp}{T_{\text{HP}}} % Highpass filter adaptation interval
-\newcommand{\tlp}{T_{\text{LP}}} % Lowpass filter averaging interval
+% Math shorthands - Minor symbols and helpers:
+\newcommand{\svar}{\sigma_{\text{s}}^{2}} % Song signal variance
+\newcommand{\nvar}{\sigma_{\eta}^{2}} % Noise signal variance
 \newcommand{\pc}{p(c_i,\,T)} % Probability density (general interval)
 \newcommand{\pclp}{p(c_i,\,\tlp)} % Probability density (lowpass interval)

@@ -126,42 +147,43 @@ $\rightarrow$ More general, simpler, unfitted formalized Gabor filter bank
 \subsection{Population-driven signal pre-processing}

 Grasshoppers receive airborne sound waves by a tympanal organ at each side of
-the thorax~(Fig.\,\ref{fig:pathway}a). The tympanal membrane acts as a mechanical resonance filter:
-Vibrations of specific frequencies are focused on different membrane areas,
-while other frequencies are attenuated~(\mbox{\cite{michelsen1971frequency}};
-\mbox{\cite{windmill2008time}}; \mbox{\cite{malkin2014energy}}). This
-processing step can be approximated by an initial bandpass filter
+the thorax~(Fig.\,\ref{fig:pathway}a). The tympanal membrane acts as a
+mechanical resonance filter that focuses vibrations of specific frequencies on
+different membrane areas while attenuating
+others~(\bcite{michelsen1971frequency}; \bcite{windmill2008time};
+\bcite{malkin2014energy}). This processing step can be approximated by an
+initial bandpass filter
 \begin{equation}
    \filt(t)\,=\,\raw(t)\,*\,\bp, \qquad \fc\,=\,5\,\text{kHz},\,30\,\text{kHz}
    \label{eq:bandpass}
 \end{equation}
 applied to the acoustic input signal $\raw(t)$. The auditory receptor neurons
-connect directly to the tympanal membrane and transduce mechanical vibrations
-into electro-chemical potentials. The receptor population is substrate to
-several known signal processing steps. First, the receptors extract
-the signal envelope~(\mbox{\cite{machens2001discrimination}}), which likely
-involves a rectifying nonlinearity~(\mbox{\cite{machens2001representation}}).
-This can be modelled as full-wave rectification followed by lowpass filtering
+connect directly to the tympanal membrane. Besides performing the
+mechano-electrical transduction, the receptor population is also the substrate
+of several known processing steps. First, the receptors extract the signal
+envelope~(\bcite{machens2001discrimination}), which likely involves a
+rectifying nonlinearity~(\bcite{machens2001representation}). This can be
+modelled as full-wave rectification followed by lowpass filtering
 \begin{equation}
    \env(t)\,=\,|\filt(t)|\,*\,\lp, \qquad \fc\,=\,500\,\text{Hz}
    \label{eq:env}
 \end{equation}
 of the tympanal signal $\filt(t)$. Furthermore, the receptors exhibit a
 sigmoidal response curve over logarithmically compressed intensity
-levels~(\mbox{\cite{suga1960peripheral}}; \mbox{\cite{gollisch2002energy}}). In
-the model, logarithmic compression is achieved by conversion to decibel scale
+levels~(\bcite{suga1960peripheral}; \bcite{gollisch2002energy}). In the model,
+logarithmic compression is achieved by conversion to decibel scale
 \begin{equation}
    \db(t)\,=\,10\,\cdot\,\dec \frac{\env(t)}{\dbref}, \qquad \dbref\,=\,\max[\env(t)]
    \label{eq:log}
 \end{equation}
 relative to the maximum intensity $\dbref$ of the signal envelope $\env(t)$.
-Next, the axons of the receptor neurons project into the metathoracic ganglion,
-where they synapse onto local interneurons~(Fig.\,\ref{fig:pathway}b). Both the
-auditory receptors~(\mbox{\cite{fisch2012channel}}) and the subsequent
-interneurons~(\mbox{\cite{clemens2010intensity}}) display spike-frequency
-adaptation.
-
-
+The axons of the receptor neurons project into the metathoracic ganglion, where
+they synapse onto local interneurons~(Fig.\,\ref{fig:pathway}b). Both the local
+interneurons~(\bcite{hildebrandt2009origin}; \bcite{clemens2010intensity}) and,
+to a lesser extent, the receptors themselves~(\bcite{fisch2012channel}) display
+spike-frequency adaptation in response to sustained stimulation.
+This adaptation is crucial for rendering subsequent signal representations
+invariant to variations in sound intensity.


 "Pre-split portion" of the auditory pathway:\\
@@ -205,10 +227,10 @@ $\rightarrow$ Individual neuron-specific response traces from this stage onwards

 Template matching by individual ANs\\
 - Filter base (STA approximations): Set of Gabor kernels\\
- Gabor parameters: $\sigma, \phi, f$ $\rightarrow$ Determines kernel sign and lobe number
+- Gabor parameters: $\ks, \kp, \kf$ $\rightarrow$ Determine kernel sign and lobe number
 %
 \begin{equation}
-    k(t)\,=\,e^{-\frac{t^{2}}{2\sigma^{2}}}\,\cdot\,\sin(2\pi f t\,+\,\phi)
+    k_i(t,\,\ks,\,\kf,\,\kp)\,=\,e^{-\frac{t^{2}}{2{\ks}^{2}}}\,\cdot\,\sin(2\pi\kf\,\cdot\,t\,+\,\kp)
    \label{eq:gabor}
 \end{equation}
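+%
+$\rightarrow$ Illustrative special cases of Eq.\,\ref{eq:gabor} (phase values chosen
+for illustration only, not fitted parameters), showing how $\kp$ sets kernel symmetry and sign:
+%
+\begin{equation*}
+    \begin{split}
+        \kp\,=\,0: \quad k_i(t)\,&=\,e^{-\frac{t^{2}}{2{\ks}^{2}}}\,\cdot\,\sin(2\pi\kf\,\cdot\,t) \quad \text{(odd in $t$)}\\
+        \kp\,=\,\tfrac{\pi}{2}: \quad k_i(t)\,&=\,e^{-\frac{t^{2}}{2{\ks}^{2}}}\,\cdot\,\cos(2\pi\kf\,\cdot\,t) \quad \text{(even in $t$)}\\
+        \kp\,\rightarrow\,\kp\,+\,\pi: \quad k_i(t)\,&\rightarrow\,-k_i(t) \quad \text{(sign flip)}
+    \end{split}
+\end{equation*}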
 %
@@ -225,7 +247,7 @@ Thresholding nonlinearity in ascending neurons (or further downstream)\\
 $\rightarrow$ Shifted Heaviside step-function $\nl$ (or steep sigmoid threshold?)
 %
 \begin{equation}
-    \bi(t)\,=\,\begin{cases}
+    b_i(t,\,\thr)\,=\,\begin{cases}
        \;1, \quad c_i(t)\,>\,\thr\\
        \;0, \quad c_i(t)\,\leq\,\thr
    \end{cases}
@@ -239,7 +261,7 @@ of feature values $\rightarrow$ Clusters in high-dimensional feature space\\
 $\rightarrow$ Lowpass filter 1 Hz
 %
 \begin{equation}
-    \feat(t)\,=\,\bi(t)\,*\,\lp, \qquad \fc\,=\,1\,\text{Hz}
+    f_i(t)\,=\,b_i(t)\,*\,\lp, \qquad \fc\,=\,1\,\text{Hz}
    \label{eq:lowpass}
 \end{equation}
 %
@@ -273,7 +295,7 @@ $\env(t)$ with ($\alpha>0$) and without ($\alpha=0$) song signal $s(t)$, assumin
 \begin{equation}
    \begin{split}
        \db(t)\,&=\,\log \frac{\alpha\,\cdot\,s(t)\,+\,\eta(t)}{\dbref}\\
-        &=\,\log \frac{\alpha}{\dbref}\,+\,\log \big[s(t)\,+\,\frac{\eta(t)}{\alpha}\big]
+        &=\,\log \frac{\alpha}{\dbref}\,+\,\log \big[s(t)\,+\,\frac{\eta(t)}{\alpha}\big]
    \end{split}
    \label{eq:toy_log}
 \end{equation}
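+%
+$\rightarrow$ Intermediate step, spelled out as a sketch: for $\alpha>0$, the scale
+factor $\alpha$ can be factored out of the numerator before applying the product rule of the logarithm
+%
+\begin{equation*}
+    \frac{\alpha\,\cdot\,s(t)\,+\,\eta(t)}{\dbref}\,=\,\frac{\alpha}{\dbref}\,\cdot\,\Big[s(t)\,+\,\frac{\eta(t)}{\alpha}\Big]
+\end{equation*}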
@@ -290,7 +312,7 @@ interval $\thp$ ($0 \ll \thp < \frac{1}{\fc}$)
 %
 \begin{equation}
    \begin{split}
-    \adapt(t)\,\approx\,\db(t)\,-\,\log \frac{\alpha}{\dbref}\,=\,\log \big[s(t)\,+\,\frac{\eta(t)}{\alpha}\big]
+    \adapt(t)\,\approx\,\db(t)\,-\,\log \frac{\alpha}{\dbref}\,=\,\log \big[s(t)\,+\,\frac{\eta(t)}{\alpha}\big]
    \end{split}
    \label{eq:toy_highpass}
 \end{equation}
@@ -316,7 +338,7 @@ $\rightarrow$ Recurring trade-off: Equalizing signal intensity vs preserving ini

 \subsection{Threshold nonlinearity \& temporal averaging}

-Convolved $c_i(t)$ $\xrightarrow{\nl}$ Binary $\bi(t)$ $\xrightarrow{\lp}$ Feature $\feat(t)$
+Convolved $c_i(t)$ $\xrightarrow{\nl}$ Binary $b_i(t)$ $\xrightarrow{\lp}$ Feature $f_i(t)$

 \textbf{Thresholding component:}\\
 - Within an observed time interval $T$, $c_i(t)$ follows probability density $\pc$\\
@@ -337,29 +359,29 @@ of time $T_1$ where $c_i(t)>\thr$ to total time $T$ due to normalization of $\pc
 \end{equation}
 %
 \textbf{Averaging component:}\\
- Lowpass filter over binary response $\bi(t)$ (Eq.\,\ref{eq:lowpass}) can be
+- Lowpass filter over binary response $b_i(t)$ (Eq.\,\ref{eq:lowpass}) can be
 approximated as temporal averaging over a suitable time interval $\tlp$ ($\tlp > \frac{1}{\fc}$)\\
- Within $\tlp$, $\bi(t)$ takes a value of 1 ($c_i(t)>\thr$) for time $T_1$ ($T_1+T_0=\tlp$)
+- Within $\tlp$, $b_i(t)$ takes a value of 1 ($c_i(t)>\thr$) for time $T_1$ ($T_1+T_0=\tlp$)
 %
 \begin{equation}
-    \feat(t)\,\approx\,\frac{1}{\tlp} \int_{t}^{t\,+\,\tlp} \bi(\tau)\,d\tau\,=\,\frac{T_1}{\tlp}
+    f_i(t)\,\approx\,\frac{1}{\tlp} \int_{t}^{t\,+\,\tlp} b_i(\tau)\,d\tau\,=\,\frac{T_1}{\tlp}
    \label{eq:feat_avg}
 \end{equation}
 %
-$\rightarrow$ Temporal averaging over $\bi(t)\in[0,1]$ (Eq.\,\ref{eq:binary}) gives
+$\rightarrow$ Temporal averaging over $b_i(t)\in\{0,1\}$ (Eq.\,\ref{eq:binary}) gives
 ratio of time $T_1$ where $c_i(t)>\thr$ to total averaging interval $\tlp$\\
-$\rightarrow$ Feature $\feat(t)$ approximately represents supra-threshold fraction of $\tlp$
+$\rightarrow$ Feature $f_i(t)$ approximately represents supra-threshold fraction of $\tlp$

 \textbf{Combined result:}\\
- Feature $\feat(t)$ can be linked to the distribution of $c_i(t)$ using Eqs.\,\ref{eq:pdf_split} \& \ref{eq:feat_avg}
+- Feature $f_i(t)$ can be linked to the distribution of $c_i(t)$ using Eqs.\,\ref{eq:pdf_split} \& \ref{eq:feat_avg}
 %
 \begin{equation}
-    \feat(t)\,\approx\,\int_{\thr}^{+\infty} \pclp\,dc_i\,=\,P(c_i\,>\,\thr,\,\tlp)
+    f_i(t)\,\approx\,\int_{\thr}^{+\infty} \pclp\,dc_i\,=\,P(c_i\,>\,\thr,\,\tlp)
    \label{eq:feat_prop}
 \end{equation}
 %
 $\rightarrow$ Because the integral over a probability density is a cumulative
-probability, the value of feature $\feat(t)$ (temporal compression of $\bi(t)$)
+probability, the value of feature $f_i(t)$ (temporal compression of $b_i(t)$)
 at every time point $t$ signifies the probability that convolution output
 $c_i(t)$ exceeds the threshold value $\thr$ during the corresponding averaging
 interval $\tlp$ 
@@ -369,25 +391,25 @@ interval $\tlp$
 template waveform $k_i(t)$ and signal $\adapt(t)$ centered at time point $t$\\
 $\rightarrow$ Based on amplitudes on a graded scale

- Feature $\feat(t)$ quantifies the probability that amplitudes of $c_i(t)$
+- Feature $f_i(t)$ quantifies the probability that amplitudes of $c_i(t)$
 exceed threshold value $\thr$ within interval $\tlp$ around time point $t$\\
 $\rightarrow$ Based on binned amplitudes corresponding to one of two categorical states\\
 $\rightarrow$ Deliberate loss of precise amplitude information\\
 $\rightarrow$ Emphasis on temporal structure (ratio of $T_1$ over $\tlp$)

- Thresholding of $c_i(t)$ and subsequent temporal averaging of $\bi(t)$ to
-obtain $\feat(t)$ constitutes a remapping of an amplitude-encoding quantity into a
+- Thresholding of $c_i(t)$ and subsequent temporal averaging of $b_i(t)$ to
+obtain $f_i(t)$ constitutes a remapping of an amplitude-encoding quantity into a
 duty cycle-encoding quantity, mediated by threshold function $\nl$

 - Different scales of $c_i(t)$ can result in similar $T_1$ segments depending
 on the magnitude of the derivative of $c_i(t)$ in temporal proximity to time
 points at which $c_i(t)$ crosses threshold value $\thr$\\
 $\rightarrow$ The steeper the slope of $c_i(t)$, the less $T_1$ changes with scale variations\\
-$\rightarrow$ If $T_1$ is invariant to scale variation in $c_i(t)$, then so is $\feat(t)$
+$\rightarrow$ If $T_1$ is invariant to scale variation in $c_i(t)$, then so is $f_i(t)$

 - Suggests a relatively simple rule for optimal choice of threshold value $\thr$:\\
 $\rightarrow$ Find amplitude $c_i$ that maximizes absolute derivative of $c_i(t)$ over time\\
-$\rightarrow$ Optimal with respect to intensity invariance of $\feat(t)$, not necessarily for
+$\rightarrow$ Optimal with respect to intensity invariance of $f_i(t)$, not necessarily for
 other criteria such as song-noise separation or diversity between features
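+
+- Illustrative example (assumptions for this sketch only: $c_i(t)$ is a pure
+sinusoid with hypothetical amplitude $A>0$ and frequency $f_0$, and $\tlp$ spans
+an integer number of periods):\\
+$\rightarrow$ For $c_i(t)=A\,\sin(2\pi f_0 t)$ and $|\thr|\leq A$, the supra-threshold fraction of $\tlp$ is
+%
+\begin{equation*}
+    \frac{T_1}{\tlp}\,=\,\frac{1}{2}\,-\,\frac{1}{\pi}\,\arcsin \frac{\thr}{A}
+\end{equation*}
+%
+$\rightarrow$ For $\thr=0$ (steepest slope of the sinusoid), the fraction equals $\frac{1}{2}$ for any $A$\\
+$\rightarrow$ For $\thr$ close to $A$ (shallow slope), the fraction varies strongly with $A$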

 - Nonlinear operations can be used to detach representations from graded physical
@@ -396,9 +418,9 @@ stimulus (to facilitate categorical behavioral decision-making?):\\
 $\rightarrow$ Closely following the AM of the acoustic stimulus\\
 2) Quantify relevant stimulus properties on a graded scale: $c_i(t)$\\
 $\rightarrow$ More decorrelated representation, compared to prior stages\\
-3) Nonlinearity: Distinguish between "relevant vs irrelevant" values: $\bi(t)$\\
+3) Nonlinearity: Distinguish between "relevant vs irrelevant" values: $b_i(t)$\\
 $\rightarrow$ Trading a graded scale for two or more categorical states\\
-4) Represent stimulus properties under relevance constraint: $\feat(t)$\\
+4) Represent stimulus properties under relevance constraint: $f_i(t)$\\
 $\rightarrow$ Graded again but highly decorrelated from the acoustic stimulus\\
 5) Categorical behavioral decision-making requires further nonlinearities\\
 $\rightarrow$ Parameters of a behavioral response may be graded (e.g. approach speed),