Syncing to home.

j-hartling
2025-12-09 15:51:27 +01:00
parent 61a8817a39
commit 8732881c78
12 changed files with 279 additions and 131 deletions

128
main.tex

@@ -1,6 +1,7 @@
\documentclass[a4paper, 12pt]{article}
\usepackage[left=2.5cm,right=2.5cm,top=2cm,bottom=2cm,includeheadfoot]{geometry}
\usepackage[onehalfspacing]{setspace}
\usepackage{graphicx}
\usepackage{svg}
\usepackage{import}
@@ -11,11 +12,16 @@
\usepackage{amssymb}
\usepackage[separate-uncertainty=true, locale=DE]{siunitx}
\sisetup{output-exponent-marker=\ensuremath{\mathrm{e}}}
% \usepackage[capitalize]{cleveref}
% \crefname{figure}{Fig.}{Figs.}
% \crefname{equation}{Eq.}{Eqs.}
% \creflabelformat{equation}{#2#1#3}
\usepackage[
backend=biber,
style=authoryear,
mincitenames=1,
maxcitenames=2
pluralothers=true,
maxcitenames=1,
mincitenames=1
]{biblatex}
\addbibresource{cite.bib}
@@ -26,11 +32,26 @@
\begin{document}
\maketitle{}
% Text references and citations:
\newcommand{\bcite}[1]{\mbox{\cite{#1}}}
% \newcommand{\fref}[1]{\mbox{\cref{#1}}}
% \newcommand{\fref}[1]{\mbox{Fig.\,\ref{#1}}}
% \newcommand{\eref}[1]{\mbox{\cref{#1}}}
% \newcommand{\eref}[1]{\mbox{Eq.\,\ref{#1}}}
% Math shorthands - Standard symbols:
\newcommand{\dec}{\log_{10}} % Logarithm base 10
\newcommand{\infint}{\int_{-\infty}^{+\infty}} % Improper integral over the real line
% Math shorthands - Spectral filtering:
\newcommand{\bp}{h_{\text{BP}}(t)} % Bandpass filter function
\newcommand{\lp}{h_{\text{LP}}(t)} % Lowpass filter function
\newcommand{\hp}{h_{\text{HP}}(t)} % Highpass filter function
\newcommand{\fc}{f_{\text{cut}}} % Filter cutoff frequency
\newcommand{\tlp}{T_{\text{LP}}} % Lowpass filter averaging interval
\newcommand{\thp}{T_{\text{HP}}} % Highpass filter adaptation interval
% Math shorthands - Early representations:
\newcommand{\raw}{x} % Placeholder input signal
\newcommand{\filt}{\raw_{\text{filt}}} % Bandpass-filtered signal
\newcommand{\env}{\raw_{\text{env}}} % Signal envelope
@@ -38,18 +59,18 @@
\newcommand{\dbref}{\raw_{\text{ref}}} % Decibel reference intensity
\newcommand{\adapt}{\raw_{\text{adapt}}} % Adapted signal
\newcommand{\dec}{\log_{10}} % Logarithm base 10
\newcommand{\svar}{\sigma_{\text{s}}^{2}} % Song signal variance
\newcommand{\nvar}{\sigma_{\eta}^{2}} % Noise signal variance
\newcommand{\infint}{\int_{-\infty}^{+\infty}} % Improper integral over the real line
% Math shorthands - Kernel parameters:
\newcommand{\ks}{\sigma_i} % Gabor kernel width
\newcommand{\kf}{f_i} % Gabor kernel frequency
\newcommand{\kp}{\phi_i} % Gabor kernel phase
% Math shorthands - Threshold nonlinearity:
\newcommand{\thr}{\Theta_i} % Step function threshold value
\newcommand{\nl}{H(c_i\,-\,\thr)} % Shifted Heaviside step function
\newcommand{\bi}{b_{i,\Theta}} % Single threshold-constrained binary response
\newcommand{\feat}{f_{i,\Theta}} % Single threshold-constrained feature
\newcommand{\thp}{T_{\text{HP}}} % Highpass filter adaptation interval
\newcommand{\tlp}{T_{\text{LP}}} % Lowpass filter averaging interval
% Math shorthands - Minor symbols and helpers:
\newcommand{\svar}{\sigma_{\text{s}}^{2}} % Song signal variance
\newcommand{\nvar}{\sigma_{\eta}^{2}} % Noise signal variance
\newcommand{\pc}{p(c_i,\,T)} % Probability density (general interval)
\newcommand{\pclp}{p(c_i,\,\tlp)} % Probability density (lowpass interval)
@@ -126,42 +147,43 @@ $\rightarrow$ More general, simpler, unfitted formalized Gabor filter bank
\subsection{Population-driven signal pre-processing}
Grasshoppers receive airborne sound waves by a tympanal organ at each side of
the thorax~(Fig.\,\ref{fig:pathway}a). The tympanal membrane acts as a mechanical resonance filter:
Vibrations of specific frequencies are focused on different membrane areas,
while other frequencies are attenuated~(\mbox{\cite{michelsen1971frequency}};
\mbox{\cite{windmill2008time}}; \mbox{\cite{malkin2014energy}}). This
processing step can be approximated by an initial bandpass filter
the thorax~(Fig.\,\ref{fig:pathway}a). The tympanal membrane acts as a
mechanical resonance filter that focuses vibrations of specific frequencies on
different membrane areas while attenuating
others~(\bcite{michelsen1971frequency}; \bcite{windmill2008time};
\bcite{malkin2014energy}). This processing step can be approximated by an
initial bandpass filter
\begin{equation}
\filt(t)\,=\,\raw(t)\,*\,\bp, \qquad \fc\,=\,5\,\text{kHz},\,30\,\text{kHz}
\label{eq:bandpass}
\end{equation}
applied to the acoustic input signal $\raw(t)$. The auditory receptor neurons
connect directly to the tympanal membrane and transduce mechanical vibrations
into electro-chemical potentials. The receptor population is the substrate of
several known signal processing steps. First, the receptors extract
the signal envelope~(\mbox{\cite{machens2001discrimination}}), which likely
involves a rectifying nonlinearity~(\mbox{\cite{machens2001representation}}).
This can be modelled as full-wave rectification followed by lowpass filtering
connect directly to the tympanal membrane. Besides performing
mechano-electrical transduction, the receptor population is also the substrate
of several known processing steps. First, the receptors extract the signal
envelope~(\bcite{machens2001discrimination}), which likely involves a
rectifying nonlinearity~(\bcite{machens2001representation}). This can be
modelled as full-wave rectification followed by lowpass filtering
\begin{equation}
\env(t)\,=\,|\filt(t)|\,*\,\lp, \qquad \fc\,=\,500\,\text{Hz}
\label{eq:env}
\end{equation}
of the tympanal signal $\filt(t)$. Furthermore, the receptors exhibit a
sigmoidal response curve over logarithmically compressed intensity
levels~(\mbox{\cite{suga1960peripheral}}; \mbox{\cite{gollisch2002energy}}). In
the model, logarithmic compression is achieved by conversion to decibel scale
levels~(\bcite{suga1960peripheral}; \bcite{gollisch2002energy}). In the model,
logarithmic compression is achieved by conversion to decibel scale
\begin{equation}
\db(t)\,=\,10\,\cdot\,\dec \frac{\env(t)}{\dbref}, \qquad \dbref\,=\,\max[\env(t)]
\label{eq:log}
\end{equation}
relative to the maximum intensity $\dbref$ of the signal envelope $\env(t)$.
Next, the axons of the receptor neurons project into the metathoracic ganglion,
where they synapse onto local interneurons~(Fig.\,\ref{fig:pathway}b). Both the
auditory receptors~(\mbox{\cite{fisch2012channel}}) and the subsequent
interneurons~(\mbox{\cite{clemens2010intensity}}) display spike-frequency
adaptation.
The axons of the receptor neurons project into the metathoracic ganglion, where
they synapse onto local interneurons~(Fig.\,\ref{fig:pathway}b). Both the local
interneurons~(\bcite{hildebrandt2009origin}; \bcite{clemens2010intensity}) and,
to a lesser extent, the receptors themselves~(\bcite{fisch2012channel}) display
spike-frequency adaptation in response to sustained stimulation.
This behavior is crucial to render subsequent signal representations invariant
to variations in sound intensity.
"Pre-split portion" of the auditory pathway:\\
@@ -205,10 +227,10 @@ $\rightarrow$ Individual neuron-specific response traces from this stage onwards
Template matching by individual ANs\\
- Filter base (STA approximations): Set of Gabor kernels\\
- Gabor parameters: $\sigma, \phi, f$ $\rightarrow$ Determine kernel sign and lobe number
- Gabor parameters: $\ks, \kp, \kf$ $\rightarrow$ Determine kernel sign and lobe number
%
\begin{equation}
k(t)\,=\,e^{-\frac{t^{2}}{2\sigma^{2}}}\,\cdot\,\sin(2\pi f t\,+\,\phi)
k_i(t,\,\ks,\,\kf,\,\kp)\,=\,e^{-\frac{t^{2}}{2{\ks}^{2}}}\,\cdot\,\sin(2\pi\kf\,\cdot\,t\,+\,\kp)
\label{eq:gabor}
\end{equation}
%
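
\textbf{Illustrative special cases (sketch, not fitted to data):}\\
- The cases below only restate Eq.\,\ref{eq:gabor} for particular phase values, chosen here
purely for illustration, to show how $\kp$ sets kernel symmetry and sign
%
\begin{equation}
\begin{split}
\kp\,=\,0:\quad k_i(t)\,&=\,e^{-\frac{t^{2}}{2{\ks}^{2}}}\,\cdot\,\sin(2\pi\kf\,\cdot\,t) \qquad \text{(odd in $t$)}\\
\kp\,=\,\tfrac{\pi}{2}:\quad k_i(t)\,&=\,e^{-\frac{t^{2}}{2{\ks}^{2}}}\,\cdot\,\cos(2\pi\kf\,\cdot\,t) \qquad \text{(even in $t$)}\\
\kp\,\rightarrow\,\kp\,+\,\pi:\quad k_i(t)\,&\rightarrow\,-k_i(t) \qquad \text{(sign flip)}
\end{split}
\end{equation}
%
$\rightarrow$ For fixed $\kp$, the number of visible lobes grows with the product
$\ks\,\cdot\,\kf$, which sets how many carrier periods fit under the Gaussian envelope
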
@@ -225,7 +247,7 @@ Thresholding nonlinearity in ascending neurons (or further downstream)\\
$\rightarrow$ Shifted Heaviside step-function $\nl$ (or steep sigmoid threshold?)
%
\begin{equation}
\bi(t)\,=\,\begin{cases}
b_i(t,\,\thr)\,=\,\begin{cases}
\;1, \quad c_i(t)\,>\,\thr\\
\;0, \quad c_i(t)\,\leq\,\thr
\end{cases}
@@ -239,7 +261,7 @@ of feature values $\rightarrow$ Clusters in high-dimensional feature space\\
$\rightarrow$ Lowpass filter 1 Hz
%
\begin{equation}
\feat(t)\,=\,\bi(t)\,*\,\lp, \qquad \fc\,=\,1\,\text{Hz}
f_i(t)\,=\,b_i(t)\,*\,\lp, \qquad \fc\,=\,1\,\text{Hz}
\label{eq:lowpass}
\end{equation}
%
@@ -273,7 +295,7 @@ $\env(t)$ with ($\alpha>0$) and without ($\alpha=0$) song signal $s(t)$, assumin
\begin{equation}
\begin{split}
\db(t)\,&=\,\log \frac{\alpha\,\cdot\,s(t)\,+\,\eta(t)}{\dbref}\\
&=\,\log \frac{\alpha}{\dbref}\,+\,\log \big[s(t)\,+\,\frac{\eta(t)}{\alpha}\big]
&=\,\log \frac{\alpha}{\dbref}\,+\,\log \big[s(t)\,+\,\frac{\eta(t)}{\alpha}\big]
\end{split}
\label{eq:toy_log}
\end{equation}
@@ -290,7 +312,7 @@ interval $\thp$ ($0 \ll \thp < \frac{1}{\fc}$)
%
\begin{equation}
\begin{split}
\adapt(t)\,\approx\,\db(t)\,-\,\log \frac{\alpha}{\dbref}\,=\,\log \big[s(t)\,+\,\frac{\eta(t)}{\alpha}\big]
\adapt(t)\,\approx\,\db(t)\,-\,\log \frac{\alpha}{\dbref}\,=\,\log \big[s(t)\,+\,\frac{\eta(t)}{\alpha}\big]
\end{split}
\label{eq:toy_highpass}
\end{equation}
@@ -316,7 +338,7 @@ $\rightarrow$ Recurring trade-off: Equalizing signal intensity vs preserving ini
\subsection{Threshold nonlinearity \& temporal averaging}
Convolved $c_i(t)$ $\xrightarrow{\nl}$ Binary $\bi(t)$ $\xrightarrow{\lp}$ Feature $\feat(t)$
Convolved $c_i(t)$ $\xrightarrow{\nl}$ Binary $b_i(t)$ $\xrightarrow{\lp}$ Feature $f_i(t)$
\textbf{Thresholding component:}\\
- Within an observed time interval $T$, $c_i(t)$ follows probability density $\pc$\\
@@ -337,29 +359,29 @@ of time $T_1$ where $c_i(t)>\thr$ to total time $T$ due to normalization of $\pc
\end{equation}
%
\textbf{Averaging component:}\\
- Lowpass filter over binary response $\bi(t)$ (Eq.\,\ref{eq:lowpass}) can be
- Lowpass filter over binary response $b_i(t)$ (Eq.\,\ref{eq:lowpass}) can be
approximated as temporal averaging over a suitable time interval $\tlp$ ($\tlp > \frac{1}{\fc}$)\\
- Within $\tlp$, $\bi(t)$ takes a value of 1 ($c_i(t)>\thr$) for time $T_1$ ($T_1+T_0=\tlp$)
- Within $\tlp$, $b_i(t)$ takes a value of 1 ($c_i(t)>\thr$) for time $T_1$ ($T_1+T_0=\tlp$)
%
\begin{equation}
\feat(t)\,\approx\,\frac{1}{\tlp} \int_{t}^{t\,+\,\tlp} \bi(\tau)\,d\tau\,=\,\frac{T_1}{\tlp}
f_i(t)\,\approx\,\frac{1}{\tlp} \int_{t}^{t\,+\,\tlp} b_i(\tau)\,d\tau\,=\,\frac{T_1}{\tlp}
\label{eq:feat_avg}
\end{equation}
%
$\rightarrow$ Temporal averaging over $\bi(t)\in\{0,1\}$ (Eq.\,\ref{eq:binary}) gives
$\rightarrow$ Temporal averaging over $b_i(t)\in\{0,1\}$ (Eq.\,\ref{eq:binary}) gives
ratio of time $T_1$ where $c_i(t)>\thr$ to total averaging interval $\tlp$\\
$\rightarrow$ Feature $\feat(t)$ approximately represents supra-threshold fraction of $\tlp$
$\rightarrow$ Feature $f_i(t)$ approximately represents supra-threshold fraction of $\tlp$
\textbf{Combined result:}\\
- Feature $\feat(t)$ can be linked to the distribution of $c_i(t)$ using Eqs.\,\ref{eq:pdf_split} \& \ref{eq:feat_avg}
- Feature $f_i(t)$ can be linked to the distribution of $c_i(t)$ using Eqs.\,\ref{eq:pdf_split} \& \ref{eq:feat_avg}
%
\begin{equation}
\feat(t)\,\approx\,\int_{\thr}^{+\infty} \pclp\,dc_i\,=\,P(c_i\,>\,\thr,\,\tlp)
f_i(t)\,\approx\,\int_{\thr}^{+\infty} \pclp\,dc_i\,=\,P(c_i\,>\,\thr,\,\tlp)
\label{eq:feat_prop}
\end{equation}
%
$\rightarrow$ Because the integral over a probability density is a cumulative
probability, the value of feature $\feat(t)$ (temporal compression of $\bi(t)$)
probability, the value of feature $f_i(t)$ (temporal compression of $b_i(t)$)
at every time point $t$ signifies the probability that convolution output
$c_i(t)$ exceeds the threshold value $\thr$ during the corresponding averaging
interval $\tlp$
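
\textbf{Worked example (assumption: Gaussian $c_i$):}\\
- A sketch under a stated assumption, not a result from the model: suppose that within
$\tlp$ the values of $c_i(t)$ are approximately normally distributed, with hypothetical
mean $\mu_c$ and standard deviation $\sigma_c$ (both symbols introduced only for this example)\\
- Eq.\,\ref{eq:feat_prop} then has a closed form
%
\begin{equation}
f_i(t)\,\approx\,\int_{\thr}^{+\infty} \frac{1}{\sqrt{2\pi}\,\sigma_c}\,e^{-\frac{(c_i\,-\,\mu_c)^{2}}{2\sigma_c^{2}}}\,dc_i\,=\,\frac{1}{2}\,\operatorname{erfc}\left(\frac{\thr\,-\,\mu_c}{\sqrt{2}\,\sigma_c}\right)
\end{equation}
%
$\rightarrow$ Threshold at the mean ($\thr=\mu_c$): $f_i(t)\approx 0.5$\\
$\rightarrow$ Threshold one standard deviation above the mean ($\thr=\mu_c+\sigma_c$): $f_i(t)\approx 0.16$\\
$\rightarrow$ Under this assumption, the feature depends on $\thr$ only through the
standardized distance $(\thr-\mu_c)/\sigma_c$
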
@@ -369,25 +391,25 @@ interval $\tlp$
template waveform $k_i(t)$ and signal $\adapt(t)$ centered at time point $t$\\
$\rightarrow$ Based on amplitudes on a graded scale
- Feature $\feat(t)$ quantifies the probability that amplitudes of $c_i(t)$
- Feature $f_i(t)$ quantifies the probability that amplitudes of $c_i(t)$
exceed threshold value $\thr$ within interval $\tlp$ around time point $t$\\
$\rightarrow$ Based on binned amplitudes corresponding to one of two categorical states
$\rightarrow$ Deliberate loss of precise amplitude information\\
$\rightarrow$ Emphasis on temporal structure (ratio of $T_1$ over $\tlp$)
- Thresholding of $c_i(t)$ and subsequent temporal averaging of $\bi(t)$ to
obtain $\feat(t)$ constitutes a remapping of an amplitude-encoding quantity into a
- Thresholding of $c_i(t)$ and subsequent temporal averaging of $b_i(t)$ to
obtain $f_i(t)$ constitutes a remapping of an amplitude-encoding quantity into a
duty cycle-encoding quantity, mediated by threshold function $\nl$
- Different scales of $c_i(t)$ can result in similar $T_1$ segments depending
on the magnitude of the derivative of $c_i(t)$ in temporal proximity to time
points at which $c_i(t)$ crosses threshold value $\thr$\\
$\rightarrow$ The steeper the slope of $c_i(t)$, the less $T_1$ changes with scale variations\\
$\rightarrow$ If $T_1$ is invariant to scale variation in $c_i(t)$, then so is $\feat(t)$
$\rightarrow$ If $T_1$ is invariant to scale variation in $c_i(t)$, then so is $f_i(t)$
- Suggests a relatively simple rule for optimal choice of threshold value $\thr$:\\
$\rightarrow$ Find amplitude $c_i$ that maximizes absolute derivative of $c_i(t)$ over time\\
$\rightarrow$ Optimal with respect to intensity invariance of $\feat(t)$, not necessarily for
$\rightarrow$ Optimal with respect to intensity invariance of $f_i(t)$, not necessarily for
other criteria such as song-noise separation or diversity between features
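
\textbf{Worked example (assumption: sinusoidal $c_i(t)$):}\\
- A toy case to illustrate the rule above, not a general proof: assume
$c_i(t)\,=\,A\,\cdot\,\sin(2\pi f_0 t)$ with hypothetical amplitude $A>0$ and frequency $f_0$,
a threshold with $|\thr|<A$, and an averaging interval $\tlp$ that spans an integer number
of periods (or $\tlp \gg \frac{1}{f_0}$)\\
- Supra-threshold fraction (Eq.\,\ref{eq:feat_avg}) and its sensitivity to the scale $A$:
%
\begin{equation}
f_i\,\approx\,\frac{T_1}{\tlp}\,=\,\frac{1}{2}\,-\,\frac{1}{\pi}\,\arcsin\frac{\thr}{A}, \qquad
\frac{\partial f_i}{\partial A}\,=\,\frac{\thr}{\pi A\,\sqrt{A^{2}\,-\,\thr^{2}}}
\end{equation}
%
- Slope of $c_i(t)$ at the threshold crossings:
%
\begin{equation}
\left|\frac{dc_i}{dt}\right|_{c_i\,=\,\thr}\,=\,2\pi f_0\,\sqrt{A^{2}\,-\,\thr^{2}}
\end{equation}
%
$\rightarrow$ The slope is maximal at $\thr=0$, where $f_i\approx\frac{1}{2}$ for every $A$:
crossing at the steepest point makes $T_1$ exactly scale-invariant in this toy case\\
$\rightarrow$ For $|\thr|$ close to $A$, the slope at the crossing vanishes and
$\frac{\partial f_i}{\partial A}$ diverges: small scale changes strongly alter $T_1$
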
- Nonlinear operations can be used to detach representations from graded physical
@@ -396,9 +418,9 @@ stimulus (to facilitate categorical behavioral decision-making?):\\
$\rightarrow$ Closely following the AM of the acoustic stimulus\\
2) Quantify relevant stimulus properties on a graded scale: $c_i(t)$\\
$\rightarrow$ More decorrelated representation, compared to prior stages\\
3) Nonlinearity: Distinguish between "relevant vs irrelevant" values: $\bi(t)$\\
3) Nonlinearity: Distinguish between "relevant vs irrelevant" values: $b_i(t)$\\
$\rightarrow$ Trading a graded scale for two or more categorical states\\
4) Represent stimulus properties under relevance constraint: $\feat(t)$\\
4) Represent stimulus properties under relevance constraint: $f_i(t)$\\
$\rightarrow$ Graded again but highly decorrelated from the acoustic stimulus\\
5) Categorical behavioral decision-making requires further nonlinearities\\
$\rightarrow$ Parameters of a behavioral response may be graded (e.g. approach speed),