Began writing results :)

2026-02-23 16:48:53 +01:00
parent 1ea2081eab
commit c700e1723c
10 changed files with 197 additions and 118 deletions
--- a/main.tex
+++ b/main.tex
@@ -86,9 +86,12 @@
 \newcommand{\thr}{\Theta_i} % Step function threshold value
 \newcommand{\nl}{H(c_i\,-\,\thr)} % Shifted Heaviside step function

-% Math shorthands - Minor symbols and helpers:
-\newcommand{\svar}{\sigma_{\text{s}}^{2}} % Song signal variance
-\newcommand{\nvar}{\sigma_{\eta}^{2}} % Noise signal variance
+% Math shorthands - Intensity invariance analysis:
+\newcommand{\soc}{s} % Song component of synthetic mixture
+\newcommand{\noc}{\eta} % Noise component of synthetic mixture
+\newcommand{\sca}{\alpha} % Multiplicative scale of song component
+\newcommand{\svar}{\sigma_{\text{s}}^{2}} % Song component variance
+\newcommand{\nvar}{\sigma_{\eta}^{2}} % Noise component variance
 \newcommand{\pc}{p(c_i,\,T)} % Probability density (general interval)
 \newcommand{\pclp}{p(c_i,\,\tlp)} % Probability density (lowpass interval)

@@ -387,7 +390,7 @@ sigmoidal response curve over logarithmically compressed intensity
 levels~(\bcite{suga1960peripheral}; \bcite{gollisch2002energy}). In the model
 pathway, logarithmic compression is achieved by conversion to decibel scale
 \begin{equation}
-    \db(t)\,=\,10\,\cdot\,\dec \frac{\env(t)}{\dbref}, \qquad \dbref\,=\,\max[\env(t)]
+    \db(t)\,=\,10\,\cdot\,\dec \frac{\env(t)}{\dbref}, \qquad \dbref\,=\,\max\big[\env(t)\big]
    \label{eq:log}
 \end{equation}
 relative to the maximum intensity $\dbref$ of the signal envelope $\env(t)$.
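+As a brief worked example, assuming that $\dec$ denotes the base-10 logarithm,
+an envelope value at half the reference intensity maps to
+\begin{equation*}
+    10\,\cdot\,\dec \frac{0.5\,\cdot\,\dbref}{\dbref}\,=\,10\,\cdot\,\dec 0.5\,\approx\,-3.01\,\text{dB},
+\end{equation*}
+and since $\env(t)\,\leq\,\dbref$ by definition, $\db(t)\,\leq\,0$ for all $t$.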
@@ -492,12 +495,12 @@ the left of the two central lobes (odd kernels).
    \label{tab:gabor_phases}
 \end{table}
 \FloatBarrier
-These four groups of Gabor kernels allow for the extraction of different types
-of signal features, such as the presence of peaks (even, $+$), troughs (even,
-$-$), onsets (odd, $+$), and offsets (odd, $-$) at various time scales.
+These four major groups of Gabor kernels allow for the extraction of different
+types of signal features, such as the presence of peaks (even, $+$), troughs
+(even, $-$), onsets (odd, $+$), and offsets (odd, $-$) at various time scales.
 Following the convolutional template matching, each kernel-specific response
-$c_i(t)$ is passed through a shifted Heaviside step-function $\nl$ with threshold
-value $\thr$ to obtain a binary response
+$c_i(t)$ is passed through a shifted Heaviside step-function $\nl$ with
+threshold value $\thr$ to obtain a binary response
 \begin{equation}
    b_i(t,\,\thr)\,=\,\begin{cases}
        \;1, \quad c_i(t)\,>\,\thr\\
@@ -528,6 +531,10 @@ can be read out by a simple linear classifier.
    \includegraphics[width=\textwidth]{figures/fig_feat_stages.pdf}
    \caption{\textbf{Representations of a song of \textit{O. rufipes} during
                     the feature extraction stage.}
+                     Different colors indicate Gabor kernels with different
+                     lobe number $\kn$ and sign, with lighter colors for higher
+                     $\kn$~($1\,\leq\,\kn\,\leq\,4$; both $+$ and $-$ per $\kn$;
+                     two kernel widths $\kw$ of $4\,$ms and $32\,$ms per sign).
                     \textbf{a}:~Kernel-specific filter responses.
                     \textbf{b}:~Binary responses.
                     \textbf{c}:~Finalized features.
@@ -536,55 +543,62 @@ can be read out by a simple linear classifier.
 \end{figure}
 \FloatBarrier

-\section{Two mechanisms driving the emergence of intensity-invariant song representation}
+\section{Two mechanisms driving the emergence of intensity-invariant song representations}

-\textbf{Definition of invariance (general, systemic):}\\
-Invariance = Property of a system to maintain a stable output with respect to a
-set of relevant input parameters (variation to be represented) but irrespective
-of one or more other parameters (variation to be discarded)
-$\rightarrow$ Selective input-output decorrelation
+% Still missing the SNR analysis. Should be able to write around it for now.
+The robustness of song recognition is tied to the degree of intensity
+invariance of the finalized feature representation. Ideally, the values of each
+feature should depend only on the relative amplitude dynamics of the song
+pattern but not on the overall intensity level of the song. In the grasshopper,
+the emergence of intensity-invariant representations along the song recognition
+pathway is likely a distributed process that involves different neuronal
+populations, which raises the question of what the essential computational
+mechanisms are that drive this process. Within the model pathway, we identified
+two key mechanisms that render the song representation more invariant to
+variations in baseline intensity. The two mechanisms each comprise a nonlinear
+signal transformation followed by a linear signal transformation but differ in
+the specific operations and the neural substrate involved, as outlined in the
+following sections.

-\textbf{Definition of intensity invariance (context of neurons and songs):}\\
-Intensity invariance = Time scale-selective sensitivity to certain faster
-amplitude dynamics (song waveform, small-scale AM) and simultaneous
-insensitivity to slower, more sustained amplitude dynamics (transient baseline,
-large-scale AM, current overall intensity level)\\
-$\rightarrow$ Without time scale selectivity, any fully intensity-invariant
-output will be a flat line
+\subsection{Logarithmic compression \& spike-frequency adaptation}

-\subsection{Logarithmic scaling \& spike-frequency adaptation}
-
-Envelope $\env(t)$ $\xrightarrow{\text{dB}}$ Logarithmic $\db(t)$ $\xrightarrow{\hp}$ Adapted $\adapt(t)$
-
-- Rewrite signal envelope $\env(t)$ (Eq.\,\ref{eq:env}) as a synthetic mixture:\\
-1) Song signal $s(t)$ ($\svar=1$) with variable multiplicative scale $\alpha\geq0$\\
-2) Fixed-scale additive noise $\eta(t)$ ($\nvar=1$)
-%
+The first emergence of intensity invariance along the model pathway occurs
+during the preprocessing stage, in the transition from the signal envelope
+$\env(t)$ to the logarithmically scaled envelope $\db(t)$ and then to the
+intensity-adapted envelope $\adapt(t)$. In order to disentangle the interplay
+of logarithmic compression and adaptation, we can rewrite
+$\env(t)$~(Eq.\,\ref{eq:env}) as a synthetic mixture
 \begin{equation}
-    \env(t)\,=\,\alpha\,\cdot\,s(t)\,+\,\eta(t),\qquad \env(t)\,>\,0\enspace\forall\enspace t\,\in\,\mathbb{R}
+    \env(t)\,=\,\sca\,\cdot\,\soc(t)\,+\,\noc(t), \qquad \env(t)\,>\,0\enspace\forall\enspace t\,\in\,\mathbb{R}
    \label{eq:toy_env}
 \end{equation}
-%
-- Signal-to-noise ratio (SNR): Ratio of variances of synthetic mixture
-$\env(t)$ with ($\alpha>0$) and without ($\alpha=0$) song signal $s(t)$, assuming $s(t)\perp\eta(t)$
-%
+of a song component $\soc(t)$ with variable multiplicative scale $\sca\geq0$
+and a fixed-scale noise component $\noc(t)$. Both $\soc(t)$ and $\noc(t)$ are
+assumed to have unit variance~($\svar=\nvar=1$). If $\soc(t)$ and $\noc(t)$ are
+uncorrelated~($\soc(t)\perp\noc(t)$), the signal-to-noise ratio (SNR), defined
+as the ratio of the variances of the synthetic $\env(t)$ with ($\sca>0$) and
+without ($\sca=0$) song component $\soc(t)$, is given by
 \begin{equation}
    \text{SNR}\,=\,\frac{\sigma_{s+\eta}^{2}}{\nvar}\,=\,\frac{\alpha^{2}\,\cdot\,\svar\,+\,\nvar}{\nvar}\,=\,\alpha^{2}\,+\,1
    \label{eq:toy_snr}
 \end{equation}
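+The middle step of Eq.\,\ref{eq:toy_snr} follows from the variance of a sum. As
+a short sketch, writing $\operatorname{Cov}$ for the covariance (a notation
+introduced here for illustration only),
+\begin{equation*}
+    \sigma_{s+\eta}^{2}\,=\,\sca^{2}\,\cdot\,\svar\,+\,\nvar\,+\,2\,\sca\,\cdot\,\operatorname{Cov}\big[\soc(t),\,\noc(t)\big],
+\end{equation*}
+where the covariance term vanishes for uncorrelated components. For example,
+a scale of $\sca=3$ yields an SNR of $3^{2}\,+\,1\,=\,10$.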
-%
-\textbf{Logarithmic component:}\\
-- Simplify decibel transformation (Eq.\,\ref{eq:log}) and apply to synthetic $\env(t)$\\
-- Isolate scale $\alpha$ and reference $\dbref$ using logarithm product/quotient laws
-%
+Simplifying the decibel transformation~(Eq.\,\ref{eq:log}) and applying it to
+the synthetic $\env(t)$ allows the logarithmically scaled envelope $\db(t)$ to
+be expressed as a sum of two logarithmic terms
 \begin{equation}
    \begin{split}
        \db(t)\,&=\,\log \frac{\alpha\,\cdot\,s(t)\,+\,\eta(t)}{\dbref}\\
-        &=\,\log \frac{\alpha}{\dbref}\,+\,\log \big[s(t)\,+\,\frac{\eta(t)}{\alpha}\big]
+        &=\,\log \frac{\alpha}{\dbref}\,+\,\log \left[s(t)\,+\,\frac{\eta(t)}{\alpha}\right]
    \end{split}
    \label{eq:toy_log}
 \end{equation}
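+The intermediate step, sketched here for $\sca>0$ (the factorization is
+undefined for $\sca=0$), pulls the scale out of the numerator before the
+product law of logarithms is applied:
+\begin{equation*}
+    \log \frac{\sca\,\cdot\,\soc(t)\,+\,\noc(t)}{\dbref}\,=\,\log \left[\frac{\sca}{\dbref}\,\cdot\,\left(\soc(t)\,+\,\frac{\noc(t)}{\sca}\right)\right]
+\end{equation*}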
-%
+
+
+
+
+\textbf{Logarithmic component:}\\
+- Simplify decibel transformation (Eq.\,\ref{eq:log}) and apply to synthetic $\env(t)$\\
+- Isolate scale $\alpha$ and reference $\dbref$ using logarithm product/quotient laws
+
 $\rightarrow$ In log-space, a multiplicative scaling factor becomes additive\\
 $\rightarrow$ Allows for the separation of song signal $s(t)$ and its scale $\alpha$\\
 $\rightarrow$ Introduces scaling of noise term $\eta(t)$ by the inverse of $\alpha$\\
@@ -597,7 +611,7 @@ interval $\thp$ ($0 \ll \thp < \frac{1}{\fc}$)
 %
 \begin{equation}
    \begin{split}
-    \adapt(t)\,\approx\,\db(t)\,-\,\log \frac{\alpha}{\dbref}\,=\,\log \big[s(t)\,+\,\frac{\eta(t)}{\alpha}\big]
+    \adapt(t)\,\approx\,\db(t)\,-\,\log \frac{\alpha}{\dbref}\,=\,\log\left[s(t)\,+\,\frac{\eta(t)}{\alpha}\right]
    \end{split}
    \label{eq:toy_highpass}
 \end{equation}
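+As an illustrative limiting case of Eq.\,\ref{eq:toy_highpass}, assuming
+$\soc(t)>0$ and a bounded noise component $\noc(t)$, the noise term vanishes
+for large scales,
+\begin{equation*}
+    \lim_{\sca\,\to\,\infty}\,\log\left[\soc(t)\,+\,\frac{\noc(t)}{\sca}\right]\,=\,\log \soc(t),
+\end{equation*}
+so that $\adapt(t)$ no longer depends on $\sca$ in this regime.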
@@ -715,6 +729,20 @@ initiation of one behavior over another is categorical (e.g. approach/stay)

 \section{Conclusions \& outlook}

+\textbf{Definition of invariance (general, systemic):}\\
+Invariance = Property of a system to maintain a stable output with respect to a
+set of relevant input parameters (variation to be represented) but irrespective
+of one or more other parameters (variation to be discarded)
+$\rightarrow$ Selective input-output decorrelation
+
+\textbf{Definition of intensity invariance (context of neurons and songs):}\\
+Intensity invariance = Time scale-selective sensitivity to certain faster
+amplitude dynamics (song waveform, small-scale AM) and simultaneous
+insensitivity to slower, more sustained amplitude dynamics (transient baseline,
+large-scale AM, current overall intensity level)\\
+$\rightarrow$ Without time scale selectivity, any fully intensity-invariant
+output will be a flat line
+
 The model pathway includes a rather large number of Gabor kernels compared to
 the 15 to 20 ascending neurons in the grasshopper auditory
 system~(\bcite{stumpner1991auditory}).