ALMOST finished the methods section.
272
main.tex
@@ -79,18 +79,14 @@
|
||||
\newcommand{\kf}{\omega} % Unspecific Gabor kernel frequency
|
||||
\newcommand{\kp}{\phi} % Unspecific Gabor kernel phase
|
||||
\newcommand{\kn}{n} % Unspecific Gabor kernel lobe number
|
||||
% \newcommand{\ks}{s} % Unspecific Gabor kernel sign
|
||||
\newcommand{\kwi}{\kw_i} % Specific Gabor kernel width
|
||||
\newcommand{\kfi}{\kf_i} % Specific Gabor kernel frequency
|
||||
\newcommand{\kpi}{\kp_i} % Specific Gabor kernel phase
|
||||
\newcommand{\kni}{\kn_i} % Specific Gabor kernel lobe number
|
||||
% \newcommand{\ksi}{\ks_i} % Specific Gabor kernel sign
|
||||
|
||||
% Math shorthands - Auxiliary kernel parameters:
|
||||
\newcommand{\fsin}{f_{\text{sin}}} % Carrier frequency
|
||||
\newcommand{\rh}{h_{\text{rel}}} % Relative Gaussian height for FWRH
|
||||
\newcommand{\fwrh}{\text{FWRH}} % Gaussian full-width at relative height
|
||||
\newcommand{\off}{\beta_0} % Offset for linear frequency approximation
|
||||
\newcommand{\fdrm}{\text{FDRM}} % Gaussian full duration relative to maximum
|
||||
\newcommand{\rh}{h_{\text{rel}}} % Relative Gaussian height for FDRM calculation
|
||||
|
||||
% Math shorthands - Thresholding nonlinearity:
|
||||
\newcommand{\thr}{\Theta_i} % Step function threshold value
|
||||
@@ -287,6 +283,20 @@ approximation by basic mathematical operations. We then elaborate on the key
|
||||
mechanisms that drive the emergence of intensity-invariant song representations
|
||||
within the auditory pathway.
|
||||
|
||||
% RIPPED FROM RESULTS, MAYBE INTEGRATE SOMEWHERE HERE:
|
||||
% The robustness of song recognition is tied to the degree of intensity
|
||||
% invariance of the finalized feature representation. Ideally, the values of each
|
||||
% feature should depend only on the relative amplitude dynamics of the song
|
||||
% pattern but not on the overall intensity of the song. In the grasshopper, the
|
||||
% emergence of intensity-invariant representations along the song recognition
|
||||
% pathway likely is a distributed process that involves different neuronal
|
||||
% populations, which raises the question of what the essential computational
|
||||
% mechanisms are that drive this process. Within the model pathway, we identified
|
||||
% two key mechanisms that render the song representation more invariant to
|
||||
% intensity variations. The two mechanisms each comprise a nonlinear signal
|
||||
% transformation followed by a linear signal transformation but differ in the
|
||||
% specific operations involved, as outlined in the following sections.
|
||||
|
||||
% SCRAPPED UNTIL FURTHER NOTICE:
|
||||
% Multi-species, multi-individual communally inhabited environments\\
|
||||
% - Temporal overlap: Simultaneous singing across individuals/species common\\
|
||||
@@ -328,43 +338,38 @@ on PyPi.
|
||||
|
||||
\subsection{Functional model of the grasshopper song recognition pathway}
|
||||
|
||||
% Too long (no splitting, only pruning).
|
||||
The essence of constructing a functional model of a given system is to gain a
|
||||
sufficient understanding of the system's essential structural components and
|
||||
their presumed functional roles; and to then build a formal framework of
|
||||
manageable complexity around these two aspects. Anatomically, the organization
|
||||
of the grasshopper song recognition pathway can be outlined as a feed-forward
|
||||
network of three consecutive neuronal
|
||||
populations~(Fig.\,\mbox{\ref{fig:pathway}a-c}): Peripheral auditory receptor
|
||||
neurons, whose axons enter the ventral nerve cord at the level of the
|
||||
metathoracic ganglion; local interneurons that remain exclusively within the
|
||||
thoracic region of the ventral nerve cord; and ascending neurons projecting
|
||||
from the thoracic region towards the supraesophageal
|
||||
ganglion~(\bcite{rehbein1974structure}; \bcite{rehbein1976auditory};
|
||||
The anatomical organization of the grasshopper song recognition pathway can be
|
||||
outlined as a feed-forward network of three consecutive neuronal
|
||||
populations~(Fig.\,\ref{fig:pathway}a--c): Peripheral auditory receptor neurons,
|
||||
whose axons enter the ventral nerve cord (VNC) at the level of the metathoracic
|
||||
ganglion; local interneurons that remain exclusively within the thoracic region
|
||||
of the VNC; and ascending neurons projecting from the thoracic region towards
|
||||
the supraesophageal ganglion (SEG), or central
|
||||
brain~(\bcite{rehbein1974structure}; \bcite{rehbein1976auditory};
|
||||
\bcite{eichendorf1980projections}). The input to the network originates at the
|
||||
tympanal membrane, which acts as an acoustic receiver and is coupled to the
|
||||
dendritic endings of the receptor neurons~(\bcite{gray1960fine}). The outputs
|
||||
from the network converge in the supraesophageal ganglion, which is presumed to
|
||||
harbor the neuronal substrate for conspecific song recognition and response
|
||||
from the network converge in the SEG, which presumably harbors the neuronal
|
||||
substrate for conspecific song recognition and response
|
||||
initiation~(\bcite{ronacher1986routes}; \bcite{bauer1987separate};
|
||||
\bcite{bhavsar2017brain}). Functionally, the ascending neurons are the most
|
||||
diverse of the three populations along the pathway. Individual ascending
|
||||
neurons possess highly specific response properties that contrast with the
|
||||
rather homogeneous response properties of the preceding receptor neurons and
|
||||
local interneurons~(\bcite{clemens2011efficient}), indicating a transition from
|
||||
a uniform population-wide processing stream into several parallel branches.
|
||||
Based on these anatomical and physiological considerations, the overall
|
||||
structure of the model pathway is divided into two distinct
|
||||
stages~(Fig.\,\ref{fig:pathway}d). The preprocessing stage incorporates the
|
||||
known physiological processing steps at the levels of the tympanal membrane,
|
||||
the receptor neurons, and the local interneurons; and operates on
|
||||
one-dimensional signal representations. The feature extraction stage
|
||||
corresponds to the processing within the ascending neurons and further
|
||||
downstream towards the supraesophageal ganglion; and operates on
|
||||
high-dimensional signal representations. The details of each physiological
|
||||
processing step and its functional approximation within the two stages are
|
||||
outlined in the following sections.
|
||||
\bcite{bhavsar2017brain}).
|
||||
|
||||
Functionally, the ascending neurons are the most diverse of the three neuronal
|
||||
populations. Individual ascending neurons possess highly specific response
|
||||
properties that contrast with the rather homogeneous response properties of the
|
||||
preceding receptor neurons and local
|
||||
interneurons~(\bcite{clemens2011efficient}), which indicates a transition from
|
||||
a uniform population-wide processing stream into several parallel branches.
|
||||
Accordingly, the model pathway is divided into two distinct
|
||||
stages~(Fig.\,\ref{fig:pathway}d): The preprocessing stage incorporates the
|
||||
processing steps at the levels of the tympanal membrane, the receptor neurons,
|
||||
and the local interneurons; and operates on one-dimensional signal
|
||||
representations~(Fig.\,\ref{fig:stages_pre}). The feature extraction stage
|
||||
corresponds to the processing within the ascending neurons and further
|
||||
downstream towards the SEG; and operates on high-dimensional signal
|
||||
representations~(Fig.\,\ref{fig:stages_feat}). The details of each
|
||||
physiological processing step and its functional approximation are described in
|
||||
the following sections.
|
||||
\begin{figure}[!ht]
|
||||
\centering
|
||||
\includegraphics[width=\textwidth]{figures/fig_auditory_pathway.pdf}
|
||||
@@ -389,53 +394,54 @@ outlined in the following sections.
|
||||
|
||||
\subsubsection{Population-driven signal preprocessing}
|
||||
|
||||
Grasshoppers receive airborne sound waves by a tympanal organ at either side of
|
||||
Grasshoppers receive airborne sound waves via a tympanal organ on each side of
|
||||
the body. The tympanal membrane acts as a mechanical resonance filter for
|
||||
sound-induced vibrations~(\bcite{windmill2008time}; \bcite{malkin2014energy}).
|
||||
Vibrations that fall within specific frequency bands are focused on different
|
||||
membrane areas, while others are attenuated. This processing step can be
|
||||
approximated by an initial bandpass filter
|
||||
approximated by an initial bandpass filter~(Fig.\,\ref{fig:stages_pre}a)
|
||||
applied to the acoustic input signal $\raw(t)$:
|
||||
\begin{equation}
|
||||
\filt(t)\,=\,\raw(t)\,*\,\bp, \qquad \fc\,=\,5\,\text{kHz},\,30\,\text{kHz}
|
||||
\label{eq:bandpass}
|
||||
\end{equation}
|
||||
applied to the acoustic input signal $\raw(t)$. The auditory receptor neurons
|
||||
transduce the vibrations of the tympanal membrane into sequences of action
|
||||
potentials. Thereby, they encode the amplitude modulation, or envelope, of the
|
||||
signal~(\bcite{machens2001discrimination}), which likely involves a rectifying
|
||||
nonlinearity~(\bcite{machens2001representation}). This can be modelled as
|
||||
full-wave rectification followed by lowpass filtering
|
||||
The receptor neurons transduce the vibrations of the tympanal membrane into
|
||||
sequences of action potentials. They thereby encode the amplitude modulation,
|
||||
or envelope, of the signal~(\bcite{machens2001discrimination}), which likely
|
||||
involves a rectifying nonlinearity~(\bcite{machens2001representation}). The
|
||||
extraction of the signal envelope~(Fig.\,\ref{fig:stages_pre}b) can be modeled
|
||||
as full-wave rectification followed by lowpass filtering of the tympanal signal
|
||||
$\filt(t)$:
|
||||
\begin{equation}
|
||||
\env(t)\,=\,|\filt(t)|\,*\,\lp, \qquad \fc\,=\,250\,\text{Hz}
|
||||
\label{eq:env}
|
||||
\end{equation}
|
||||
of the tympanal signal $\filt(t)$. Furthermore, the receptors exhibit a
|
||||
sigmoidal response curve over logarithmically compressed intensity
|
||||
levels~(\bcite{suga1960peripheral}; \bcite{gollisch2002energy}). In the model
|
||||
pathway, logarithmic compression is achieved by conversion to decibel scale
|
||||
Furthermore, the receptors exhibit a sigmoidal response curve over
|
||||
logarithmically compressed stimulus intensities~(\bcite{suga1960peripheral};
|
||||
\bcite{gollisch2002energy}). In the model pathway, logarithmic
|
||||
compression~(Fig.\,\ref{fig:stages_pre}c) is achieved by conversion to decibel
|
||||
scale
|
||||
\begin{equation}
|
||||
\db(t)\,=\,20\,\cdot\,\dec \frac{\env(t)}{\dbref}, \qquad \dbref\,=\,1
|
||||
\label{eq:log}
|
||||
\end{equation}
|
||||
relative to the common reference intensity $\dbref$.
|
||||
Both the receptor neurons~(\bcite{romer1976informationsverarbeitung};
|
||||
\bcite{gollisch2004input}; \bcite{fisch2012channel}) and, on a larger scale,
|
||||
the subsequent local interneurons~(\bcite{hildebrandt2009origin};
|
||||
\bcite{clemens2010intensity}) adapt their firing rates in response to sustained
|
||||
stimulus intensity levels, which allows for the robust encoding of faster
|
||||
amplitude modulations against a slowly changing overall baseline intensity.
|
||||
Functionally, the adaptation mechanism resembles a highpass filter
|
||||
relative to the common reference intensity $\dbref$. Both the receptor
|
||||
neurons~(\bcite{romer1976informationsverarbeitung}; \bcite{gollisch2004input};
|
||||
\bcite{fisch2012channel}) and, on a larger scale, the subsequent local
|
||||
interneurons~(\bcite{hildebrandt2009origin}; \bcite{clemens2010intensity})
|
||||
adapt their firing rates in response to sustained stimulus intensities, which
|
||||
allows for the robust encoding of faster amplitude modulations against a slowly
|
||||
changing overall baseline intensity. Functionally, the adaptation mechanism
|
||||
resembles a highpass filter~(Fig.\,\ref{fig:stages_pre}d) over the
|
||||
logarithmically compressed envelope $\db(t)$:
|
||||
\begin{equation}
|
||||
\adapt(t)\,=\,\db(t)\,*\,\hp, \qquad \fc\,=\,10\,\text{Hz}
|
||||
\label{eq:highpass}
|
||||
\end{equation}
|
||||
over the logarithmically scaled envelope $\db(t)$. This processing step
|
||||
concludes the preprocessing stage of the model pathway. The resulting
|
||||
intensity-adapted envelope $\adapt(t)$ is then passed on from the local
|
||||
interneurons to the ascending neurons, where it serves as the basis for the
|
||||
following feature extraction stage.
|
||||
|
||||
% Cite somewhere:
|
||||
This processing step concludes the preprocessing stage of the model pathway.
|
||||
The resulting intensity-adapted envelope $\adapt(t)$ is then passed on from the
|
||||
local interneurons to the ascending neurons, where it serves as the basis for
|
||||
the following feature extraction stage.
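
The combination of logarithmic compression and highpass filtering can be
illustrated with a brief worked example. Assume, purely for illustration, that
the envelope $\env(t)$ is rescaled by a constant factor $\alpha\,>\,0$ (a
symbol that is not part of the model definition). According to
Eq.\,\ref{eq:log}, the rescaling becomes an additive offset on decibel scale:
\[
    20\,\cdot\,\dec \frac{\alpha\,\cdot\,\env(t)}{\dbref}
    \,=\,\db(t)\,+\,20\,\cdot\,\dec \alpha
\]
Doubling the envelope amplitude ($\alpha\,=\,2$), for example, shifts $\db(t)$
by approximately $6\,\text{dB}$. Under the additional assumption that the
highpass filter in Eq.\,\ref{eq:highpass} fully rejects constant signal
components, such a time-invariant offset does not carry over to $\adapt(t)$.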
|
||||
\begin{figure}[!ht]
|
||||
\centering
|
||||
\includegraphics[width=\textwidth]{figures/fig_pre_stages.pdf}
|
||||
@@ -453,59 +459,71 @@ following feature extraction stage.
|
||||
\subsubsection{Feature extraction by individual neurons}
|
||||
|
||||
The ascending neurons extract and encode a number of different features of the
|
||||
preprocessed signal. As a population, they hence represent the signal in a
|
||||
higher-dimensional space than the preceding receptor neurons and local
|
||||
interneurons. Each ascending neuron is assumed to scan the signal for a
|
||||
specific template pattern, which can be thought of as a kernel of a particular
|
||||
structure and on a particular time scale. This process, known as template
|
||||
matching, can be modelled as a convolution
|
||||
preprocessed signal, and hence represent the signal in a higher-dimensional
|
||||
space than the preceding receptor neurons and local interneurons. Each
|
||||
ascending neuron is assumed to scan the signal for a specific template pattern,
|
||||
which can be thought of as a kernel of a particular structure and on a
|
||||
particular time scale. This process, known as template matching, can be
|
||||
modeled as a convolution of the intensity-adapted envelope $\adapt(t)$ with a
|
||||
kernel $k_i(t)$ specific to the $i$-th ascending neuron:
|
||||
\begin{equation}
|
||||
c_i(t)\,=\,\adapt(t)\,*\,k_i(t)
|
||||
= \infint \adapt(\tau)\,\cdot\,k_i(t\,-\,\tau)\,d\tau
|
||||
\label{eq:conv}
|
||||
\end{equation}
|
||||
of the intensity-adapted envelope $\adapt(t)$ with a kernel $k_i(t)$ per
|
||||
ascending neuron. We use Gabor kernels as basis functions for creating
|
||||
different template patterns. An arbitrary one-dimensional, real Gabor kernel is
|
||||
generated by multiplication of a Gaussian envelope and a sinusoidal carrier
|
||||
We use Gabor kernels as basis functions for creating different template
|
||||
patterns. An arbitrary one-dimensional, real Gabor kernel is generated by
|
||||
multiplication of a Gaussian envelope with standard deviation or kernel width
|
||||
$\kwi$ and a sinusoidal carrier with frequency $\kfi$ and phase $\kpi$:
|
||||
\begin{equation}
|
||||
k_i(t,\,\kwi,\,\kfi,\,\kpi)\,=\,e^{-\frac{t^{2}}{2{\kwi}^{2}}}\,\cdot\,\sin(\kfi\,t\,+\,\kpi), \qquad \kfi\,=\,2\pi\fsin
|
||||
k_i(t,\,\kwi,\,\kfi,\,\kpi)\,=\,e^{-\frac{t^{2}}{2{\kwi}^{2}}}\,\cdot\,\sin(\kfi\,t\,+\,\kpi), \qquad \kfi\,=\,2\pi f_{\text{sin}_i}
|
||||
\label{eq:gabor}
|
||||
\end{equation}
|
||||
with Gaussian standard deviation or kernel width $\kwi$, carrier frequency
|
||||
$\kfi$, and carrier phase $\kpi$. Different combinations of $\kw$ and $\kf$
|
||||
result in Gabor kernels with different lobe number $\kn$, which is the number
|
||||
of half-periods of the carrier that fit under the Gaussian envelope within
|
||||
reasonable limits of attenuation. The interval under the Gaussian envelope that
|
||||
contains the relevant lobes of the kernel can be defined as Gaussian full-width
|
||||
measured at relative peak height $\rh$
|
||||
Different combinations of $\kwi$ and $\kfi$ result in Gabor kernels with
|
||||
different lobe number $\kni$, which is the number of half-periods of the
|
||||
carrier that fit under the Gaussian envelope within reasonable limits of
|
||||
attenuation. The time window under the Gaussian envelope that contains the
|
||||
relevant lobes of the kernel can be defined as the Gaussian full duration at height
|
||||
$\rh$ relative to the maximum of the Gaussian:
|
||||
\begin{equation}
|
||||
\fwrh(\kw,\,\rh)\,=\,2\,\cdot\,\sqrt{-2\,\cdot\,\ln \rh}\cdot\,\kw, \qquad \rh\,\in\,(0,\,1]
|
||||
\fdrm(\kwi,\,\rh)\,=\,2\,\cdot\,\sqrt{-2\,\cdot\,\ln \rh}\cdot\,\kwi, \qquad \rh\,\in\,(0,\,1]
|
||||
\label{eq:fdrm}
|
||||
\end{equation}
|
||||
% Yes, FDRM is a hideous acronym. Based on the common "full width at half
|
||||
% maximum" (FWHM) and adjusted because "full duration at half maximum" (FDHM)
|
||||
% is apparently preferred in a temporal context. Alternatively, "w_\text{gauss}"?
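
Eq.\,\ref{eq:fdrm} follows from solving the Gaussian envelope in
Eq.\,\ref{eq:gabor} for the two time points at which it has decayed to $\rh$:
\[
    e^{-\frac{t^{2}}{2{\kwi}^{2}}}\,=\,\rh
    \quad\Longrightarrow\quad
    t\,=\,\pm\,\sqrt{-2\,\cdot\,\ln \rh}\,\cdot\,\kwi
\]
The distance between these two time points is $\fdrm$. As an illustration with
an arbitrarily chosen value (not necessarily the one used in the model),
$\rh\,=\,0.5$ yields
$\fdrm\,=\,2\,\cdot\,\sqrt{2\,\cdot\,\ln 2}\,\cdot\,\kwi\,\approx\,2.355\,\kwi$,
which is the familiar full width at half maximum of a Gaussian.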
|
||||
With this, an appropriate carrier frequency $f_{\text{sin}_i}$ (and hence
$\kfi\,=\,2\pi f_{\text{sin}_i}$) for obtaining a Gabor kernel with width
$\kwi$ and desired lobe number $\kni$ can be approximated as
\begin{equation}
f_{\text{sin}_i}(\kni,\,\kwi,\,\rh)\,=\,\frac{0.5\,\cdot\,(\kni\,+\,\beta_0)}{\fdrm(\kwi,\,\rh)}, \qquad \kni\,\in\,\mathbb{Z},\enspace\kni\,\geq\,2
|
||||
\label{eq:gabor_freq}
|
||||
\end{equation}
|
||||
With this, an appropriate carrier frequency $\kf$ for obtaining a Gabor kernel
|
||||
with width $\kw$ and desired lobe number $\kn$ can be approximated as
|
||||
% \begin{equation}
|
||||
% \kf(\kn,\,\fwrh)\,=\,\frac{0.5\,\cdot\,\kn\,+\,\off}{\fwrh}, \qquad \kn\,\geq\,2\enspace\forall\enspace \kn\,\in\,\mathbb{Z}
|
||||
% \kfi(\kni,\,\kwi,\,\rh)\,=\,\frac{0.5\,\cdot\,(\kni\,+\,\beta_0)}{2\,\cdot\,\sqrt{-2\,\cdot\,\ln \rh}\cdot\kwi}, \qquad \kni\,\geq\,2\enspace\forall\enspace \kni\,\in\,\mathbb{Z}
|
||||
% \end{equation}
|
||||
\begin{equation}
|
||||
\kf(\kn,\,\kw,\,\rh)\,=\,\frac{\kn\,+\,\off}{4\,\cdot\,\sqrt{-2\,\cdot\,\ln \rh}\,\cdot\,\kw}, \qquad \kn\,\geq\,2\enspace\forall\enspace \kn\,\in\,\mathbb{Z}
|
||||
\end{equation}
|
||||
where $\off$ is a small positive offset to the near-linear relationship between
|
||||
$\kf$ and $\kn$ to balance the amplitude of the $\kn$ desired lobes of the
|
||||
kernel --- which should be maximized --- against the amplitude of the
|
||||
next-outer lobes, which should not exceed the threshold value determined by
|
||||
$\rh$. For $\kn=1$, carrier frequency $\kf$ is set to zero, which results in a
|
||||
simple Gaussian kernel. Carrier phase $\kp$ determines the position of the
|
||||
kernel lobes relative to the kernel center. By setting $\kp$ to one of only
|
||||
four specific phase values~(Tab.\,\ref{tab:gabor_phases}), we restrict the
|
||||
Gabor kernels to be either even functions~(mirror-symmetric, uneven $\kn$) or
|
||||
odd functions~(point-symmetric, even $\kn$) with either positive or negative
|
||||
sign, which refers to the sign of the kernel's central lobe (even kernels) or
|
||||
the left of the two central lobes (odd kernels).
|
||||
The relationship between $\kfi$ and $\kni$ is approximately linear except for
|
||||
small $\kni$. The offset term $\beta_0\approx0.5$ was added to balance the
|
||||
amplitudes of the $\kni$ desired lobes of the kernel --- which should be
|
||||
maximized --- against the amplitudes of the next-outer lobes, which should not
|
||||
exceed the threshold value determined by $\rh$. Note that simple Gaussian
|
||||
kernels with $\kni=1$ can be obtained by setting the carrier frequency to
|
||||
$\kfi=0$ and are hence not covered by Eq.\,\ref{eq:gabor_freq}.
|
||||
|
||||
Carrier phase $\kpi$ determines the position of the kernel lobes relative to
|
||||
the kernel center. We restrict the Gabor kernels to be either even or odd
|
||||
functions by setting $\kpi$ to one of only four specific phase
|
||||
values~(Tab.\,\ref{tab:gabor_phases}). Even Gabor kernels are mirror-symmetric
and have an odd lobe number $\kni$, whereas odd Gabor kernels are
point-symmetric and have an even lobe number $\kni$. Both even and odd kernels
can have either positive or negative sign,
|
||||
which refers to the sign of the kernel's central lobe (even kernels) or the
|
||||
left of the two central lobes (odd kernels). These four major groups of Gabor
|
||||
kernels allow for the extraction of different types of signal features, such as
|
||||
the presence of peaks (even, $+$), troughs (even, $-$), onsets (odd, $+$), and
|
||||
offsets (odd, $-$) at various time scales.
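
The symmetry of the kernels follows directly from Eq.\,\ref{eq:gabor}, because
the Gaussian envelope is an even function of $t$ and the symmetry is therefore
set by the carrier alone. As a sketch for phase values in $[0,\,2\pi)$ and
assuming $\kfi\,>\,0$ (the values actually used are those listed in
Tab.\,\ref{tab:gabor_phases}):
\[
    \sin(\kfi\,t\,+\,\kpi)\,=\,\begin{cases}
        \;\sin(\kfi\,t), \quad \kpi\,=\,0\\
        \;\cos(\kfi\,t), \quad \kpi\,=\,\pi/2\\
        \;-\sin(\kfi\,t), \quad \kpi\,=\,\pi\\
        \;-\cos(\kfi\,t), \quad \kpi\,=\,3\pi/2
    \end{cases}
\]
The cosine carriers are even in $t$ and place a lobe at the kernel center,
which is positive for $\cos(\kfi\,t)$ and negative for $-\cos(\kfi\,t)$. The
sine carriers are odd in $t$ and place a zero crossing at the kernel center,
with $\sin(\kfi\,t)$ being negative immediately left of the center and
$-\sin(\kfi\,t)$ being positive there.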
|
||||
\FloatBarrier
|
||||
\begin{table}[!ht]
|
||||
\centering
|
||||
\captionsetup{width=.46\textwidth}
|
||||
\captionsetup{width=.45\textwidth}
|
||||
\caption{Values of phase $\kp$ that are specific to the four major groups
|
||||
of Gabor kernels.}
|
||||
\begin{tabular}{|ccc|}
|
||||
@@ -519,13 +537,10 @@ the left of the two central lobes (odd kernels).
|
||||
\label{tab:gabor_phases}
|
||||
\end{table}
|
||||
\FloatBarrier
|
||||
These four major groups of Gabor kernels allow for the extraction of different
|
||||
types of signal features, such as the presence of peaks (even, $+$), troughs
|
||||
(even, $-$), onsets (odd, $+$), and offsets (odd, $-$) at various time scales.
|
||||
% Add kernel normalization here.
|
||||
Following the convolutional template matching, each kernel-specific response
|
||||
$c_i(t)$ is passed through a shifted Heaviside step-function $\nl$ with
|
||||
threshold value $\thr$ to obtain a binary response
|
||||
Following the convolutional template matching~(Fig.\,\ref{fig:stages_feat}a),
|
||||
each kernel-specific response $c_i(t)$ is passed through a shifted Heaviside
|
||||
step function $\nl$ with threshold value $\thr$ to obtain a binary
|
||||
response~(Fig.\,\ref{fig:stages_feat}b):
|
||||
\begin{equation}
|
||||
b_i(t,\,\thr)\,=\,\begin{cases}
|
||||
\;1, \quad c_i(t)\,>\,\thr\\
|
||||
@@ -533,12 +548,12 @@ threshold value $\thr$ to obtain a binary response
|
||||
\end{cases}
|
||||
\label{eq:binary}
|
||||
\end{equation}
|
||||
which can be thought of as a categorization into ``relevant'' and ``irrelevant''
|
||||
response values. In the grasshopper, these thresholding nonlinearities might
|
||||
either be part of the processing within the ascending neurons or take place
|
||||
further downstream~(SOURCE). Finally, the responses of the ascending neurons
|
||||
are assumed to be integrated somewhere in the supraesophageal
|
||||
ganglion~(\bcite{ronacher1986routes}; \bcite{bauer1987separate};
|
||||
The thresholding of $c_i(t)$ into $b_i(t)$ can be thought of as a
|
||||
categorization into ``relevant'' and ``irrelevant'' response values.
|
||||
% It is unclear whether such a thresholding nonlinearity is actually implemented
|
||||
% either by the ascending neurons or at some point further downstream in the SEG.
|
||||
Finally, the responses of the ascending neurons are assumed to be integrated
|
||||
somewhere in the SEG~(\bcite{ronacher1986routes}; \bcite{bauer1987separate};
|
||||
\bcite{bhavsar2017brain}). This processing step can be approximated as temporal
|
||||
averaging of the binary responses $b_i(t)$ by a lowpass filter
|
||||
\begin{equation}
|
||||
@@ -752,26 +767,23 @@ array was moved as close to the grasshopper as possible without interrupting
|
||||
its song production, which amounted to an offset distance of approximately 10\,cm
|
||||
between the animal and the leading microphone. Care was taken to maintain a
|
||||
stable position and height of the microphone array during recording. The
|
||||
resulting recordings were then processed through the model pathway and analysed
|
||||
resulting recordings were then processed through the model pathway and analyzed
|
||||
according to the procedure described in Section~\ref{sec:intensity_measures}.
|
||||
|
||||
\section{Results}
|
||||
|
||||
\subsection{Mechanisms driving the emergence of intensity invariance}
|
||||
|
||||
% Still missing the SNR analysis. Should be able to write around it for now.
|
||||
The robustness of song recognition is tied to the degree of intensity
|
||||
invariance of the finalized feature representation. Ideally, the values of each
|
||||
feature should depend only on the relative amplitude dynamics of the song
|
||||
pattern but not on the overall intensity of the song. In the grasshopper, the
|
||||
emergence of intensity-invariant representations along the song recognition
|
||||
pathway likely is a distributed process that involves different neuronal
|
||||
populations, which raises the question of what the essential computational
|
||||
mechanisms are that drive this process. Within the model pathway, we identified
|
||||
two key mechanisms that render the song representation more invariant to
|
||||
intensity variations. The two mechanisms each comprise a nonlinear signal
|
||||
transformation followed by a linear signal transformation but differ in the
|
||||
specific operations involved, as outlined in the following sections.
|
||||
It is not necessary to test each processing step along the model pathway for
|
||||
intensity invariance. Instead, we can focus on those steps that involve
|
||||
nonlinear transformations, since these are the only steps that can potentially
|
||||
change the dependency on scale $\sca$ between the input and output
|
||||
representations. Overall, there are three nonlinear transformations along the
|
||||
model pathway: Full-wave rectification during envelope extraction, logarithmic
|
||||
compression, and the thresholding nonlinearity during feature extraction. In
|
||||
the following, we analyze the effects of each of these transformations on the
|
||||
intensity and SNR of the resulting representations as well as their potential
|
||||
contribution to intensity invariance.
|
||||
|
||||
\subsubsection{Full-wave rectification \& lowpass filtering}
|
||||
|
||||
|
||||