Good progress with cleaning up "IntInv vs SNR". Gonna merge with "IntInv vs IntInv" tomorrow.
73
main.tex
@@ -1719,18 +1719,7 @@ degree of temporal integration~(Section\,\ref{sec:constant_feat}).

\subsection{Intensity invariance versus SNR along the model pathway}

% % Establishing the principal trade-off (should maybe come later?):
% The output of a transformation is considered to be intensity-invariant if its
% intensity measure saturates for sufficiently large scales $\sca$, which in turn
% caps the output SNR to a constant value across these $\sca$. Otherwise, the
% output SNR will increase monotonically with $\sca$. The trade-off between
% intensity invariance and SNR refers to the principle that a transformation can
% either improve intensity invariance or maintain SNR --- it cannot do both at
% the same time. This principle is most likely not specific to the two mechanisms
% along the model pathway but rather a general property of transformations that
% equalize between different input intensities.

% Building a sufficient SNR "buffer":
% Building a sufficiently large SNR "buffer":
A stridulating grasshopper generates a song with a specific initial intensity,
which is steadily attenuated as the song propagates through the
environment~(\bcite{michelsen1978sound}). A listening grasshopper receives a
@@ -1742,12 +1731,12 @@ filtering of $\raw(t)$ into $\filt(t)$ likely improves the SNR by attenuating
frequencies outside the relevant range of grasshopper songs. The SNR is further
improved by the rectification and lowpass filtering of $\filt(t)$ into
$\env(t)$. The lower the cutoff frequency $\fc$ of the lowpass filter, the
higher the SNR of $\env(t)$ at a given $\sca$, although $\fc$ must also be
sufficiently high to preserve the amplitude dynamics of the song pattern.
Overall, the first processing steps along the pathway are not designed to
achieve intensity invariance but rather to improve the SNR of the song
representation beyond the initial SNR of $\raw(t)$.
higher the SNR of $\env(t)$ for a given $\sca$, although $\fc$ must also be
sufficiently high to preserve the amplitude dynamics of the song pattern. The
first processing steps along the pathway are hence designed to improve the SNR
of the song representation beyond the initial SNR of $\raw(t)$.

% Dependence of log-HP intensity invariance on sufficient SNR (+implications):
The first mechanism of intensity invariance consists of logarithmic compression
and adaptation of $\env(t)$ into $\adapt(t)$. In the absence of $\noc(t)$,
$\adapt(t)$ is a perfectly intensity-invariant representation of $\soc(t)$. In
@@ -1757,28 +1746,36 @@ $\raw(t)$ to $\env(t)$ thus serve to improve the intensity invariance of
$\adapt(t)$ by shifting the saturation point towards lower $\sca$. However,
this effect is limited --- if the SNR of $\raw(t)$ at the receiver's position
does not allow for a sufficiently high SNR of $\env(t)$, $\adapt(t)$ will not
be intensity-invariant. The initial song intensity that the sender can achieve
therefore determines the distance at which $\adapt(t)$ is intensity-invariant
to the receiver.
be intensity-invariant. In this case, the receiver is presumably less likely to
recognize $\raw(t)$ as a conspecific song. The limitation of the intensity
invariance of $\adapt(t)$ by the SNR of $\raw(t)$ might hence at least in part
be responsible for the limited maximum distance at which song recognition is
possible~(\bcite{lang2000acoustic}) and the selection towards song patterns
that are robust to noise masking~(\bcite{einhaupl2011attractiveness}).

Assuming that intensity invariance of $\adapt(t)$ is required for reliable song
recognition,


This might be a reason why robustness to noise masking is an
attractive property of male calling songs~(\bcite{einhaupl2011attractiveness}).

The saturation level of $\adapt(t)$,
unlike its saturation point, is independent of the SNR of $\env(t)$ because the
influence of $\noc(t)$ is negligible for sufficiently large $\sca$. The output
SNR of $\adapt(t)$ saturates at a comparatively low value of around 10. This might
in part be a consequence of the logarithm, which compresses different higher
intensities but also amplifies lower intensities, including the noise floor.
Both the saturation level and the saturation point of $\adapt(t)$ vary between
different species and individual songs. These differences are likely rooted in
the way in which logarithmic compression acts on the specific distribution of
$\env(t)$, which is determined by $\fc$ as well as the temporal structure and
frequency spectrum of the rectified $\filt(t)$.
% Trading SNR for log-HP intensity invariance (+variability, +general principle):
The SNR of each song representation prior to $\adapt(t)$ increases
monotonically with $\sca$~(excluding $0<\sca\ll1$, noise regime). These
representations maintain and improve the initial SNR of $\raw(t)$ and hence
never achieve intensity invariance. In contrast, the SNR of the
intensity-invariant $\adapt(t)$ never exceeds its saturation level even for
arbitrarily high $\sca$. The saturation level of $\adapt(t)$ varies across
species and songs. This variability is likely rooted in the way in which
logarithmic compression acts on the specific distribution of $\env(t)$, which
depends on the $\fc$ of the lowpass filter as well as the temporal structure
and frequency spectrum of the rectified $\filt(t)$. Overall, $\adapt(t)$ has
never been observed to exceed an SNR of around~10 across all songs. The low SNR
of $\adapt(t)$ partially results from the amplification of smaller values of
$\env(t)$ by the logarithm, which raises the noise floor of $\adapt(t)$. Still,
the reduction in SNR is substantial --- considering that the SNR of preceding
song representations has been orders of magnitude higher --- but is likely a
necessary price to pay for the intensity invariance of $\adapt(t)$. After all,
a transformation cannot compress a range of different input intensities into a
constant output intensity without sacrificing some of the corresponding input
SNR. Accordingly, the trade-off between intensity invariance and SNR is not
expected to be specific to the particular mechanisms along the pathway but
presumably applies to any transformation that achieves or improves intensity
invariance.
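
% Illustrative sketch under simplifying assumptions (not the model definition):
As a minimal sketch of this trade-off, assume for illustration only that the
envelope decomposes as $\env(t)=\sca\,s(t)+n(t)$, where $s(t)>0$ and $n(t)>0$
are hypothetical song and noise envelopes introduced solely for this example,
and that adaptation acts as an idealized highpass filter that removes any
offset that is constant in time. Under these assumptions,
\[
  \log\env(t) = \log\sca + \log s(t)
  + \log\!\left(1+\frac{n(t)}{\sca\,s(t)}\right).
\]
The offset $\log\sca$ is removed by adaptation, and the last term vanishes for
$\sca\,s(t)\gg n(t)$, so that the output approaches the highpass-filtered
$\log s(t)$ irrespective of $\sca$. In the opposite limit $\sca\to0$, the
output approaches the highpass-filtered $\log n(t)$. In this simplified
picture, neither limit depends on $\sca$, which is consistent with an output
SNR that saturates at a level set by the distributions of $\log s(t)$ and
$\log n(t)$ rather than by the input intensity.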

Thresholding and temporal averaging render feature $f_i(t)$
intensity-invariant for sufficiently large $\sca$. The trade-off between