[regression] n-dim is simply n times 1 dim

Jan Benda 2020-12-23 18:16:27 +01:00
parent 60a94c9ce6
commit 5a6cca59d3


@@ -259,6 +259,7 @@ There is no need to calculate this derivative analytically, because it
can be approximated numerically by the difference quotient
(Box~\ref{differentialquotientbox}) for small steps $\Delta c$:
\begin{equation}
+\label{costderivativediff}
\frac{{\rm d} f_{cost}(c)}{{\rm d} c} =
\lim\limits_{\Delta c \to 0} \frac{f_{cost}(c + \Delta c) - f_{cost}(c)}{\Delta c}
\approx \frac{f_{cost}(c + \Delta c) - f_{cost}(c)}{\Delta c}
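To illustrate this difference quotient, here is a minimal Python sketch; the data, the one-parameter model $y = c\,x^3$, and the step size dc are hypothetical choices for this example, not taken from the script:

import numpy as np

# hypothetical example: a single-parameter model y = c*x**3 fitted to data
x = np.linspace(0.1, 1.0, 40)
y = 2.0 * x**3

def f_cost(c):
    """Mean squared error between the model c*x**3 and the data y."""
    return np.mean((c * x**3 - y)**2)

def dcost_dc(c, dc=1e-4):
    """Difference quotient approximating d f_cost(c) / d c for a small step dc."""
    return (f_cost(c + dc) - f_cost(c)) / dc

print(dcost_dc(1.0))   # negative slope: the cost still decreases towards the true value c = 2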
@@ -443,7 +444,7 @@ are searching for the position of the bottom of the deepest valley
and
\[ \frac{\partial f(x,y)}{\partial y} = \lim\limits_{\Delta y \to 0} \frac{f(x, y + \Delta y) - f(x,y)}{\Delta y} \]
one can calculate the slopes in the directions of each of the
-variables by means of the respective difference quotient
+variables by means of the respective difference quotients
(see box~\ref{differentialquotientbox}). \vspace{1ex}
\begin{minipage}[t]{0.5\textwidth}
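A minimal Python sketch of these two difference quotients; the example function f(x, y) and the step sizes are made up for illustration and are not part of the script:

def partial_derivatives(f, x, y, dx=1e-6, dy=1e-6):
    """Approximate both partial derivatives of f(x, y) by difference quotients."""
    dfdx = (f(x + dx, y) - f(x, y)) / dx
    dfdy = (f(x, y + dy) - f(x, y)) / dy
    return dfdx, dfdy

# f(x, y) = x**2 + 3*y has the exact partial derivatives 2*x and 3
print(partial_derivatives(lambda x, y: x**2 + 3*y, 1.0, 2.0))   # approx. (2.0, 3.0)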
@@ -489,6 +490,10 @@ $p_j$ the respective partial derivatives as coordinates:
\label{gradient}
\nabla f_{cost}(\vec p) = \left( \frac{\partial f_{cost}(\vec p)}{\partial p_j} \right)
\end{equation}
+Despite the fancy words, this simply means that we need to calculate the
+derivatives in the very same way as we did for the case of a
+single parameter, \eqnref{costderivativediff}, for each parameter
+separately.
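In code, this per-parameter loop might look as follows; a Python sketch in which the quadratic cost function is a made-up example used only to check the result:

import numpy as np

def gradient(f_cost, p, dp=1e-4):
    """Numerical gradient of f_cost at the parameter vector p:
    one difference quotient per parameter, just as in the 1-dim case."""
    p = np.asarray(p, dtype=float)
    grad = np.zeros(len(p))
    for j in range(len(p)):
        pj = p.copy()
        pj[j] += dp                    # step only the j-th parameter
        grad[j] = (f_cost(pj) - f_cost(p)) / dp
    return grad

# the cost p[0]**2 + p[1]**2 has the exact gradient (2*p[0], 2*p[1])
print(gradient(lambda p: p[0]**2 + p[1]**2, [1.0, -2.0]))   # approx. [ 2., -4.]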
The iterative equation \eqref{gradientdescent} of the gradient descent
stays exactly the same, the only difference being that the current
@@ -497,7 +502,8 @@ parameter value $p_i$ becomes a vector $\vec p_i$ of parameter values:
\label{ndimgradientdescent}
\vec p_{i+1} = \vec p_i - \epsilon \cdot \nabla f_{cost}(\vec p_i)
\end{equation}
-The algorithm proceeds along the negative gradient
+For each parameter we subtract the corresponding derivative multiplied
+by $\epsilon$. The algorithm proceeds along the negative gradient
(\figref{powergradientdescentfig}).
For the termination condition we need the length of the gradient. In
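Putting the pieces together, a minimal Python sketch of the multidimensional descent that terminates once the length (Euclidean norm) of the gradient falls below a threshold; the step size eps, the threshold gradmin, and the power-law example data are hypothetical choices, not the script's actual values:

import numpy as np

def gradient_descent(f_cost, p0, eps=0.01, gradmin=1e-4, dp=1e-4, maxiter=10000):
    """Multidimensional gradient descent: step against the numerical gradient
    until its length drops below gradmin or maxiter is reached."""
    p = np.asarray(p0, dtype=float)
    for _ in range(maxiter):
        # numerical gradient: one difference quotient per parameter, as sketched above
        g = np.array([(f_cost(p + dp * np.eye(len(p))[j]) - f_cost(p)) / dp
                      for j in range(len(p))])
        if np.linalg.norm(g) < gradmin:   # termination: the gradient is short enough
            break
        p = p - eps * g                   # subtract eps times each partial derivative
    return p

# hypothetical usage: least-squares fit of y = c * x**a with parameters c and a
x = np.linspace(0.1, 1.0, 40)
y = 2.0 * x**3
cost = lambda p: np.mean((p[0] * x**p[1] - y)**2)
print(gradient_descent(cost, [1.0, 2.0]))   # should approach [2.0, 3.0]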