[regression] n-dim is simply n times 1 dim
parent 60a94c9ce6
commit 5a6cca59d3
@@ -259,6 +259,7 @@ There is no need to calculate this derivative analytically, because it
can be approximated numerically by the difference quotient
(Box~\ref{differentialquotientbox}) for small steps $\Delta c$:
\begin{equation}
+\label{costderivativediff}
\frac{{\rm d} f_{cost}(c)}{{\rm d} c} =
\lim\limits_{\Delta c \to 0} \frac{f_{cost}(c + \Delta c) - f_{cost}(c)}{\Delta c}
\approx \frac{f_{cost}(c + \Delta c) - f_{cost}(c)}{\Delta c}
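As an illustration only (not taken from the diff itself), this approximation maps directly onto code. A minimal Python sketch, in which the names derivative and dc are placeholders chosen for this example:

\begin{verbatim}
def derivative(f_cost, c, dc=1e-4):
    # difference quotient: slope of f_cost between c and c + dc
    return (f_cost(c + dc) - f_cost(c)) / dc

# the derivative of c**2 is 2*c, so this prints a value close to 4.0
print(derivative(lambda c: c**2, 2.0))
\end{verbatim}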
@@ -443,7 +444,7 @@ are searching for the position of the bottom of the deepest valley
and
\[ \frac{\partial f(x,y)}{\partial y} = \lim\limits_{\Delta y \to 0} \frac{f(x, y + \Delta y) - f(x,y)}{\Delta y} \]
one can calculate the slopes in the directions of each of the
-variables by means of the respective difference quotient
+variables by means of the respective difference quotients
(see box~\ref{differentialquotientbox}). \vspace{1ex}
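Again for illustration only (not part of the commit), both partial derivatives can be approximated with one difference quotient per variable; the function partial_derivatives and the step sizes dx, dy are placeholders:

\begin{verbatim}
def partial_derivatives(f, x, y, dx=1e-4, dy=1e-4):
    # difference quotient in each variable, holding the other one fixed
    dfdx = (f(x + dx, y) - f(x, y)) / dx
    dfdy = (f(x, y + dy) - f(x, y)) / dy
    return dfdx, dfdy

# for f(x,y) = x*y the slopes at (2, 3) are y = 3 and x = 2
print(partial_derivatives(lambda x, y: x * y, 2.0, 3.0))
\end{verbatim}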
\begin{minipage}[t]{0.5\textwidth}
@@ -489,6 +490,10 @@ $p_j$ the respective partial derivatives as coordinates:
\label{gradient}
\nabla f_{cost}(\vec p) = \left( \frac{\partial f_{cost}(\vec p)}{\partial p_j} \right)
\end{equation}
+Despite the fancy words this simply means that we need to calculate the
+derivatives in the very same way as we have done for the case of a
+single parameter, \eqnref{costderivativediff}, for each parameter
+separately.
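A possible Python sketch of this per-parameter computation (an assumption for illustration, not code from the commit; the helper name gradient, the step size dp, and the use of numpy are all choices made here):

\begin{verbatim}
import numpy as np

def gradient(f_cost, p, dp=1e-4):
    # one difference quotient per parameter, all other parameters held fixed
    p = np.asarray(p, dtype=float)
    grad = np.zeros(len(p))
    for j in range(len(p)):
        step = np.zeros(len(p))
        step[j] = dp               # step only along parameter j
        grad[j] = (f_cost(p + step) - f_cost(p)) / dp
    return grad
\end{verbatim}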
The iterative equation \eqref{gradientdescent} of the gradient descent
stays exactly the same, with the only difference that the current
@@ -497,7 +502,8 @@ parameter value $p_i$ becomes a vector $\vec p_i$ of parameter values:
\label{ndimgradientdescent}
\vec p_{i+1} = \vec p_i - \epsilon \cdot \nabla f_{cost}(\vec p_i)
\end{equation}
-The algorithm proceeds along the negative gradient
+For each parameter we subtract the corresponding derivative multiplied
+by $\epsilon$. The algorithm proceeds along the negative gradient
(\figref{powergradientdescentfig}).
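Sketched in Python under the same assumptions, the update rule becomes a short loop; it reuses the hypothetical gradient() helper from above, and eps, tol and max_steps are placeholder values. The length of the gradient already serves as the termination criterion discussed next:

\begin{verbatim}
import numpy as np

def gradient_descent(f_cost, p0, eps=0.01, tol=1e-4, max_steps=10000):
    p = np.asarray(p0, dtype=float)
    for _ in range(max_steps):
        g = gradient(f_cost, p)       # vector of partial derivatives
        if np.linalg.norm(g) < tol:   # stop once the gradient is short enough
            break
        p = p - eps * g               # step against the gradient
    return p
\end{verbatim}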
For the termination condition we need the length of the gradient. In