[regression] n-dim is simply n times 1 dim
commit 5a6cca59d3 (parent 60a94c9ce6)
@@ -259,6 +259,7 @@ There is no need to calculate this derivative analytically, because it
can be approximated numerically by the difference quotient
(Box~\ref{differentialquotientbox}) for small steps $\Delta c$:
\begin{equation}
\label{costderivativediff}
\frac{{\rm d} f_{cost}(c)}{{\rm d} c} =
\lim\limits_{\Delta c \to 0} \frac{f_{cost}(c + \Delta c) - f_{cost}(c)}{\Delta c}
\approx \frac{f_{cost}(c + \Delta c) - f_{cost}(c)}{\Delta c}
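In code this difference quotient is just a forward difference with a small step. The following is a minimal Python sketch; the example cost function, its parameter value, and the step size are made up for illustration and are not taken from the text:
\begin{verbatim}
# Approximate the derivative of a 1-d cost function by the
# difference quotient (f(c + dc) - f(c)) / dc for a small step dc.

def f_cost(c):
    # hypothetical cost function of a single parameter c
    return (c - 2.0)**2

def cost_derivative(c, dc=1e-6):
    return (f_cost(c + dc) - f_cost(c)) / dc

print(cost_derivative(1.0))  # close to the analytical value 2*(1 - 2) = -2
\end{verbatim}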
@@ -443,7 +444,7 @@ are searching for the position of the bottom of the deepest valley
and
\[ \frac{\partial f(x,y)}{\partial y} = \lim\limits_{\Delta y \to 0} \frac{f(x, y + \Delta y) - f(x,y)}{\Delta y} \]
one can calculate the slopes in the directions of each of the
variables by means of the respective difference quotients
(see box~\ref{differentialquotientbox}). \vspace{1ex}
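A short Python sketch of these two partial derivatives, each approximated by its own difference quotient while the other variable is held fixed; the example function is hypothetical and not from the text:
\begin{verbatim}
# Approximate both partial derivatives of f(x, y) by difference
# quotients, varying one variable at a time.

def f(x, y):
    # hypothetical example function
    return x**2 + 3.0*y

def df_dx(x, y, dx=1e-6):
    return (f(x + dx, y) - f(x, y)) / dx

def df_dy(x, y, dy=1e-6):
    return (f(x, y + dy) - f(x, y)) / dy

print(df_dx(1.0, 2.0))  # about 2.0, i.e. 2*x
print(df_dy(1.0, 2.0))  # about 3.0
\end{verbatim}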

\begin{minipage}[t]{0.5\textwidth}
@@ -489,6 +490,10 @@ $p_j$ the respective partial derivatives as coordinates:
\label{gradient}
\nabla f_{cost}(\vec p) = \left( \frac{\partial f_{cost}(\vec p)}{\partial p_j} \right)
\end{equation}
Despite the fancy words, this simply means that we need to calculate the
derivatives in the very same way as we did for the case of a
single parameter, \eqnref{costderivativediff}, for each parameter
separately.
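For example, the gradient of equation \eqref{gradient} can be approximated by looping over the parameters and applying the one-dimensional difference quotient to each of them in turn. A sketch in Python with a made-up two-parameter cost function; this is illustrative code, not code from the text:
\begin{verbatim}
import numpy as np

def f_cost(p):
    # hypothetical cost function of a parameter vector p
    return np.sum((p - np.array([1.0, 3.0]))**2)

def gradient(p, dp=1e-6):
    # one difference quotient per parameter, exactly as in the 1-d case
    grad = np.zeros(len(p))
    for j in range(len(p)):
        pj = p.copy()
        pj[j] += dp
        grad[j] = (f_cost(pj) - f_cost(p)) / dp
    return grad

print(gradient(np.array([0.0, 0.0])))  # roughly [-2.0, -6.0]
\end{verbatim}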

The iterative equation \eqref{gradientdescent} of the gradient descent
stays exactly the same, with the only difference that the current
@@ -497,7 +502,8 @@ parameter value $p_i$ becomes a vector $\vec p_i$ of parameter values:
\label{ndimgradientdescent}
\vec p_{i+1} = \vec p_i - \epsilon \cdot \nabla f_{cost}(\vec p_i)
\end{equation}
For each parameter we subtract the corresponding derivative multiplied
by $\epsilon$. The algorithm proceeds along the negative gradient
(\figref{powergradientdescentfig}).
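The update of equation \eqref{ndimgradientdescent} is a single vectorized line per iteration. A minimal Python sketch, using a made-up quadratic cost function whose gradient is known analytically and an arbitrary threshold on the length of the gradient as termination condition:
\begin{verbatim}
import numpy as np

def gradient_descent(grad, p0, eps=0.1, thresh=1e-4, max_iter=1000):
    # p_{i+1} = p_i - eps * gradient(p_i), stopped when the gradient is short
    p = np.asarray(p0, dtype=float)
    for _ in range(max_iter):
        g = grad(p)
        if np.linalg.norm(g) < thresh:
            break
        p = p - eps*g
    return p

# quadratic bowl with minimum at (1, 3): its gradient is 2*(p - minimum)
print(gradient_descent(lambda p: 2.0*(p - np.array([1.0, 3.0])), [0.0, 0.0]))
\end{verbatim}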

For the termination condition we need the length of the gradient. In