diff --git a/regression/lecture/regression.tex b/regression/lecture/regression.tex
index 83060fd..2d13b96 100644
--- a/regression/lecture/regression.tex
+++ b/regression/lecture/regression.tex
@@ -259,6 +259,7 @@ There is no need to calculate this derivative analytically, because
 it can be approximated numerically by the difference quotient
 (Box~\ref{differentialquotientbox}) for small steps $\Delta c$:
 \begin{equation}
+  \label{costderivativediff}
   \frac{{\rm d} f_{cost}(c)}{{\rm d} c} = \lim\limits_{\Delta c \to 0}
   \frac{f_{cost}(c + \Delta c) - f_{cost}(c)}{\Delta c}
   \approx \frac{f_{cost}(c + \Delta c) - f_{cost}(c)}{\Delta c}
@@ -443,7 +444,7 @@ are searching for the position of the bottom of the deepest valley and
   \[ \frac{\partial f(x,y)}{\partial y} = \lim\limits_{\Delta y \to 0}
   \frac{f(x, y + \Delta y) - f(x,y)}{\Delta y} \]
   one can calculate the slopes in the directions of each of the
-  variables by means of the respective difference quotient
+  variables by means of the respective difference quotients
   (see box~\ref{differentialquotientbox}).
   \vspace{1ex}
   \begin{minipage}[t]{0.5\textwidth}
@@ -489,6 +490,10 @@ $p_j$ the respective partial derivatives as coordinates:
   \label{gradient}
   \nabla f_{cost}(\vec p) = \left( \frac{\partial f_{cost}(\vec p)}{\partial p_j} \right)
 \end{equation}
+Despite the fancy words, this simply means that we calculate the
+derivative for each parameter separately, in the very same way as we
+did it for the case of a single parameter,
+\eqnref{costderivativediff}.

 The iterative equation \eqref{gradientdescent} of the gradient descent
 stays exactly the same, with the only difference that the current
@@ -497,7 +502,8 @@ parameter value $p_i$ becomes a vector $\vec p_i$ of parameter values:
   \label{ndimgradientdescent}
   \vec p_{i+1} = \vec p_i - \epsilon \cdot \nabla f_{cost}(\vec p_i)
 \end{equation}
-The algorithm proceeds along the negative gradient
+For each parameter we subtract the corresponding derivative multiplied
+by $\epsilon$. The algorithm proceeds along the negative gradient
 (\figref{powergradientdescentfig}).

 For the termination condition we need the length of the gradient. In
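
To make the two added paragraphs concrete, here is a minimal sketch in Python of the idea they describe. It is not part of the patch or the lecture's own code; the function names (numerical_gradient, gradient_descent), the parameters (dp, eps, threshold, max_iter), and the straight-line example data are illustrative assumptions. The sketch approximates each partial derivative with the forward difference quotient, exactly as in the single-parameter case, collects them into the gradient, and subtracts $\epsilon$ times the gradient from the parameter vector until the length of the gradient falls below a threshold.

import numpy as np

def numerical_gradient(f_cost, p, dp=1e-7):
    # One difference quotient per parameter, as in the single-parameter
    # case: (f_cost(p + dp*e_j) - f_cost(p)) / dp for each direction e_j.
    grad = np.zeros(len(p))
    for j in range(len(p)):
        step = np.zeros(len(p))
        step[j] = dp
        grad[j] = (f_cost(p + step) - f_cost(p)) / dp
    return grad

def gradient_descent(f_cost, p0, eps=0.01, threshold=1e-4, max_iter=10000):
    # Iterate p_{i+1} = p_i - eps * grad f_cost(p_i) until the gradient's
    # length (Euclidean norm) becomes small.
    p = np.asarray(p0, dtype=float)
    for _ in range(max_iter):
        grad = numerical_gradient(f_cost, p)
        if np.linalg.norm(grad) < threshold:
            break
        p = p - eps * grad   # subtract eps times each partial derivative
    return p

# Illustrative use (hypothetical data): fit a straight line
# y = p[0]*x + p[1] by minimizing the mean squared error.
x = np.linspace(0.0, 10.0, 50)
y = 2.0 * x + 1.0 + np.random.randn(len(x))
f_cost = lambda p: np.mean((p[0] * x + p[1] - y)**2)
print(gradient_descent(f_cost, [0.0, 0.0], eps=0.01))

The loop over j in numerical_gradient is the "for each parameter separately" of the added text; the single vectorized update p - eps * grad is the componentwise subtraction described after \eqnref{ndimgradientdescent}.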