diff --git a/regression/lecture/regression.tex b/regression/lecture/regression.tex
index 83060fd..2d13b96 100644
--- a/regression/lecture/regression.tex
+++ b/regression/lecture/regression.tex
@@ -259,6 +259,7 @@ There is no need to calculate this derivative analytically, because
 it can be approximated numerically by the difference quotient
 (Box~\ref{differentialquotientbox}) for small steps $\Delta c$:
 \begin{equation}
+  \label{costderivativediff}
   \frac{{\rm d} f_{cost}(c)}{{\rm d} c} = \lim\limits_{\Delta c \to 0}
   \frac{f_{cost}(c + \Delta c) - f_{cost}(c)}{\Delta c}
   \approx \frac{f_{cost}(c + \Delta c) - f_{cost}(c)}{\Delta c}
@@ -443,7 +444,7 @@ are searching for the position of the bottom of the deepest valley and
   \[ \frac{\partial f(x,y)}{\partial y} = \lim\limits_{\Delta y \to 0}
   \frac{f(x, y + \Delta y) - f(x,y)}{\Delta y} \]
   one can calculate the slopes in the directions of each of the
-  variables by means of the respective difference quotient
+  variables by means of the respective difference quotients
   (see box~\ref{differentialquotientbox}).
   \vspace{1ex}
   \begin{minipage}[t]{0.5\textwidth}
@@ -489,6 +490,10 @@ $p_j$ the respective partial derivatives as coordinates:
   \label{gradient}
   \nabla f_{cost}(\vec p) = \left( \frac{\partial f_{cost}(\vec p)}{\partial p_j} \right)
 \end{equation}
+Despite the fancy words, this simply means that we calculate the
+derivative for each parameter separately, in the very same way as we
+did it for the case of a single parameter,
+\eqnref{costderivativediff}.

 The iterative equation \eqref{gradientdescent} of the gradient descent
 stays exactly the same, with the only difference that the current
@@ -497,7 +502,8 @@ parameter value $p_i$ becomes a vector $\vec p_i$ of parameter values:
   \label{ndimgradientdescent}
   \vec p_{i+1} = \vec p_i - \epsilon \cdot \nabla f_{cost}(\vec p_i)
 \end{equation}
-The algorithm proceeds along the negative gradient
+For each parameter we subtract the corresponding derivative multiplied
+by $\epsilon$. The algorithm proceeds along the negative gradient
 (\figref{powergradientdescentfig}).

 For the termination condition we need the length of the gradient. In
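
To make the two added paragraphs concrete, here is a minimal sketch in Python of the idea they describe. It is not part of the patch or the lecture's own code; the function names (numerical_gradient, gradient_descent), the parameters (dp, eps, threshold, max_iter), and the straight-line example data are illustrative assumptions. The sketch approximates each partial derivative with the forward difference quotient, exactly as in the single-parameter case, collects them into the gradient, and subtracts $\epsilon$ times the gradient from the parameter vector until the length of the gradient falls below a threshold.

import numpy as np

def numerical_gradient(f_cost, p, dp=1e-7):
    # One difference quotient per parameter, as in the single-parameter
    # case: (f_cost(p + dp*e_j) - f_cost(p)) / dp for each direction e_j.
    grad = np.zeros(len(p))
    for j in range(len(p)):
        step = np.zeros(len(p))
        step[j] = dp
        grad[j] = (f_cost(p + step) - f_cost(p)) / dp
    return grad

def gradient_descent(f_cost, p0, eps=0.01, threshold=1e-4, max_iter=10000):
    # Iterate p_{i+1} = p_i - eps * grad f_cost(p_i) until the gradient's
    # length (Euclidean norm) becomes small.
    p = np.asarray(p0, dtype=float)
    for _ in range(max_iter):
        grad = numerical_gradient(f_cost, p)
        if np.linalg.norm(grad) < threshold:
            break
        p = p - eps * grad   # subtract eps times each partial derivative
    return p

# Illustrative use (hypothetical data): fit a straight line
# y = p[0]*x + p[1] by minimizing the mean squared error.
x = np.linspace(0.0, 10.0, 50)
y = 2.0 * x + 1.0 + np.random.randn(len(x))
f_cost = lambda p: np.mean((p[0] * x + p[1] - y)**2)
print(gradient_descent(f_cost, [0.0, 0.0], eps=0.01))

The loop over j in numerical_gradient is the "for each parameter separately" of the added text; the single vectorized update p - eps * grad is the componentwise subtraction described after \eqnref{ndimgradientdescent}.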