updated regression exercise.

Jan Benda 2018-01-09 12:49:54 +01:00
parent 9c0559ab03
commit 207bcbdcb9


@@ -2,6 +2,7 @@
 \usepackage[german]{babel}
 \usepackage{natbib}
+\usepackage{xcolor}
 \usepackage{graphicx}
 \usepackage[small]{caption}
 \usepackage{sidecap}
@@ -61,28 +62,30 @@
 of a straigth line that we want to fit to the data in the file
 \emph{lin\_regression.mat}.
-In the lecture we already prepared the necessary functions: 1. the
-error function (\code{meanSquareError()}), 2. the cost function
-(\code{lsqError()}), and 3. the gradient (\code{lsqGradient()}).
+In the lecture we already prepared most of the necessary functions:
+1. the error function (\code{meanSquareError()}), 2. the cost
+function (\code{lsqError()}), and 3. the gradient
+(\code{lsqGradient()}). Read chapter 8 ``Optimization and gradient
+descent'' in the script, in particular section 8.4 and exercise 8.4!
 The algorithm for the descent towards the minimum of the cost
 function is as follows:
 \begin{enumerate}
-\item Start with some arbitrary parameter values $p_0 = (m_0, b_0)$
+\item Start with some arbitrary parameter values $\vec p_0 = (m_0, b_0)$
   for the slope and the intercept of the straight line.
 \item \label{computegradient} Compute the gradient of the cost function
-  at the current values of the parameters $p_i$.
+  at the current values of the parameters $\vec p_i$.
 \item If the magnitude (length) of the gradient is smaller than some
   small number, the algorithm converged close to the minimum of the
   cost function and we abort the descent. Right at the minimum the
   magnitude of the gradient is zero. However, since we determine
   the gradient numerically, it will never be exactly zero. This is
-  why we require the gradient to be sufficiently small
+  why we just require the gradient to be sufficiently small
   (e.g. \code{norm(gradient) < 0.1}).
 \item \label{gradientstep} Move against the gradient by a small step
   ($\epsilon = 0.01$):
-  \[p_{i+1} = p_i - \epsilon \cdot \nabla f_{cost}(m_i, b_i)\]
+  \[\vec p_{i+1} = \vec p_i - \epsilon \cdot \nabla f_{cost}(m_i, b_i)\]
 \item Repeat steps \ref{computegradient} -- \ref{gradientstep}.
 \end{enumerate}
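
For orientation, the descent described in the enumerated steps above is only a few lines of MATLAB. The following is a minimal sketch, assuming that lin_regression.mat provides row vectors x and y and that the lecture's lsqGradient() is called as lsqGradient(x, y, p) with the parameter vector p = [m, b]; the exact signatures and variable names are assumptions, not part of this commit:

    % minimal gradient-descent sketch (assumed lsqGradient(x, y, p) interface)
    load('lin_regression.mat');          % assumed to provide the data x and y
    p = [-2.0, 10.0];                    % step 1: arbitrary start for slope and intercept
    epsilon = 0.01;                      % step size
    gradient = lsqGradient(x, y, p);     % step 2: gradient at the current parameters
    while norm(gradient) > 0.1           % step 3: stop once the gradient is small enough
        p = p - epsilon * gradient;      % step 4: small step against the gradient
        gradient = lsqGradient(x, y, p); % repeat steps 2-4
    end

Plotting the data with plot(x, y, '.') together with the line p(1)*x + p(2) is a quick sanity check that the loop has converged to a sensible fit.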