Perfect Code: Running Least Squares

or "Incremental Least Squares"

If you want to find the Least Squares solution to a linear algebra problem ($\textbf{A} \vec{y} = \vec{b}$), you probably would use the formula

$$\hat{y} = (\textbf{A}^\intercal \textbf{A})^{-1} \textbf{A}^\intercal \vec{b},$$

where $\hat{y}$ is the closest possible value to $y$. If you need to do this for many points, the operation could become expensive, both in time and memory. What you instead could do is this:

Imagine you have calculated the approximation for some number of samples. You had $\textbf{A}$, $\vec{b}$ and $\vec{y}$, and calculated $\hat{y}$.

Now you want to add another point to the calculations. Instead of recalculating the transpose, inverse and multiplications, you can add one more row to the formula.

$$\hat{y}' = \left( \begin{bmatrix}\textbf{A} \\ \vec{r}^\intercal\end{bmatrix}^\intercal \begin{bmatrix}\textbf{A} \\ \vec{r}^\intercal\end{bmatrix} \right)^{-1} \begin{bmatrix}\textbf{A} \\ \vec{r}^\intercal\end{bmatrix}^\intercal \begin{bmatrix}\vec{b} \\ z\end{bmatrix}$$ which expands into $$\hat{y}' = \left( \textbf{A}^\intercal \textbf{A} + \vec{r} \vec{r}^\intercal \right)^{-1} \left( \textbf{A}^\intercal \vec{b} + \vec{r} z \right)$$

Here $\vec{r}$ is a vector of the input variables (e.g. $\begin{bmatrix}x \\ y\end {bmatrix}$), and $z$ is the new target.

You might now see where this is going.

All you need to keep in memory, are the matrix $\textbf{M} = \textbf{A}^\intercal \textbf{A}$ and vector $ \vec{v} = \textbf{A}^\intercal \vec{b} $.
Whenever you want to add another point, you increment $\textbf{M}$ with $\vec{r} \vec{r}^\intercal$, and $\vec{v}$ with $\vec{r} z$.
When you want to get the result, you calculate $\hat{y} = \textbf{M}^{-1} \vec{v}$. This operation will require at least two samples.
$\textbf{M}$ and $\vec{v}$ both start off with only zero-entries.

The size of $\textbf{M}$ is $n \times n$, and $\vec{v}$ is $n$, where $n$ is the number of coefficients in the system (the width of the matrix $\textbf{A}$.

5 comments:

Unknown said...: Hi, you did not consider that the direct computation of the inverse of (A^TA)^{-1} is very expensive. Usually QR factorization is used to reduce the computational cost. If QR is used, how to incrementally compute the least square solution?; 19 July, 2012 19:39
Markus Jarderot said...: A^TA is just a N x N matrix, where N is the number of variables. Usually, the number of variables are small.

If you are using a large number of variables (or if you consider 4 x 4 large), I don't think there is any easy way to a "running" QR factorization.; 19 July, 2012 20:08
Unknown said...: I'm surprised you didn't invoke Sherman-Morrison to get your incremental M^-1; 16 July, 2015 23:03
Anonymous said...: This comment has been removed by a blog administrator.; 29 March, 2022 09:18
NonPurpleStar said...: This comment has been removed by the author.; 13 May, 2022 16:56

Tuesday, May 22, 2012

Running Least Squares

5 comments: