A Faster Algorithm for Repeated Linear Regression
Repeated linear regression means fitting a linear regression many times, where the regressions share some common components.
Background
Assume that we have a set of independent variables, $x_1, x_2,\ldots, x_n$, and a response variable $y$. In addition to these variables, there are four covariates, $cov_1, cov_2, cov_3, cov_4$. We want to fit the following linear regressions:
\[\begin{array}{ll} y&\sim x_1+x_2+x_1x_2 + cov_1 + cov_2+cov_3+cov_4\\ y&\sim x_1+x_3+x_1x_3 + cov_1 +cov_2+cov_3+cov_4\\ &\;\;\vdots\\ y&\sim x_1+x_n+x_1x_n + cov_1 +cov_2+cov_3+cov_4 \end{array}\]
Naive Method
Denote $X_{1i}=(1, x_1, x_i, x_1x_i, cov_1, cov_2, cov_3, cov_4)$, $i=2,\ldots,n$. We repeat the linear regression $n-1$ times:
\[\hat\beta_{1i}=(X_{1i}'X_{1i})^{-1}X_{1i}'Y\]
The main cost of this estimation is the matrix inverse, whose complexity is $O(N^3)$ in the matrix dimension $N$. With this method, we need to repeat an $8\times 8$ matrix inverse $n-1$ times.
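For concreteness, here is a minimal NumPy sketch of the naive approach; the data layout and the function name `naive_fit` are assumptions for illustration, not the actual project code.

```python
import numpy as np

def naive_fit(y, x1, xs, covs):
    """Naive approach: fit each regression separately, inverting the full
    8x8 matrix X'X every time."""
    N = len(y)
    betas = []
    for i in range(xs.shape[1]):
        xi = xs[:, i]
        # design matrix with columns (1, x1, x_i, x1*x_i, cov1, ..., cov4)
        X = np.column_stack([np.ones(N), x1, xi, x1 * xi, covs])
        betas.append(np.linalg.inv(X.T @ X) @ (X.T @ y))   # 8x8 inverse for every i
    return np.array(betas)
```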
Improved Method
Woodbury formula:
\[(A+UCV)^{-1}=A^{-1}-A^{-1}U(C^{-1}+VA^{-1}U)^{-1}VA^{-1}\]
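The identity is easy to check numerically; here is a small sketch with randomly generated, well-conditioned matrices (sizes chosen arbitrarily).

```python
import numpy as np

# numerical check of the Woodbury identity on random, well-conditioned matrices
rng = np.random.default_rng(0)
n, k = 8, 4
A = rng.normal(size=(n, n)); A = A @ A.T + np.eye(n)   # symmetric positive definite
C = rng.normal(size=(k, k)); C = C @ C.T + np.eye(k)
U = rng.normal(size=(n, k))
V = U.T                                                 # keeps A + UCV positive definite

A_inv, C_inv = np.linalg.inv(A), np.linalg.inv(C)
lhs = np.linalg.inv(A + U @ C @ V)
rhs = A_inv - A_inv @ U @ np.linalg.inv(C_inv + V @ A_inv @ U) @ V @ A_inv
print(np.allclose(lhs, rhs))   # True
```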
Denote
\[X_{1i}=(x_i, x_1x_i, 1, x_1, cov_1,cov_2,cov_3,cov_4) = (\mathbf a_{1i}, \mathbf b)\]
where $\mathbf a_{1i}=(x_i, x_1x_i)$ and $\mathbf b=(1, x_1, cov_1, cov_2, cov_3, cov_4)$. Notice that, compared to the naive method, we only reorder the columns of $X_{1i}$; this does not affect the estimation of $\beta_{1i}$ (the components simply appear in a different order).
Then we have
\[X_{1i}'X_{1i}=\left[ \begin{array}{ll} \mathbf a_{1i}'\mathbf a_{1i} & \mathbf a_{1i}'\mathbf b\\ \mathbf b'\mathbf a_{1i} &\mathbf{b'b} \end{array} \right]= \left[ \begin{array}{ll} A_{1i}& V_{1i}'\\ V_{1i}& B \end{array} \right]\]
Let
\[A=\left[ \begin{array}{ll} A_{1i}&O\\ O&B \end{array}\right],\; U=\left[ \begin{array}{ll} I_{2\times 2} & O_{2\times 2}\\ O_{6\times 2} & V_{1i} \end{array} \right],\; V= \left[ \begin{array}{ll} O_{2\times 2}& V_{1i}'\\ I_{2\times 2}& O_{2\times 6} \end{array} \right]; \; i=2,\ldots, n\]
and $C=I_{4\times 4}$. Then we can apply the Woodbury formula:
\[(X_{1i}'X_{1i})^{-1}=(A+UCV)^{-1}=A^{-1}-A^{-1}U(C^{-1}+VA^{-1}U)^{-1}VA^{-1}\]where
\[A^{-1}=\left[ \begin{array}{ll} A_{1i}^{-1}&O\\ O&B^{-1} \end{array} \right]\]
We can carry the calculation further to simplify the inverse.
\[A^{-1}U = \left[ \begin{array}{ll} A_{1i}^{-1}&O\\ O&B^{-1}V_{1i} \end{array} \right]\]
\[VA^{-1} = \left[ \begin{array}{cc} O&V_{1i}'B^{-1}\\ A_{1i}^{-1}&O \end{array} \right]\]
\[VA^{-1}U= \left[ \begin{array}{cc} O&V_{1i}'B^{-1}V_{1i}\\ A_{1i}^{-1}&O \end{array} \right]\]
\[\begin{array}{ll} (C^{-1}+VA^{-1}U)^{-1}&\\ =\left[ \begin{array}{cc} I&V_{1i}'B^{-1}V_{1i}\\ A_{1i}^{-1} &I \end{array} \right]^{-1}&\\ =\left[ \begin{array}{cc} (I-V_{1i}'B^{-1}V_{1i}A_{1i}^{-1})^{-1}&-V_{1i}'B^{-1}V_{1i}(I-A_{1i}^{-1}V_{1i}'B^{-1}V_{1i})^{-1}\\ -A_{1i}^{-1}(I-V_{1i}'B^{-1}V_{1i}A_{1i}^{-1})^{-1}& (I-A_{1i}^{-1}V_{1i}'B^{-1}V_{1i})^{-1} \end{array} \right]&\\ =\left[ \begin{array}{cc} D & -B_1D'\\ -A_{1i}^{-1}D&D' \end{array} \right] \end{array}\]
where $B_1 = V_{1i}'B^{-1}V_{1i}$ and $D=(I-B_1A_{1i}^{-1})^{-1}$. Here $D'=(I-A_{1i}^{-1}B_1)^{-1}$, which equals the transpose of $D$ because both $A_{1i}^{-1}$ and $B_1$ are symmetric.
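As a sanity check, both the block construction of $A$, $U$, $C$, $V$ and the closed form above can be verified numerically on synthetic data; the sketch below mirrors the $2+6$ column split used here, with random stand-ins for the real variables.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 200
a = rng.normal(size=(N, 2))              # stands in for (x_i, x1*x_i)
b = rng.normal(size=(N, 6))              # stands in for (1, x1, cov1, ..., cov4)
X = np.hstack([a, b])

A1, B, V1 = a.T @ a, b.T @ b, b.T @ a    # A_{1i}, B, V_{1i}

# the decomposition A + UCV recovers X'X
A_blk = np.block([[A1, np.zeros((2, 6))], [np.zeros((6, 2)), B]])
U = np.block([[np.eye(2), np.zeros((2, 2))], [np.zeros((6, 2)), V1]])
V = np.block([[np.zeros((2, 2)), V1.T], [np.eye(2), np.zeros((2, 6))]])
C = np.eye(4)
print(np.allclose(A_blk + U @ C @ V, X.T @ X))           # True

# closed form of (C^{-1} + V A^{-1} U)^{-1} in terms of D and D' = D.T
A1_inv, B_inv = np.linalg.inv(A1), np.linalg.inv(B)
B1 = V1.T @ B_inv @ V1
D = np.linalg.inv(np.eye(2) - B1 @ A1_inv)
M = np.eye(4) + V @ np.linalg.inv(A_blk) @ U             # C^{-1} = I here
M_closed = np.block([[D, -B1 @ D.T], [-A1_inv @ D, D.T]])
print(np.allclose(np.linalg.inv(M), M_closed))           # True
```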
So
\[(X_{1i}'X_{1i})^{-1}=\left[ \begin{array}{cc} A_{1i}^{-1}+A_{1i}^{-1}B_1D'A_{1i}^{-1}&-A_{1i}^{-1}DV_{1i}'B^{-1}\\ -B^{-1}V_{1i}D'A_{1i}^{-1}&B^{-1}+B^{-1}V_{1i}A_{1i}^{-1}DV_{1i}'B^{-1} \end{array} \right]\]
Notice that $B$ is the same for all of these regressions, so we only need to compute its inverse once. Moreover, $A_{1i}$ is only $2\times 2$, so for each regression we only invert $2\times 2$ matrices instead of an $8\times 8$ one.
This process reduces the dimension of the matrix inverses significantly, which should speed up the program in theory, although it requires more matrix multiplications.
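Putting the pieces together, here is a minimal NumPy sketch of the reduced-dimension estimator; it also carries out the $X_{1i}'Y$ step described below. The data layout and the name `reduced_fit` are illustrative assumptions, not the actual epi implementation.

```python
import numpy as np

def reduced_fit(y, x1, xs, covs):
    """Reduced-dimension fit of y ~ x1 + x_i + x1*x_i + covariates for every
    column x_i of xs, reusing the inverse of the shared block B."""
    N = len(y)
    b = np.column_stack([np.ones(N), x1, covs])     # shared block (1, x1, cov1..cov4)
    B_inv = np.linalg.inv(b.T @ b)                  # 6x6, computed only once
    bY = b.T @ y
    I2 = np.eye(2)
    betas = []
    for i in range(xs.shape[1]):
        xi = xs[:, i]
        a = np.column_stack([xi, x1 * xi])          # varying block (x_i, x1*x_i)
        A1, V1 = a.T @ a, b.T @ a                   # A_{1i} (2x2), V_{1i} (6x2)
        A_inv = np.linalg.inv(A1)                   # only 2x2 inverses per regression
        B1 = V1.T @ B_inv @ V1
        D = np.linalg.inv(I2 - B1 @ A_inv)
        Dp = np.linalg.inv(I2 - A_inv @ B1)         # D', equals D.T (symmetric blocks)
        XtX_inv = np.block([
            [A_inv + A_inv @ B1 @ Dp @ A_inv, -A_inv @ D @ V1.T @ B_inv],
            [-B_inv @ V1 @ Dp @ A_inv, B_inv + B_inv @ V1 @ A_inv @ D @ V1.T @ B_inv],
        ])
        XtY = np.concatenate([a.T @ y, bY])         # X_{1i}'Y in the reordered basis
        betas.append(XtX_inv @ XtY)
    return np.array(betas)
```

The returned coefficients are ordered as $(x_i, x_1x_i, 1, x_1, cov_1,\ldots,cov_4)$, matching the reordered $X_{1i}$, and can be checked against a direct least-squares fit on the full design matrix.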
Finally, compute $X_{1i}'Y$ to complete the estimation of $\beta_{1i}$.
\[X_{1i}'Y = (\mathbf a_{1i}, \mathbf b)' Y = \left[\begin{array}{c}\mathbf a_{1i}'Y\\\mathbf b'Y\end{array}\right]\]Then
\[\hat\beta_{1i} =(X_{1i}'X_{1i})^{-1}X_{1i}'Y\]
Simulation
We apply this algorithm to the project epi and compare the two methods. As the results show, the reduced-dimension method speeds up the program significantly.
| size_i | size_j | method  | parallel | time (s) |
|--------|--------|---------|----------|----------|
| 50     | 5000   | reduced | 4        | 2.813    |
| 50     | 5000   | reduced | 1        | 9.440    |
| 50     | 5000   | naive   | 4        | 14.729   |
| 50     | 5000   | naive   | 1        | 53.507   |
| 500    | 5000   | reduced | 4        | 25.250   |
| 500    | 5000   | reduced | 1        | 83.836   |
| 500    | 5000   | naive   | 4        | 141.837  |
| 500    | 5000   | naive   | 1        | 493.661  |
| 5000   | 5000   | reduced | 4        | 128.087  |
| 5000   | 5000   | reduced | 1        | 438.277  |
| 5000   | 5000   | naive   | 4        | *        |
| 5000   | 5000   | naive   | 1        | *        |
The above cases were tested on my laptop, with an "Intel® Core™ i5-6300HQ CPU @ 2.30GHz × 4" processor and 7.7 GB of memory. The two programs were then run on a server with 20 "Intel(R) Xeon(R) CPU E5-2640 v4 @ 2.40GHz" processors.
| size_i | size_j | method  | parallel | time (s) |
|--------|--------|---------|----------|----------|
| 5000   | 5000   | reduced | 20       | 22.622   |
| 5000   | 5000   | naive   | 20       | 75.422   |
Reference
- https://en.wikipedia.org/wiki/Woodbury_matrix_identity