A singular value decomposition (SVD) of a real matrix A is its factorization into the product of three matrices:

    A = U Σ Vᵀ,     (1)

where U and V have orthonormal columns and Σ is diagonal with the nonnegative singular values on its diagonal.
There are several ways of computing the SVD. With serial processing, variants of the QR algorithm (used to compute the eigenvalues of a matrix) can be used to compute the SVD. The basic approach involves the following steps (Golub and Van Loan, 1989): form the symmetric matrix AᵀA; apply the symmetric QR algorithm to diagonalize it, Vᵀ(AᵀA)V = diag(σ₁², …, σₙ²); then compute a QR factorization AV = QR. With some manipulations, it can be shown that R is diagonal, so we get Σ = R and U = Q, completing the SVD computation.
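As a rough sketch of this eigenvalue-based approach (assuming NumPy; `svd_via_eig` is an illustrative name, and NumPy's general-purpose `eigh` stands in for the symmetric QR algorithm):

```python
import numpy as np

def svd_via_eig(A):
    """SVD of A by diagonalizing A^T A, then QR-factoring A V.

    A sketch of the eigenvalue-based approach; as noted in the text,
    forming A^T A can lose accuracy in the small singular values.
    """
    # Step 1: diagonalize the symmetric matrix A^T A to get V and the
    # squared singular values (eigh returns ascending order; reverse it).
    eigvals, V = np.linalg.eigh(A.T @ A)
    order = np.argsort(eigvals)[::-1]
    V = V[:, order]
    # Step 2: QR-factorize A V.  Since the columns of A V are mutually
    # orthogonal, R comes out (numerically) diagonal, so Q supplies U
    # and |diag(R)| supplies the singular values.
    Q, R = np.linalg.qr(A @ V)
    sigma = np.abs(np.diag(R))
    # Flip column signs so that U * diag(sigma) = A V with sigma >= 0.
    U = Q * np.sign(np.diag(R))
    return U, sigma, V

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))
U, s, V = svd_via_eig(A)
print(np.allclose((U * s) @ V.T, A))  # True: A is reconstructed
```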
The drawback with this approach is that the formation of AᵀA can lead to a loss of information. A preferable method for computing the SVD is described in Golub and Kahan (1965). The symmetric QR algorithm is implicitly applied to AᵀA (in other words, the product AᵀA is never calculated) in order to find U and V simultaneously.
The approaches described above are very effective on serial processors, but they are not readily amenable to parallel implementation. With parallel processing, the preferred methods are variations of the Jacobi algorithm used to compute the eigenvalues of a symmetric matrix. The classical Jacobi algorithm (Jacobi, 1846) involves pre-multiplication and post-multiplication of A by plane rotations J(p, q, θ) (Jacobi rotations), equal to the identity except for the entries J(p,p) = J(q,q) = cos θ and J(p,q) = −J(q,p) = sin θ; each rotation angle is chosen to annihilate one off-diagonal entry.
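The classical Jacobi iteration for a symmetric matrix can be sketched as follows (assuming NumPy; `jacobi_eigenvalues` is an illustrative name, and production codes use threshold and ordering strategies omitted here):

```python
import numpy as np

def jacobi_eigenvalues(S, sweeps=10):
    """Classical Jacobi iteration for a symmetric matrix S (a sketch)."""
    S = S.astype(float).copy()
    n = S.shape[0]
    for _ in range(sweeps):
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(S[p, q]) < 1e-15:
                    continue
                # Choose the angle so the rotated (p, q) entry vanishes.
                tau = (S[q, q] - S[p, p]) / (2 * S[p, q])
                t = 1.0 if tau == 0 else np.sign(tau) / (abs(tau) + np.hypot(1, tau))
                c = 1 / np.hypot(1, t)
                s = t * c
                # J is the identity except in rows/columns p and q.
                J = np.eye(n)
                J[p, p] = J[q, q] = c
                J[p, q], J[q, p] = s, -s
                S = J.T @ S @ J   # pre- and post-multiplication
    return np.sort(np.diag(S))

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
print(np.allclose(jacobi_eigenvalues(A), np.linalg.eigvalsh(A)))  # True
```

Each rotation touches only rows and columns p and q, which is what makes the algorithm attractive for parallel hardware: disjoint (p, q) pairs can be processed simultaneously.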
A parallel implementation using mesh-connected processors is described in Brent et al. (1985). In order to compute the SVD of an n × n matrix, O(n²) processors are needed. This requirement seriously limits the size of the matrices that can be decomposed on the Connection Machine (CM).
The method that is most suitable for implementation on the CM is described in Brent and Luk (1985). It requires O(n) processors and time O(mnS), where S is the number of sweeps required for convergence (typically S ≤ 10). With this method there is no successive pre-multiplication of the matrix A, so the transformation matrices don't have to move between processors. This approach to the SVD computation is an implementation of Hestenes' method (Hestenes, 1958).
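A minimal sketch of Hestenes' one-sided method, assuming NumPy and full column rank (`hestenes_svd` and its parameters are illustrative names, not Brent and Luk's formulation): rotations are applied on the right only, orthogonalizing the columns of A in pairs, so each rotation touches just two columns and nothing need move between processors.

```python
import numpy as np

def hestenes_svd(A, sweeps=10, tol=1e-12):
    """One-sided Jacobi (Hestenes) SVD sketch; assumes full column rank."""
    U = A.astype(float).copy()
    n = U.shape[1]
    V = np.eye(n)
    for _ in range(sweeps):
        for p in range(n - 1):
            for q in range(p + 1, n):
                # Rotate columns p and q of U so they become orthogonal.
                alpha = U[:, p] @ U[:, p]
                beta = U[:, q] @ U[:, q]
                gamma = U[:, p] @ U[:, q]
                if abs(gamma) < tol * np.sqrt(alpha * beta):
                    continue
                zeta = (beta - alpha) / (2 * gamma)
                t = 1.0 if zeta == 0 else np.sign(zeta) / (abs(zeta) + np.hypot(1, zeta))
                c = 1 / np.hypot(1, t)
                s = t * c
                # Apply the same right-rotation to U and accumulate it in V.
                for M in (U, V):
                    new_p = c * M[:, p] - s * M[:, q]
                    new_q = s * M[:, p] + c * M[:, q]
                    M[:, p], M[:, q] = new_p, new_q
    # At convergence the columns of U are orthogonal with norms sigma_i.
    sigma = np.linalg.norm(U, axis=0)
    U = U / sigma
    return U, sigma, V   # singular values unsorted, as the sweeps leave them

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 4))
U, s, V = hestenes_svd(A)
print(np.allclose((U * s) @ V.T, A))  # True: A = U diag(s) V^T
```

Implicitly this performs the Jacobi eigenvalue iteration on AᵀA (alpha, beta, gamma are exactly the entries (p,p), (q,q), (p,q) of that product) without ever forming it.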