arpack-ng  

Issue 1259: Invalid results on OSX 10.8/10.7 with Accelerate

Reported by pv, Jul 1, 2013

Arpack NG 3.1.3 produces invalid results on OSX 10.8 (and 10.7)
when linked with Apple's Accelerate/vecLib BLAS/LAPACK combination.

A small test program is here: https://gist.github.com/pv/5873146
It fails only on OSX 10.8/7 and only when linked with Accelerate.

What is not clear in this case is whether this is an ARPACK bug,
e.g. some numerical instability issue, or whether it is a bug in 
Accelerate. That the bug doesn't appear on Intel MKL or any other
LAPACK/BLAS combination points towards the latter. A bug report has 
also been filed with Apple, but in case it's an ARPACK issue I'll 
also file it here.

Comment 1 by Michael Wimmer, Jul 22, 2013

I'd like to add that this is not only a Mac-specific problem. It 
turns out that I can also reproduce this error on a Linux machine 
when linking with Lapack version 3.2.1. For versions 3.2.2 and 
higher everything seems to work.

The problem in fact arises in DSEUPD (the output of DSAUPD seemed to 
be identical for all Lapack versions). I suspect that the cause is 
the Lapack routine DGEQR2: at least, this routine, which is used by 
DSEUPD, was changed from 3.2.1 to 3.2.2: see bug0051 in 
http://www.netlib.org/lapack/lapack-3.2.2.html (however the bug 
report is about DGEES which used DGEQR2; the Lapack authors claimed 
that DGEQR2 in principle did work right, just some "-" 
sign conventions changed).

It's not clear to me if the bug is due to ARPACK or LAPACK, but that 
information might help nail it down.

Comment 2 by pv, Jul 22, 2013

Some time ago, I checked the QR decomposition returned by DGEQR2 for 
the matrix used in DSEUPD, and while it was different with 
Accelerate (the second Householder vector looks a bit funky), it 
still seemed to be a valid QR decomposition.
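As an aside, that kind of validity check can be reproduced with SciPy's 
LAPACK bindings. The following is only an illustrative sketch (it uses 
DGEQRF, the blocked variant of DGEQR2, and a random orthogonal matrix 
rather than the actual DSEUPD matrix): whatever sign convention a given 
LAPACK chooses for the Householder vectors, rebuilding the orthogonal 
factor must reproduce the input.

```python
import numpy as np
from scipy.linalg import lapack, qr

rng = np.random.default_rng(0)
# A random orthogonal matrix as a stand-in for the matrix inside DSEUPD.
Q, _ = qr(rng.standard_normal((6, 6)))

# DGEQRF (like DGEQR2) stores Householder vectors below the diagonal
# and R on and above it.
qr_raw, tau, _, info = lapack.dgeqrf(Q)
assert info == 0

# Rebuild the orthogonal factor from the Householder representation and
# check that the factorization reproduces Q, independent of conventions.
Qfac, _, info = lapack.dorgqr(qr_raw, tau)
assert info == 0
R = np.triu(qr_raw)
assert np.allclose(Qfac @ R, Q, atol=1e-10)
```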

Comment 3 by Michael Wimmer, Jul 22, 2013

That QR decomposition is interesting! In fact, I now start to 
believe that this is an ARPACK bug:

If I remove the "eigenvector purification step", I get the 
correct result in all cases.
 
Provided that I read the code correctly, for the "purification 
step" the last entries in the Householder vectors are used 
(stored in the last row of the ncv*ncv matrix stored in workl(iq)). 
This is fishy by itself, as the Householder vectors without the 
corresponding multipliers (stored in workl(iw+ncv)) do not mean 
anything.

In particular, what confuses me completely is that dseupd performs a 
QR decomposition of a matrix that is in my opinion always orthogonal 
to begin with: it is the eigenvector matrix of a symmetric 
tridiagonal matrix (computed by dsteqr) modulo some column 
interchanges. Hence, that matrix should always be orthogonal ...
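That orthogonality claim is easy to check numerically; here is a small 
Python/SciPy sketch (eigh_tridiagonal standing in for the DSTEQR call, 
with a random tridiagonal matrix rather than the actual T):

```python
import numpy as np
from scipy.linalg import eigh_tridiagonal

rng = np.random.default_rng(1)
ncv = 8
# A symmetric tridiagonal matrix, like the T produced by DSAUPD.
d = rng.standard_normal(ncv)       # diagonal
e = rng.standard_normal(ncv - 1)   # off-diagonal
theta, Q = eigh_tridiagonal(d, e)  # analogue of the DSTEQR call

# The eigenvector matrix of a symmetric tridiagonal matrix is orthogonal
# to machine precision, so the input handed to DGEQR2 is already
# orthogonal (modulo column interchanges from sorting).
assert np.allclose(Q.T @ Q, np.eye(ncv), atol=1e-10)
```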

One sees this for the particular example in the qrcheck.f90 program 
uploaded by pv. The matrix to be orthogonalized is already 
orthogonal. This corresponds to some tau's (Householder multipliers) 
being either -1 (for lapack versions > 3.2.1) or 0 (for lapack 
version 3.2.1). 

For tau=-1, the last entry of the Householder vector is always 0 if 
I'm right, and so the mentioned "purification step" 
doesn't do anything. If tau=0, the last entry can be anything (which 
happens in the current example to be a large number), and 
consequently, the purification step makes a large error.

Now, that analysis may very well be incomplete: I neither fully 
understand the algorithm nor the code ...

Comment 4 by Michael Wimmer, Jul 25, 2013

I think I do have an understanding of the problem now. It clearly is 
an ARPACK bug which potentially could have more consequences (or 
maybe not, see below).

I will document my thoughts in detail below (most of it is just 
what's written in the user guide) to have it all in one place. 
ARPACK is pretty complicated stuff, and so I'd like to be rather 
explicit.

According to the Arpack user guide, DSEUPD does the following (with 
regard to the eigenvectors):

1. It receives the input of DSAUPD, which found an Arnoldi
   factorization of the form A V = V T + resid e_NCV^T 
   where A is the original matrix (or, in general, the 
   linear operator to be diagonalized, e.g.  in 
   shift-and-invert mode), V an orthogonal
   matrix, T a tridiagonal matrix, resid a vector and 
   e_NCV the NCVth unit vector.
2. DSEUPD diagonalizes T (with eigenvalues theta); its 
   eigenvectors form the orthogonal matrix Q = 
   (q_1|...|q_NCV) (with q_i the columns of Q)
3. The eigenvectors of A are found (after possibly 
   sorting the eigenvalues) as the first few columns of
   V Q.
4. Finally, the "purification step" consists of applying 
   A once to the eigenvector x_i = V q_i and then 
   normalizing. This can be done without applying A 
   directly, as A x_i = theta_i x_i + resid e_NCV^T q_i 
   (from the Arnoldi factorization) [User guide 4.5.2]
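The four steps above can be verified with a bare-bones Lanczos 
iteration. This is only an illustrative Python sketch (not ARPACK 
itself, and without any restarting or reorthogonalization); it checks 
the factorization and the purification identity from step 4.

```python
import numpy as np

rng = np.random.default_rng(2)
n, ncv = 20, 6
A = rng.standard_normal((n, n))
A = (A + A.T) / 2  # symmetric operator

# Plain Lanczos iteration: builds A V = V T + resid e_ncv^T.
V = np.zeros((n, ncv))
alpha = np.zeros(ncv)
beta = np.zeros(ncv)
v0 = rng.standard_normal(n)
V[:, 0] = v0 / np.linalg.norm(v0)
for j in range(ncv):
    w = A @ V[:, j] - (beta[j - 1] * V[:, j - 1] if j > 0 else 0)
    alpha[j] = V[:, j] @ w
    w -= alpha[j] * V[:, j]
    beta[j] = np.linalg.norm(w)
    if j + 1 < ncv:
        V[:, j + 1] = w / beta[j]
resid = w  # residual vector of the factorization

T = np.diag(alpha) + np.diag(beta[:-1], 1) + np.diag(beta[:-1], -1)
e_last = np.zeros(ncv)
e_last[-1] = 1.0

# Step 1: the factorization holds by construction of the recurrence.
assert np.allclose(A @ V, V @ T + np.outer(resid, e_last), atol=1e-8)

# Steps 2-4: diagonalize T and check the purification identity
# A x_i = theta_i x_i + (e_ncv^T q_i) resid, which follows from
# multiplying the factorization by q_i from the right.
theta, Q = np.linalg.eigh(T)
for i in range(ncv):
    x = V @ Q[:, i]
    assert np.allclose(A @ x, theta[i] * x + Q[-1, i] * resid, atol=1e-8)
```

Note that only the last row of Q enters the correction term, which is 
why DSEUPD needs exactly that row in step 4.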

This algorithm is implemented in DSEUPD as follows:

1. The tridiagonal matrix T is stored in workl(ihd) and 
   workl(ihb) (diagonal and subdiagonal)
2. The tridiagonal matrix T is diagonalized by the LAPACK 
   routine DSTEQR; the NCV x NCV orthogonal matrix Q is 
   stored in workl(iq)
3. The product V Q is formed as follows: First, a QR 
   decomposition of Q is computed by DGEQR2; since Q is 
   already orthogonal to machine precision, this 
   corresponds to replacing Q by a representation of  
   Householder reflections. According to the user guide, 
   this has the advantage that the multiplication 
   (application of the Householder reflections) can be 
   done in place. (This step confused me at first, but it 
   makes sense if you assume that memory is the main 
   problem; otherwise it's a pretty expensive way 
   to do a matrix multiplication.) The Householder vectors 
   are stored in the lower triangle of 
   workl(iq), *overwriting the  previously stored matrix 
   Q*. The multipliers (tau) are stored in workl(iw+ncv).
4. Finally, note that the purification step involves only 
   the last row of the matrix Q (because of the unit 
   vector e_NCV). The intention of ARPACK is apparently 
   to apply this step as an update by an outer
   product (using DGER) to the eigenvectors. For this, 
   the last row of the matrix stored in workl(iq) is 
   used: workl(iq+k*ldq+ncv-1), and a division by theta_i 
   is done (note that in principle one should 
   really properly normalize here; the division by theta 
   is only an approximation). This *would* be the 
   correct thing *if* workl(iq) still held the 
   matrix Q. However, it has been overwritten in step 3 
   by the Householder vectors.

   This, I believe, is the bug that causes the problem.
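The overwrite in step 3 can be reproduced outside ARPACK. A small 
Python/SciPy sketch (DGEQRF standing in for DGEQR2, a random orthogonal 
matrix standing in for the contents of workl(iq), and assuming a recent 
LAPACK):

```python
import numpy as np
from scipy.linalg import lapack, qr

rng = np.random.default_rng(3)
ncv = 6
# A stand-in for the orthogonal matrix Q held in workl(iq).
Q, _ = qr(rng.standard_normal((ncv, ncv)))

# The QR step: DGEQRF (like DGEQR2) puts Householder vectors in the
# lower triangle and R in the upper triangle of the same storage.
qr_raw, tau, _, info = lapack.dgeqrf(Q)
assert info == 0

# R is, up to signs, the identity, confirming Q was already orthogonal...
assert np.allclose(np.abs(np.triu(qr_raw)), np.eye(ncv), atol=1e-8)

# ...but the lower triangle, including the last row that the
# purification step later reads, no longer contains the entries of Q.
assert not np.allclose(np.tril(qr_raw, -1), np.tril(Q, -1))
```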


Essentially, what I am saying is that the eigenvector purification 
step never worked.

So why did this bug never show before? For this it is necessary to 
consider the criterion ARPACK uses to assess convergence [user guide 
4.6]. Essentially, it is that an eigenvector x_i is considered 
converged if ||resid|| |e_NCV^T q_i| is almost zero. But that means 
that the term resid e_NCV^T q_i in the purification step is also very 
small (either resid is small, or the last row of Q is very small). 

The problematic case now is if the last row of Q is small. In the 
standard QR decomposition in all LAPACK versions except 3.2 and 
3.2.1 the Householder reflector is done in such a way to maximize 
the value of tau, so that tau>0 (except if the vector to 
eliminate is exactly 0). Since the Householder vector is 
proportional to the column that it should eliminate, this means
that the last entry in the Householder vector is also almost zero. 
Hence, the eigenvector purification step doesn't do anything.

In the case of Lapack 3.2 and 3.2.1 however, it can happen that tau 
is chosen almost 0. Then, the Householder vectors will sometimes 
have finite entries everywhere, in particular in the last row, 
*although* the last row of Q was essentially zero. The QR 
decomposition then itself still is fine (as shown by pv's example 
above)! The bug in the purification step however then messes up the 
eigenvectors.
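A toy numerical illustration of that last point (all values are 
hypothetical, chosen only to mimic the tau=0 scenario; none come from a 
real ARPACK run):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 50
x = rng.standard_normal(n)
x /= np.linalg.norm(x)                 # a converged Ritz vector
resid = rng.standard_normal(n) * 1e-2  # small residual vector
theta = 2.5                            # hypothetical Ritz value
q_last = 1e-14  # true e_NCV^T q_i for a converged pair: essentially zero
v_last = 0.7    # hypothetical stale Householder entry in the same slot

x_correct = x + (q_last / theta) * resid  # intended update: a no-op
x_buggy = x + (v_last / theta) * resid    # what the code actually adds

# The intended update changes nothing; reading the stale Householder
# entry instead introduces an O(||resid||) error in the eigenvector.
assert np.linalg.norm(x_correct - x) < 1e-12
assert np.linalg.norm(x_buggy - x) > 1e-4
```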

Since the exact conventions how a Householder reflector is 
represented might be different for different implementations of the 
LAPACK standard, the bug should definitely be fixed (even if 
apparently only Apple uses Lapack 3.2.1 anymore).

Finally one aspect that I haven't been able to pin down precisely:
From the arguments above it is clear that the eigenvector 
purification step is not necessary for a regular eigenproblem (as 
the convergence criterion also implies that the purification step 
leaves the eigenvectors the same up to machine precision). In fact 
this is also what the user guide states: the purification step can 
be necessary if one has a generalized eigenproblem with a 
non-invertible matrix M. (Then, I guess, the norms ||..|| are taken 
with respect to M, and ||resid||=0 might not imply resid=0). From 
what I said above, we would expect to see a wrong result *for any 
Lapack version*.

However, right now I have failed to come up with an example where 
the eigenvector purification step would be needed at all.

Comment 5 by pv, Jul 25, 2013

That explanation sounds correct to me. The code indeed doesn't do 
what is explained in the user guide.

The last row of the eigenvector matrix Q is reconstructed 
immediately after the QR decomposition. The work space used to store 
the TAU values of the QR decomposition seems to be unused in the 
latter part of the routine, and could be used to store a copy of the 
last row. Tentative patch doing this: 
https://github.com/pv/arpack-ng/compare/fix-1259
It still needs double-checking, but at least this makes 
Scipy's test suite pass again.
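The shape of the fix can be sketched in Python (DGEQRF standing in for 
the in-place DGEQR2 call; the variable names are mine, not ARPACK's):

```python
import numpy as np
from scipy.linalg import lapack, qr

rng = np.random.default_rng(5)
ncv = 6
Q, _ = qr(rng.standard_normal((ncv, ncv)))  # stand-in for workl(iq)

# The patch in a nutshell: copy the last row of Q *before* the QR
# decomposition overwrites that storage with Householder vectors.
q_last_row = Q[-1, :].copy()  # analogue of reusing the spare workspace

qr_raw, tau, _, info = lapack.dgeqrf(Q)  # in ARPACK this runs in place
assert info == 0

# The saved row still holds e_NCV^T q_i; the decomposed array holds
# Householder data in those slots, so purification must use the copy.
assert not np.allclose(qr_raw[-1, :-1], q_last_row[:-1])
```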

Comment 6 by Michael Wimmer, Jul 26, 2013

The code in the above commit does look correct.

There is one issue though which still bothers me, and that is the 
normalization of the vectors after applying the purification step. 
The eigenvectors x_i are divided by theta_i after applying A. That 
gives the correct normalization only if the effect of purification 
is negligible anyway ... 

It would be really good to see an example where the purification 
step actually does something significant.

Comment 7 by pv, Jul 26, 2013

From the literature cited in the ARPACK user guide, I get the 
impression that this correction is mostly relevant for generalized 
eigenvalue problems when the B-matrix is badly conditioned.

An example case adapted from Meerbergen & Spence (1997) is 
attached. The correction term applied in DSEUPD is of order 1e-3, 
and can be made bigger with larger `n` and `m`.

However, it also seems that ARPACK is not behaving very robustly for 
these problems. The correction may help a bit, but it also seems 
quite easy to get bogus results without warning for this class of 
problems.

Comment 8 by Jordi Gutiérrez Hermoso, Aug 28, 2013

Thanks, I've pulled in your change.
Status: Fixed

Created: 7 years 2 months ago by pv

Updated: 1 year 2 months ago

Status: Fixed

Followed by: 3 persons

Labels:
Type:Defect
Priority:Medium