HPL Errata - Bugs
Issues fixed in Version 2.1, October 26th, 2012
The output now reports exact time stamps taken before and after the execution of the solver function pdgesv(). This allows accurate accounting of running time for data center management purposes, for example when reporting power consumption. This is important for the Green500 project.
Fixed an out-of-bounds array access in the HPL_spreadN() and HPL_spreadT() functions that could cause segmentation faults. It was reported by Stephen Whalen from Cray.
Issues fixed in Version 2.0, September 10th, 2008
Gregory Bauer found a problem size corresponding to the periodicity of the pseudo-random matrix generator used in the HPL timing program. With such a size, the LU factorization detects that the input matrix is singular, as it should.
With a problem size of 2^17 = 131072, columns 14 modulo 2^14 (i.e. 16384), counting from 0, are bitwise identical on a homogeneous platform. Every problem size that is a power of 2 and larger than 2^15 exhibits a similar problem if one searches far enough in the columns of the square input matrix.
The pseudo-random generator uses the linear congruential algorithm X(n+1) = (a * X(n) + c) mod m, as described in The Art of Computer Programming, Vol. 2 (Knuth, 1973). In the HPL case, m is set to 2^31.
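The repetition described above can be reproduced with a small sketch. It assumes that the matrix entries are drawn consecutively from the generator in column-major order, which simplifies how HPL actually distributes the generation across processes. The constants LCG_A and LCG_C and the helper names lcg_jump() and columns_match() are illustrative assumptions and are not taken from the HPL sources; only the modulus 2^31 comes from the text above.

```c
#include <assert.h>
#include <stdint.h>

/* Mask that reduces a value modulo m = 2^31. */
#define LCG_MASK ((UINT64_C(1) << 31) - 1)

/* Illustrative constants (assumed, not taken from the HPL sources). */
static const uint64_t LCG_A = 1103515245u;
static const uint64_t LCG_C = 12345u;

/* Return the state reached from x after `steps` applications of
   x -> (a * x + c) mod 2^31, using O(log steps) multiplications. */
static uint64_t lcg_jump(uint64_t x, uint64_t steps)
{
    uint64_t a = LCG_A, c = LCG_C; /* current map: x -> a*x + c   */
    uint64_t ra = 1, rc = 0;       /* accumulated map: identity   */

    while (steps > 0) {
        if (steps & 1) {
            /* Apply the current map after the accumulated one. */
            ra = (a * ra) & LCG_MASK;
            rc = (a * rc + c) & LCG_MASK;
        }
        /* Square the current map: a*(a*x + c) + c. */
        c = (a * c + c) & LCG_MASK;
        a = (a * a) & LCG_MASK;
        steps >>= 1;
    }
    return (ra * x + rc) & LCG_MASK;
}

/* Return 1 if columns j1 and j2 of an n-by-n matrix agree on their
   first `rows` entries, assuming entry (i, j) is draw number j*n + i. */
static int columns_match(uint64_t seed, uint64_t n,
                         uint64_t j1, uint64_t j2, uint64_t rows)
{
    assert(j1 < n && j2 < n && rows <= n);
    for (uint64_t i = 0; i < rows; i++) {
        if (lcg_jump(seed, j1 * n + i) != lcg_jump(seed, j2 * n + i))
            return 0;
    }
    return 1;
}
```

Under these assumptions, columns_match(seed, 1 << 17, 14, 14 + (1 << 14), 64) reports a match for any seed: the two columns are 2^17 * 2^14 = 2^31 draws apart, and the state of a generator with m = 2^31 repeats after 2^31 draws.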
It is very important to realize that this issue is a problem in the testing part of the HPL software. The numerical properties of the algorithms used in the factorization and the solve should not be questioned because of it. In fact, the opposite is true: the factorization exposed the weakness of the testing part of the software by detecting the singularity of the input matrix.
This issue in the testing program was not easy to fix, since the pseudo-random generator otherwise has very useful properties. The recommendation at the time was for HPL users testing matrices of size larger than 2^15 to avoid powers of two.
This issue has been fixed by replacing the pseudo-random matrix generator. The period of the new generator is 2^64.
Issues fixed in Version 1.0b, December 15th, 2004
When the matrix size is such that more than 16 GB is needed per MPI rank, the intermediate calculation (mat.ld+1) * mat.nq in HPL_pdtest.c overflows because it is done in 32-bit arithmetic. This issue has been fixed by casting to size_t; thanks to John Baron.
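The effect of the cast can be sketched as follows. The helper names local_entries() and overflows_int32() are hypothetical, the parameters ld and nq stand in for mat.ld and mat.nq, and a platform with a 64-bit size_t is assumed; this is not the code from HPL_pdtest.c.

```c
#include <stddef.h>
#include <stdint.h>

/* Entry count (ld + 1) * nq with the operands widened to size_t
   before the multiplication, so the product is not computed in int. */
static size_t local_entries(int ld, int nq)
{
    return ((size_t)ld + 1) * (size_t)nq;
}

/* Return 1 if the same product would not fit in a 32-bit signed int. */
static int overflows_int32(int ld, int nq)
{
    int64_t wide = ((int64_t)ld + 1) * (int64_t)nq;
    return wide > INT32_MAX;
}
```

The 16 GB threshold matches this arithmetic: 2^31 double-precision entries of 8 bytes each occupy 2^34 bytes, and 2^31 is the first count that no longer fits in a 32-bit signed integer. For example, ld = nq = 46341 gives 46342 * 46341 = 2147534622 entries, which exceeds INT32_MAX = 2147483647.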
Issues fixed in Version 1.0a, January 20th, 2004
The MPI process grid numbering scheme now defaults to row-major ordering. The ordering can now be selected at run time.
The inlined assembly timer routine that was causing the compilation to fail when using gcc version 3.3 and above has been removed from the package.
Various build problems on the T3E have been fixed; thanks to Edward Anderson.