07/06/2012: Elemental 0.75-p1


New functionality

  • SVD support through the bidiagonal QR algorithm. If libFLAME is linked, its high-performance bidiagonal QR algorithm is used instead.
  • Pseudoinverses and polar decompositions through the new SVD routine
  • QR-based Dynamically-Weighted Halley iteration (QDWH) for computing the polar decomposition, with versions for both general and Hermitian matrices
  • Support for fast expansions of packed Householder reflectors for a few cases (i.e., those needed for QR and LQ decompositions)
  • Explicit QR and LQ decompositions
  • Cheap two-norm estimates
  • ‘Norm’ now supports all DistMatrix distributions, instead of just [MC,MR]
  • DistMatrix now supports ‘viewing’ processes that do not actively own data; this makes temporarily distributing to a subset of processes (e.g., a perfect square) less of a hack
  • MakeHermitian, MakeSymmetric, and MakeReal were added
  • LUSolve was added for solving systems using an existing LU factorization, with or without partial pivoting
  • The routine Hetrmm, for forming one half of the Hermitian result L^H L or U U^H, was generalized to also support symmetric updates and was renamed Trtrmm
  • The routine Trdtrmm was added to aid in the inversion of symmetric/Hermitian-indefinite matrices; it forms L^H inv(D) L or U inv(D) U^H (or the symmetric counterpart)
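
As a rough illustration of the last two items, here is a minimal pure-Python sketch of the products that Trtrmm and Trdtrmm are described as forming, restricted to the lower-triangular case. The function names `trtrmm_lower` and `trdtrmm_lower` are hypothetical and are not Elemental's interface, and the sketch assumes that D is diagonal and is passed separately as a list of its diagonal entries.

```python
def _op(conjugate):
    """Return the entrywise map used for L^H (conjugate) or L^T (identity)."""
    return (lambda z: z.conjugate()) if conjugate else (lambda z: z)


def trtrmm_lower(L, conjugate=True):
    """Lower triangle of L^H L (or L^T L) for a lower-triangular L.

    Hypothetical helper for illustration only; not Elemental's API.
    """
    n = len(L)
    op = _op(conjugate)
    B = [[0j] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            # (L^H L)[i][j] = sum_k conj(L[k][i]) * L[k][j], and L[k][i] = 0
            # for k < i, so the sum can start at k = i.
            B[i][j] = sum(op(L[k][i]) * L[k][j] for k in range(i, n))
    return B


def trdtrmm_lower(L, d, conjugate=True):
    """Lower triangle of L^H inv(D) L (or L^T inv(D) L), with D = diag(d).

    Assumes D is diagonal with nonzero entries (an assumption of this sketch).
    """
    n = len(L)
    op = _op(conjugate)
    B = [[0j] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            B[i][j] = sum(op(L[k][i]) * L[k][j] / d[k] for k in range(i, n))
    return B


L = [[2, 0],
     [1j, 3]]
hermitian_half = trtrmm_lower(L)                   # lower triangle of L^H L
symmetric_half = trtrmm_lower(L, conjugate=False)  # lower triangle of L^T L
scaled_half = trdtrmm_lower(L, [2.0, 0.5])         # lower triangle of L^H inv(D) L
```

Only the lower triangle is filled in because the result is Hermitian (or symmetric), so the other half carries no new information; the strictly upper entries are left at zero.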

Performance improvements

  • Faster ApplyPackedReflectors implementations
  • Many variants of Gemm are now faster because they avoid cache-unfriendly redistributions

Bug fixes

  • Fixed subtle issue in Householder reflection generation when the norm of the lower part of the vector was zero
  • Fixed namespacing complaints from new versions of GCC and Clang
  • Fixed mistakes in 1-2-1 and Wilkinson matrix generation
  • Fixed missing installation of FCMangle.h and cmake-dummy-lib
  • Fixed leakage of viewingGroup in the Grid destructor
  • Fixed mistake in parallel Adjoint and Transpose routines
  • Avoided bug in CMake’s enable_language OPTIONAL argument
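
To make the Householder item concrete, the following is a minimal sketch of real Householder reflector generation that treats a zero-norm lower part as a special case. It follows the convention used by LAPACK's dlarfg (tau = 0, so the reflector is the identity), which is an assumption here and not necessarily how Elemental resolves the case; the function names `householder` and `apply_reflector` are hypothetical.

```python
import math


def householder(x):
    """Return (tau, v, beta) with v[0] == 1 so that (I - tau v v^T) x = beta e_1.

    Real data only. The zero-tail branch mirrors LAPACK's dlarfg convention,
    which is an assumption of this sketch.
    """
    alpha, tail = x[0], x[1:]
    tail_norm = math.sqrt(sum(t * t for t in tail))
    if tail_norm == 0.0:
        # Nothing below the first entry to annihilate: use H = I (tau = 0).
        # Without this branch, an all-zero x would lead to a division by zero.
        return 0.0, [1.0] + [0.0] * len(tail), alpha
    # Pick the sign of beta opposite to alpha to avoid cancellation.
    beta = -math.copysign(math.hypot(alpha, tail_norm), alpha)
    tau = (beta - alpha) / beta
    scale = 1.0 / (alpha - beta)
    v = [1.0] + [t * scale for t in tail]
    return tau, v, beta


def apply_reflector(tau, v, x):
    """Return (I - tau v v^T) x."""
    dot = sum(vi * xi for vi, xi in zip(v, x))
    return [xi - tau * dot * vi for vi, xi in zip(v, x)]


tau, v, beta = householder([3.0, 4.0])
y = apply_reflector(tau, v, [3.0, 4.0])   # approximately [beta, 0]
tau0, v0, beta0 = householder([2.0, 0.0, 0.0])
y0 = apply_reflector(tau0, v0, [2.0, 0.0, 0.0])  # unchanged, since tau0 == 0
```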

Semantic changes

  • Shortened ‘SetLocalEntry’ and friends to the form ‘SetLocal’ in order to be more consistent with the distributed equivalent, ‘Set’
  • Expanded the names of the routines for extracting real and imaginary parts of complex data from the form ‘Real’ to ‘RealPart’
  • Shortened many redundant filenames

People involved

  • Robert van de Geijn, Field Van Zee, and Gregorio Quintana Orti were involved in Elemental’s support for FLAME’s high-performance bidiagonal QR algorithm
  • Yuji Nakatsukasa, Gregorio Quintana Orti, and Robert van de Geijn were involved in the QDWH implementation
  • Bryan Marker noticed the cache-unfriendly redistributions in Gemm and Trsm
  • Jed Brown submitted patches for the FCMangle.h and cmake-dummy-lib installation issues, as well as the viewingGroup leakage in the Grid class
  • If I missed you, I apologize! Please let me know and I will add you!