The “Axpy” interface
The Axpy interface is Elemental’s version of the PLAPACK Axpy interface, where
Axpy is derived from the BLAS shorthand for \(Y := \alpha X + Y\)
(Alpha X Plus Y). Rather than always requiring users to manually fill their
distributed matrix, this interface provides a mechanism so that individual processes
can independently submit local submatrices, which are automatically redistributed
and added onto the global distributed matrix (this is LOCAL_TO_GLOBAL mode).
The interface also allows for the reverse: each process may asynchronously request
arbitrary subsets of the global distributed matrix (GLOBAL_TO_LOCAL mode).
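To make the "added onto" semantics of LOCAL_TO_GLOBAL mode concrete, the following is a minimal serial sketch that uses only the C++ standard library. It does not use Elemental and performs no redistribution or communication; the names Rows, AddOntoGlobal, and IdentityRows are hypothetical helpers introduced purely for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a dense matrix (a vector of rows);
// this is NOT an Elemental type.
using Rows = std::vector<std::vector<double>>;

// Serial model of one LOCAL_TO_GLOBAL submission: the local matrix X,
// scaled by alpha, is added onto the block of Y whose top-left entry
// is (i,j). Indices are zero-based.
void AddOntoGlobal
( double alpha, const Rows& X, Rows& Y, std::size_t i, std::size_t j )
{
    for( std::size_t r=0; r<X.size(); ++r )
    {
        assert( i+r < Y.size() );
        for( std::size_t c=0; c<X[r].size(); ++c )
        {
            assert( j+c < Y[i+r].size() );
            Y[i+r][j+c] += alpha*X[r][c];
        }
    }
}

// Build an n x n identity matrix
Rows IdentityRows( std::size_t n )
{
    Rows eye( n, std::vector<double>( n, 0.0 ) );
    for( std::size_t k=0; k<n; ++k )
        eye[k][k] = 1.0;
    return eye;
}
```

In this model, adding 2 times a 3 x 3 identity at (5,5) of an 8 x 8 zero matrix and then a 2 x 2 identity at (6,6) leaves a 2 in entry (5,5) and a 3 in entries (6,6) and (7,7), because overlapping submissions accumulate rather than overwrite.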
Note
The catch is that, in order for this behavior to be possible, all of the
processes that share a particular distributed matrix must synchronize at the
beginning and end of the Axpy interface usage (these synchronizations correspond
to the Attach and Detach member functions). The distributed matrix should not be
manually modified between the Attach and Detach calls.
An example usage might be:
#include "elemental.hpp"
using namespace elem;
...
// Create an 8 x 8 distributed matrix over the given grid
DistMatrix<double> A( 8, 8, grid );
// Set every entry of A to zero
MakeZeros( A );
// Open up a LOCAL_TO_GLOBAL interface to A
AxpyInterface<double> interface;
interface.Attach( LOCAL_TO_GLOBAL, A );
// If we are process 0, then create a 3 x 3 identity matrix, B,
// and Axpy it into the bottom-right of A (using alpha=2)
// NOTE: The bottom-right 3 x 3 submatrix starts at the (5,5)
// entry of A.
// NOTE: Every process is free to Axpy as many submatrices as they
// desire at this point.
if( grid.VCRank() == 0 )
{
    Matrix<double> B;
    Identity( B, 3, 3 );
    interface.Axpy( 2.0, B, 5, 5 );
}
// Have all processes collectively detach from A
interface.Detach();
// Print the updated A
Print( A, "Distributed A" );
// Reattach to A, but in the GLOBAL_TO_LOCAL direction
interface.Attach( GLOBAL_TO_LOCAL, A );
// Have process 0 request a copy of the entire distributed matrix
//
// NOTE: Every process is free to Axpy as many submatrices as they
// desire at this point.
Matrix<double> C;
if( grid.VCRank() == 0 )
{
    Zeros( C, 8, 8 );
    interface.Axpy( 1.0, C, 0, 0 );
}
// Collectively detach in order to finish filling process 0's request
interface.Detach();
// Process 0 can now locally print its copy of A
if( grid.VCRank() == 0 )
    Print( C, "Process 0's local copy of A" );
The output would be
Distributed A
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 2 0 0
0 0 0 0 0 0 2 0
0 0 0 0 0 0 0 2
Process 0's local copy of A
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 2 0 0
0 0 0 0 0 0 2 0
0 0 0 0 0 0 0 2
class AxpyInterface<T>

    AxpyInterface()
        Initialize a blank instance of the interface class. It must later be attached to a distributed matrix before any Axpy calls can occur.

    AxpyInterface(AxpyType type, DistMatrix<T, MC, MR> &Z)
        Initialize an interface to the distributed matrix Z, where type can be either LOCAL_TO_GLOBAL or GLOBAL_TO_LOCAL.

    AxpyInterface(AxpyType type, const DistMatrix<T, MC, MR> &Z)
        Initialize an interface to the (unmodifiable) distributed matrix Z; since Z cannot be modified, the only sensible AxpyType is GLOBAL_TO_LOCAL. The AxpyType argument was kept in order to be consistent with the previous routine.

    void Attach(AxpyType type, DistMatrix<T, MC, MR> &Z)
        Attach to the distributed matrix Z, where type can be either LOCAL_TO_GLOBAL or GLOBAL_TO_LOCAL.

    void Attach(AxpyType type, const DistMatrix<T, MC, MR> &Z)
        Attach to the (unmodifiable) distributed matrix Z; as mentioned above, the only sensible value of type is GLOBAL_TO_LOCAL, but the AxpyType argument was kept for consistency.

    void Axpy(T alpha, Matrix<T> &Z, int i, int j)
        If the interface was previously attached in the LOCAL_TO_GLOBAL direction, then the matrix \(\alpha Z\) will be added onto the associated distributed matrix starting at the \((i,j)\) global index; otherwise \(\alpha\) times the submatrix of the associated distributed matrix, which starts at index \((i,j)\) and is of the same size as Z, will be added onto Z.

    void Axpy(T alpha, const Matrix<T> &Z, int i, int j)
        Same as above, but since Z is unmodifiable, the attachment must have been in the LOCAL_TO_GLOBAL direction.

    void Detach()
        All processes collectively finish handling each other's requests and then detach from the associated distributed matrix.
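The GLOBAL_TO_LOCAL behavior of Axpy described above adds onto Z rather than overwriting it, which is why the example on this page zeroes C before requesting a copy. The following is a minimal serial sketch of that rule using only the C++ standard library. It is not Elemental code and models no communication; Rows and AddOntoLocal are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a dense matrix (a vector of rows);
// this is NOT an Elemental type.
using Rows = std::vector<std::vector<double>>;

// Serial model of one GLOBAL_TO_LOCAL request: alpha times the block
// of A that starts at (i,j) and has the same size as Z is added onto
// Z. Z must already have its final dimensions; its previous contents
// are kept and accumulated into.
void AddOntoLocal
( double alpha, const Rows& A, Rows& Z, std::size_t i, std::size_t j )
{
    for( std::size_t r=0; r<Z.size(); ++r )
    {
        assert( i+r < A.size() );
        for( std::size_t c=0; c<Z[r].size(); ++c )
        {
            assert( j+c < A[i+r].size() );
            Z[r][c] += alpha*A[i+r][j+c];
        }
    }
}
```

In this model, calling AddOntoLocal with alpha equal to 1 on a zero-filled Z yields a plain copy of the requested block, whereas a Z that already holds data ends up with the sum of its old contents and the scaled block.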