%
%  file :  abstract.tex
%  desc :  summary
%
%  $Header: /home/cs/richie/tuning/papers/cvsrep/tr2003-reorder/report/abstract.tex,v 1.1.1.1 2003/01/13 22:32:09 richie Exp $
%

\begin{abstract}
We present the experimental results from applying matrix reordering
and splitting techniques to improve the performance of the sparse
matrix-vector multiply (\SMVM) operation, $y \leftarrow y + Ax$,
where $A$ is a sparse matrix and $x, y$ are dense vectors.
We build on the prior work of the \Sparsity\ system, which proposed
a register-level blocking optimization to exploit naturally occuring
dense blocks in $A$. In particular, we consider two performance
optimization techniques.
First, we apply two-level splittings of $A$, $A = A_1 + A_2$, which
better exploit variable block structure by allowing $A_1$ and $A_2$
to use different block sizes. We consider the case when $A_1$ is
blocked but $A_2$ is unblocked. We furthermore relax a \Sparsity\ requirement
that blocks be aligned on particular column boundaries, and study the
effects.
Second, we consider reordering matrix
rows and columns to \emph{create} dense block structure. Specifically,
we apply a method due to Pinar based on formulating the reordering
problem as a traveling salesman problem (TSP).
When these techniques are combined, we demonstrate speedups of up to
\BYFACTOR{2} over \Sparsity-optimized implementations of \SMVM\ on
a benchmark suite of 44 matrices and 5 evaluation platforms.
Finally, we present a preliminary evaluation of a heuristic for
choosing a block sizes, given that TSP reordering either will or will
not be applied. This heuristic selects a block size yielding performance,
in Mflop/s, within 5\% of the best block size.
\end{abstract}

%
%  $Log: abstract.tex,v $
%  Revision 1.1.1.1  2003/01/13 22:32:09  richie
%  Initial conversion from Word and first partial revision
%
%
%  eof
%