% % file : abstract.tex % desc : summary % % $Header: /home/cs/richie/tuning/papers/cvsrep/tr2003-reorder/report/abstract.tex,v 1.1.1.1 2003/01/13 22:32:09 richie Exp $ % \begin{abstract} We present the experimental results from applying matrix reordering and splitting techniques to improve the performance of the sparse matrix-vector multiply (\SMVM) operation, $y \leftarrow y + Ax$, where $A$ is a sparse matrix and $x, y$ are dense vectors. We build on the prior work of the \Sparsity\ system, which proposed a register-level blocking optimization to exploit naturally occuring dense blocks in $A$. In particular, we consider two performance optimization techniques. First, we apply two-level splittings of $A$, $A = A_1 + A_2$, which better exploit variable block structure by allowing $A_1$ and $A_2$ to use different block sizes. We consider the case when $A_1$ is blocked but $A_2$ is unblocked. We furthermore relax a \Sparsity\ requirement that blocks be aligned on particular column boundaries, and study the effects. Second, we consider reordering matrix rows and columns to \emph{create} dense block structure. Specifically, we apply a method due to Pinar based on formulating the reordering problem as a traveling salesman problem (TSP). When these techniques are combined, we demonstrate speedups of up to \BYFACTOR{2} over \Sparsity-optimized implementations of \SMVM\ on a benchmark suite of 44 matrices and 5 evaluation platforms. Finally, we present a preliminary evaluation of a heuristic for choosing a block sizes, given that TSP reordering either will or will not be applied. This heuristic selects a block size yielding performance, in Mflop/s, within 5\% of the best block size. \end{abstract} % % $Log: abstract.tex,v $ % Revision 1.1.1.1 2003/01/13 22:32:09 richie % Initial conversion from Word and first partial revision % % % eof %