Future work
This page is for discussion of projects or ideas that might be incorporated
into the Sparse Matrix Converter at some time in the
future. Of course, they might not be, so don't get too hopeful :)
We encourage submissions of ideas -- please use the e-mail list for that.
Design philosophy
Dynamic environment
One of my goals was to keep things as dynamic as possible, because I
foresee linking the SMC to an interactive environment like Matlab or some scripting / rapid
prototyping language (hence the Common
Lisp interface, as Lisp is excellent for
rapid prototyping). Scientific computing is moving towards a more
interactive user environment, even in the parallel realm, so I think
it's helpful to provide an infrastructure that enables dynamic use.
One example of the SMC's dynamic design is that multiple matrices with
different element data types (real, complex or pattern) can coexist at
run time, without needing to recompile the library. Incorporation into
the extensive OSKI library
will allow flexible addition of new matrix formats without needing to
recompile, and the option to tune the matrix datatype for maximum performance.
Rich's OSKI library is an amazing piece of work, and I'm glad to have his
support on this project.
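To make the coexistence of element types mentioned above concrete, here is a
minimal sketch of one way to do it: each matrix carries a run-time tag for its
value type, and code that touches the values dispatches on that tag. The names
value_kind, coo_matrix_sketch and value_size are hypothetical and chosen for
illustration; they are not the SMC's actual data structures.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>

/* Hypothetical run-time tag for the element type of a matrix. */
enum value_kind { VALUE_REAL, VALUE_COMPLEX, VALUE_PATTERN };

/* Hypothetical coordinate-format matrix.  The values are stored behind
   a void pointer and interpreted according to the tag, so matrices of
   different element types can live in the same program. */
struct coo_matrix_sketch
{
  int num_rows, num_cols, num_nonzeros;
  int *row_index;        /* num_nonzeros row indices */
  int *col_index;        /* num_nonzeros column indices */
  void *values;          /* NULL for pattern matrices */
  enum value_kind kind;
};

/* Number of bytes needed to store one matrix element of the given kind.
   Pattern matrices store only the nonzero structure, hence zero. */
size_t
value_size (enum value_kind kind)
{
  switch (kind)
    {
    case VALUE_REAL:
      return sizeof (double);
    case VALUE_COMPLEX:
      return sizeof (double complex);
    case VALUE_PATTERN:
      return 0;
    }
  assert (0 && "unknown value kind");
  return 0;
}
```

A conversion routine written this way would allocate
num_nonzeros * value_size (kind) bytes for the values array and copy
elements without knowing their type at compile time.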
Shorter-term goals
Medium-term ideas
- Rewriting the Harwell-Boeing sparse matrix
file parser. I discuss the relevant issues on this
page. The parsers for the other supported file formats should
also be made more fault-tolerant. I have a prototype in Lisp, but
getting it to work in C and be robust and fault-tolerant is
nontrivial.
- Integration with OSKI. There are
some design differences to work out. We'll start by extracting the
low-level kernels (e.g. conversion routines) and work from there to a
common higher-level interface.
- Round out the type converters: eventually it should be possible to convert
from any type to any other type (allowing for the possibility of going through
an intermediate type like CSR).
- Automatic file format detection: It should be pretty easy to guess
whether a file is in Matrix Market or Harwell-Boeing format, just from looking
at the first line.
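The automatic format detection item above could be sketched roughly as
follows. Matrix Market files start with a "%%MatrixMarket" banner, so that case
is easy. Harwell-Boeing has no magic string: its first line is a 72-column
title followed by an 8-column key, so the sketch only checks that the line is
non-empty and at most 80 columns wide, which is a weak heuristic. The enum and
the function guess_file_format are hypothetical names, not part of the SMC.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical result type for the guess. */
enum file_format_guess
{
  GUESS_UNKNOWN,
  GUESS_MATRIX_MARKET,
  GUESS_HARWELL_BOEING
};

/* Guess the file format from the first line of the file.  The line may
   or may not end with a newline. */
enum file_format_guess
guess_file_format (const char *first_line)
{
  static const char mm_banner[] = "%%MatrixMarket";
  size_t len;

  assert (first_line != NULL);

  /* Matrix Market: the first line is a banner with a fixed prefix. */
  if (strncmp (first_line, mm_banner, sizeof mm_banner - 1) == 0)
    return GUESS_MATRIX_MARKET;

  /* Harwell-Boeing: title (72 columns) plus key (8 columns).  Any
     non-empty line that fits in 80 columns is accepted here, so the
     real parser still has to validate the rest of the header. */
  len = strcspn (first_line, "\r\n");
  if (len > 0 && len <= 80)
    return GUESS_HARWELL_BOEING;

  return GUESS_UNKNOWN;
}
```

A caller would read the first line with fgets, pass it to
guess_file_format, and then hand the file to the matching parser, treating a
Harwell-Boeing guess as tentative until the header has been parsed.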
Longer-term ideas
- Format conversion reasoning: Construct a graph of the available
type conversions, so that if a user specifies any two formats, the
library can convert from one to the other by following the appropriate
available paths (e.g. BCSR to CSR to JAD). This would be part of a
matrix format management system, part of which is already in place in
OSKI.
- Out-of-core conversion routines: In most cases, format conversion
requires making a copy of the matrix. For large matrices, the
resulting memory requirements may be prohibitive. It would be helpful
if in those cases, the library could be smart enough to use an
out-of-core conversion algorithm. I imagine the user would have to
specify a place for temp files, as heavy usage of /tmp can be harmful
on some systems, and using NFS-mounted home directories could also be
a bad idea. This idea suits an interactive system better, as
the library could notify users if the matrix is too large and ask
whether to try an out-of-core algorithm or give up. With big
batch jobs, you might want static control of that situation, because I
think the out-of-core algorithms will be slow (running at roughly disk
bandwidth rather than memory bandwidth). But it's still
helpful to have a failsafe so that the code won't crash and waste the
job.
- Parallel matrix data structures: The BeBOP group wants to move
towards applying tuning techniques to distributed matrices as a whole
(rather than to the local components only). In that case, we would
need robust format conversion routines that could interact with
parallel file formats such as HDF. However, I'm not sure what
the user demand is for this sort of thing. If users usually generate
matrices on-the-fly and don't save them to disk, then it's probably
not worthwhile to support complicated parallel formats. Benchmarkers
can afford to do simple conversions offline.
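The format conversion reasoning idea in the list above amounts to a shortest
path search over a graph whose nodes are formats and whose edges are the
available direct conversion routines. Below is a minimal sketch using
breadth-first search. The enum fmt, the can_convert table and the function
find_conversion_path are hypothetical, and the table entries are made up for
illustration; they do not describe which conversions the SMC or OSKI actually
provide.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list of formats. */
enum fmt { FMT_COO, FMT_CSR, FMT_CSC, FMT_BCSR, FMT_JAD, NUM_FMTS };

/* can_convert[a][b] != 0 means a direct a -> b routine exists.
   Illustrative entries only. */
static const int can_convert[NUM_FMTS][NUM_FMTS] = {
  [FMT_COO]  = { [FMT_CSR] = 1, [FMT_CSC] = 1 },
  [FMT_CSR]  = { [FMT_COO] = 1, [FMT_CSC] = 1, [FMT_BCSR] = 1, [FMT_JAD] = 1 },
  [FMT_CSC]  = { [FMT_COO] = 1, [FMT_CSR] = 1 },
  [FMT_BCSR] = { [FMT_CSR] = 1 },
  [FMT_JAD]  = { [FMT_CSR] = 1 },
};

/* Find a shortest chain of conversions from src to dst.  On success the
   chain (including src and dst) is written to path and its length is
   returned; if no chain exists, 0 is returned. */
size_t
find_conversion_path (enum fmt src, enum fmt dst, enum fmt path[NUM_FMTS])
{
  int prev[NUM_FMTS];        /* predecessor of each format, -1 if unvisited */
  enum fmt queue[NUM_FMTS];  /* each format is enqueued at most once */
  size_t head = 0, tail = 0, len = 0, i;
  int f;

  assert ((int) src >= 0 && (int) src < NUM_FMTS);
  assert ((int) dst >= 0 && (int) dst < NUM_FMTS);

  for (f = 0; f < NUM_FMTS; f++)
    prev[f] = -1;
  prev[src] = (int) src;     /* the source is its own predecessor */
  queue[tail++] = src;

  /* Breadth-first search, so the first time dst is reached the chain
     has the fewest possible conversion steps. */
  while (head < tail)
    {
      enum fmt cur = queue[head++];
      if (cur == dst)
        break;
      for (f = 0; f < NUM_FMTS; f++)
        if (can_convert[cur][f] && prev[f] < 0)
          {
            prev[f] = (int) cur;
            queue[tail++] = (enum fmt) f;
          }
    }
  if (prev[dst] < 0)
    return 0;

  /* Walk the predecessors back from dst to src, then reverse. */
  for (f = (int) dst;; f = prev[f])
    {
      path[len++] = (enum fmt) f;
      if (f == (int) src)
        break;
    }
  for (i = 0; i < len / 2; i++)
    {
      enum fmt tmp = path[i];
      path[i] = path[len - 1 - i];
      path[len - 1 - i] = tmp;
    }
  return len;
}
```

With the table above, a request to go from FMT_BCSR to FMT_JAD yields the
chain BCSR, CSR, JAD, matching the example in the list. A real implementation
would probably weight the edges by conversion cost (time or temporary memory)
rather than counting steps.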