Add a flag to indicate whether the matrix has a full (all non-zero) diagonal so that the triangular solve kernel does not have to check this condition explicitly.
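As a rough illustration of how such a flag could be computed once, outside the solve kernel, the sketch below scans a square CSR matrix and reports whether every row has a non-zero diagonal. The function name `csr_has_full_diag` and the raw-array interface (0-based `ptr`, `ind`, `val`) are assumptions made for this example, not OSKI's actual data structure or API.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of OSKI): returns 1 if every row i of an
 * n x n CSR matrix stores at least one entry (i, i) and the stored diagonal
 * entries of that row sum to a non-zero value; returns 0 otherwise.
 * Indices are assumed to be 0-based and may appear in any order.
 */
int csr_has_full_diag(size_t n, const size_t *ptr,
                      const size_t *ind, const double *val)
{
    for (size_t i = 0; i < n; i++) {
        double diag = 0.0;
        int found = 0;
        for (size_t k = ptr[i]; k < ptr[i + 1]; k++) {
            if (ind[k] == i) {      /* entry lies on the diagonal */
                diag += val[k];     /* duplicates are accumulated */
                found = 1;
            }
        }
        if (!found || diag == 0.0)
            return 0;               /* missing or zero diagonal */
    }
    return 1;
}
```

The result could be cached in a flag at matrix creation time so that the triangular solve kernel only reads the flag.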
This implementation currently assumes our pessimistic upper bound on the heuristic evaluation cost of 40 times the time to stream through the matrix, and should be changed to use a build-time benchmark.
The output properties data structure actually defines a more general property about the diagonal, namely, that it is all ones. However, the available input matrix properties only allow the user to specify whether or not there is an implicit unit diagonal. Thus, it is possible that the user could create an input matrix with an explicit unit diagonal, but this condition is not checked when wrapping the data structure. It might be desirable to perform this check so that the optimized triangular solve for the unit diagonal case can also be used for such matrices.
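A check for an explicit unit diagonal might look like the following sketch. It assumes a square CSR matrix passed as raw 0-based arrays. The name `csr_has_explicit_unit_diag` is hypothetical and does not correspond to an existing OSKI routine.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of OSKI): returns 1 if every row i of an
 * n x n CSR matrix stores exactly one diagonal entry (i, i) and that entry
 * equals 1.0; returns 0 otherwise. Indices are assumed to be 0-based.
 */
int csr_has_explicit_unit_diag(size_t n, const size_t *ptr,
                               const size_t *ind, const double *val)
{
    for (size_t i = 0; i < n; i++) {
        size_t count = 0;           /* stored diagonal entries in row i */
        for (size_t k = ptr[i]; k < ptr[i + 1]; k++) {
            if (ind[k] == i) {
                if (val[k] != 1.0)
                    return 0;       /* diagonal entry is not one */
                count++;
            }
        }
        if (count != 1)
            return 0;               /* missing or duplicated diagonal */
    }
    return 1;
}
```

Running such a scan once when the user's arrays are wrapped would cost a single pass over the matrix.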
Similarly, the oski_matCSR_t data structure has "is_upper" and "is_lower" flags, which could be set even if the user asserts that the matrix has a "general" pattern.
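To illustrate how the shape flags could be derived from the stored pattern rather than from the user's assertion, here is a minimal sketch. The struct `tri_shape_t` and the function `csr_detect_shape` are invented for this example. Only the flag names `is_upper` and `is_lower` come from the text above.

```c
#include <stddef.h>

/* Hypothetical result type; mirrors only the flag names mentioned above. */
typedef struct {
    int is_upper;   /* 1 if no stored entry lies strictly below the diagonal */
    int is_lower;   /* 1 if no stored entry lies strictly above the diagonal */
} tri_shape_t;

/*
 * Hypothetical helper (not part of OSKI): inspects the pattern of a CSR
 * matrix with 0-based indices and reports whether it is upper and/or lower
 * triangular. A diagonal matrix sets both flags.
 */
tri_shape_t csr_detect_shape(size_t n, const size_t *ptr, const size_t *ind)
{
    tri_shape_t shape = {1, 1};
    for (size_t i = 0; i < n; i++) {
        for (size_t k = ptr[i]; k < ptr[i + 1]; k++) {
            if (ind[k] < i)
                shape.is_upper = 0; /* entry below the diagonal */
            else if (ind[k] > i)
                shape.is_lower = 0; /* entry above the diagonal */
        }
    }
    return shape;
}
```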
What is the best way to initialize the dense matrix? For benchmarking, it seems sufficient to initialize all entries to 'tiny' values (here, ), which is faster than calling the random number generator.
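The constant-fill approach can be sketched as below. The function name `dense_fill_tiny` is hypothetical, and because the actual 'tiny' value is not given in the text, it is left as a caller-supplied parameter rather than guessed.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of OSKI): sets all `len` entries of a dense
 * array to the same small value. The value is a parameter because the one
 * used by the benchmark is not stated here.
 */
void dense_fill_tiny(double *x, size_t len, double tiny)
{
    for (size_t i = 0; i < len; i++)
        x[i] = tiny;
}
```

A single store per entry avoids the per-element cost of a random number generator call, which is the trade-off the note describes.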
The case of sorted indices assumes not only that the indices are sorted, but also that there is a unique diagonal element. Do we need to fix this? The general ordering case makes no such assumption.
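The difference between the two cases can be shown with a small sketch for one row of a lower-triangular CSR matrix. Both function names are hypothetical, and the code is only one plausible reading of the assumption described above, not the library's actual kernel. The sorted variant takes the last stored entry as the diagonal, while the general variant accumulates every entry in column `i`.

```c
#include <stddef.h>

/*
 * Sorted-index variant (hypothetical): assumes the row's column indices are
 * sorted and the diagonal is unique, so it must be the last stored entry.
 * Sets *ok to 0 if the last entry is not on the diagonal.
 */
double row_diag_sorted_lower(const size_t *ind, const double *val,
                             size_t len, size_t i, int *ok)
{
    *ok = (len > 0 && ind[len - 1] == i);
    return *ok ? val[len - 1] : 0.0;
}

/*
 * General-ordering variant (hypothetical): makes no assumption about order
 * or uniqueness and sums all stored entries in column i.
 */
double row_diag_general(const size_t *ind, const double *val,
                        size_t len, size_t i, int *ok)
{
    double diag = 0.0;
    *ok = 0;
    for (size_t k = 0; k < len; k++) {
        if (ind[k] == i) {
            diag += val[k];
            *ok = 1;
        }
    }
    return diag;
}
```

When a row stores the diagonal twice, the two variants disagree, which is the situation the question above is concerned with.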
For efficiency, this routine does not attempt to pre-scan the matrix data structure and ensure there are no zero diagonals. At least for CSR and CSC input matrices, we should add some kind of check somewhere (e.g., at matrix handle creation time). A similar to-do appears elsewhere in this source.
MBCSR currently has an overly strong interdependence on the BCSR data structure as defined in include/oski/BCSR/format.h because MBCSR contains a pointer to a BCSR object, and moreover initializes the fields of the BCSR object explicitly. We should weaken this dependence by implementing the submatrix instantiation functionality (see the defined but unused structure, oski_submat_t).
Generated on Wed Sep 19 16:41:23 2007 for BeBOP Optimized Sparse Kernel Interface Library by Doxygen 1.4.6