BeBOP: pOSKI  v1.0.0
parallel Optimized Sparse Kernel Interface library
poski_TuneMat_common.c File Reference
#include <stdio.h>
#include <stdlib.h>
#include <stdarg.h>
#include <poski/poski_threadcommon.h>
#include <poski/poski_matrixcommon.h>
#include <poski/poski_tunematcommon.h>
#include <poski/poski_matmulttype.h>
#include <poski/poski_kerneltype.h>
#include <poski/poski_malloc.h>
#include <poski/poski_print.h>

Functions

char * poski_GetMatTransforms (const poski_matstruct_t A_tunable)
 Returns a string describing the transformation/data structure applied to the matrix.
int poski_TuneMat_run (poski_mat_t A_tunable)
 Tune matrix using the selected threading model.
int poski_TuneHint_Structure_run (poski_mat_t A_tunable, poski_tunehint_t hint, int k, int *r, int *c)

Detailed Description


Function Documentation

char* poski_GetMatTransforms ( const poski_matstruct_t  A_tunable)

Returns a string describing the transformation/data structure applied to the matrix.

Fraction of observed workload available for tuning. Fraction of hint workload available for tuning.

The estimated number of seconds available for tuning is based on the larger of the following two quantities:

  • Estimated time to execute the trace determined by the workload hints.
  • The actual accumulated kernel execution time so far.

Determines whether the heuristic-selected data structure leads to faster execution times than the input data structure.
Returns:
1 if the tuned implementation is faster, and 0 otherwise.
Todo:

Implement this routine, given that we will need to allocate temporary vectors.

Check that the new data structure really is faster than the old.

Basic outline of this routine's implementation:

WHILE
    !IsTuned(A)
    AND tuning_time_left > 0
    AND i_heur <= NUM_HEURISTICS
DO
    LET heur = GetHeuristic( i_heur );
    LET results = NULL

    IF GetTotalCostEstimate(heur, A) <= tuning_time_left THEN
        LET t0 = GetTimer();
        results = EvaluateHeuristic( heur, A );
            // results == NULL if heuristic does not apply to A
        LET elapsed_time = GetTimer() - t0;
        tuning_time_left -= elapsed_time;
    ENDIF

    IF results THEN
        LET t0 = GetTimer();
        A_tuned = ApplyHeuristic( heur, results, A );
            // convert A to new data structure
        LET elapsed_time = GetTimer() - t0;
        tuning_time_left -= elapsed_time;

        A = ChooseFastest( A, A_tuned, A->trace );
    ENDIF

    i_heur = i_heur + 1;
DONE
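
The outline above can be sketched in C as follows. This is only an illustration under stated assumptions, not the pOSKI implementation: the types `heuristic_t` and `matrix_t`, the function `tune_with_budget`, and all field names are hypothetical. Timer readings are replaced by fixed per-heuristic costs so the behavior is deterministic, and the sketch assumes the matrix counts as tuned once a faster structure has been chosen.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; these are NOT pOSKI types. */
typedef struct {
    double est_cost;    /* estimated total cost of trying this heuristic (s) */
    double eval_cost;   /* simulated time spent evaluating it (s)            */
    double apply_cost;  /* simulated time spent converting the matrix (s)    */
    int    applies;     /* nonzero if the heuristic applies to the matrix    */
    double tuned_time;  /* kernel time of the converted data structure (s)   */
} heuristic_t;

typedef struct {
    double kernel_time; /* kernel time of the current data structure (s) */
    int    is_tuned;    /* nonzero once a faster structure was selected  */
} matrix_t;

/* Tries heuristics in order while the matrix is untuned and tuning time
 * remains. Returns the tuning time left after the loop. */
double tune_with_budget(matrix_t *A, const heuristic_t *heur, size_t n,
                        double time_left)
{
    assert(A != NULL && (heur != NULL || n == 0));

    for (size_t i = 0; i < n && !A->is_tuned && time_left > 0.0; ++i) {
        const heuristic_t *h = &heur[i];

        /* Skip heuristics whose estimated cost exceeds the budget. */
        if (h->est_cost > time_left)
            continue;

        /* Evaluation is charged even if the heuristic does not apply. */
        time_left -= h->eval_cost;
        if (!h->applies)
            continue;

        /* Converting to the new data structure is charged as well. */
        time_left -= h->apply_cost;

        /* Keep whichever data structure is faster. */
        if (h->tuned_time < A->kernel_time) {
            A->kernel_time = h->tuned_time;
            A->is_tuned = 1;
        }
    }
    return time_left;
}
```

In this sketch a heuristic that is too expensive costs nothing, a heuristic that does not apply costs only its evaluation time, and the loop stops as soon as one conversion proves faster than the input structure.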
Todo:
The current implementation does not try to re-tune if already tuned.
Parameters:
[in]  A_tunable  Valid matrix handle.
Returns:
A newly allocated string representing the transformation/data structure that has been applied to the matrix A.
Note:
The caller must free the returned string.
int poski_TuneHint_Structure_run ( poski_mat_t  A_tunable,
poski_tunehint_t  hint,
int  k,
int *  r,
int *  c 
)

Attempts to allocate space for this list. If the allocation fails, the routine still records that a mix of block sizes is present, but does not record the list. This behavior is acceptable because the implementation is free to honor or disregard hints as desired.
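
The allocate-or-degrade behavior described here might look like the following sketch. It is an assumption-laden illustration rather than the library's code: `block_hint_t`, `record_block_hint`, `alloc_fn`, and `failing_alloc` are hypothetical names, and the allocator is passed in as a parameter only so that the failure path can be exercised.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical hint record for illustration; NOT the pOSKI structure. */
typedef struct {
    int  has_mixed_blocks; /* always recorded, even if allocation fails  */
    int  num_sizes;        /* number of stored (r, c) pairs, 0 if none   */
    int *r;                /* row block sizes, or NULL if not recorded   */
    int *c;                /* column block sizes, or NULL if not recorded */
} block_hint_t;

typedef void *(*alloc_fn)(size_t);

/* Records a "mix of block sizes" hint. Returns 1 if the list of sizes
 * was stored, 0 if only the flag could be recorded. */
int record_block_hint(block_hint_t *hint, int k,
                      const int *r, const int *c, alloc_fn alloc)
{
    assert(hint != NULL && alloc != NULL);

    /* The flag is set first, so it survives an allocation failure. */
    hint->has_mixed_blocks = 1;
    hint->num_sizes = 0;
    hint->r = NULL;
    hint->c = NULL;

    if (k <= 0 || r == NULL || c == NULL)
        return 0;

    /* One buffer holds both lists: r in the first half, c in the second. */
    int *buf = alloc(2 * (size_t)k * sizeof *buf);
    if (buf == NULL)
        return 0; /* degrade gracefully: keep the flag, drop the list */

    memcpy(buf, r, (size_t)k * sizeof *buf);
    memcpy(buf + k, c, (size_t)k * sizeof *buf);
    hint->r = buf;
    hint->c = buf + k;
    hint->num_sizes = k;
    return 1;
}

/* Allocator that always fails, used to exercise the fallback path. */
void *failing_alloc(size_t n)
{
    (void)n;
    return NULL;
}
```

Because both lists share one buffer in this sketch, the caller releases them with a single `free(hint.r)`.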