info type discussions for mem-based optimizations #1

@spencerpatty

operation_info_t info;
device_policy policy;
multiply_inspect(info, policy, a, x, y);
multiply_inspect(info, policy, transposed(a), x, y);
// Allocate more memory for y based on `info`
while (/* ... */) {
    multiply_execute(info, policy, a, x, y);
    // do something with y, update x...
    multiply_execute(info, policy, transposed(a), y, x);
    // Maybe do some more stuff...
}

I like this idea of having an info type that is directly associated with some matrix structure and is filled with zero or more inspection-based optimizations (which means it houses "stateful + read-only" optimizations). I wonder whether our multiply functions could take some hybrid matrix_obj object that consists of either a matrix_view alone or a matrix_view plus an associated matrix_info_t, used something like the following snippet:

csr_view<T,I,O> A(...);
matrix_info_t A_info(...);
multiply_inspect(matrix_obj{A, A_info}, descriptor, x, y, /*backend stuff*/);
multiply_execute(matrix_obj{A, A_info}, descriptor, x, y, /*backend stuff*/);

Or we might skip the inspection entirely, at the cost of lower performance:

csr_view<T,I,O> A(...);
multiply_execute(matrix_obj{A}, descriptor, x, y, /*backend stuff*/);
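
To make the hybrid-object idea concrete, here is a minimal C++17 sketch of how a matrix_obj could carry an optional info handle. Everything in it is an assumption for illustration only: the fields of csr_view and matrix_info_t, the avg_row_nnz statistic, the bool return value of multiply_execute, and the omission of the descriptor and backend arguments are all hypothetical and not part of any agreed API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical non-owning CSR view; field names are illustrative only.
template <typename T, typename I, typename O>
struct csr_view {
    const T* values;
    const O* rowptr;   // nrows + 1 row offsets
    const I* colind;
    I nrows;
    I ncols;
};

// Hypothetical "stateful + read-only" data produced by inspection.
struct matrix_info_t {
    bool inspected = false;
    double avg_row_nnz = 0.0;  // example statistic a backend might use
};

// Hybrid object: a view alone, or a view plus its associated info.
template <typename View>
struct matrix_obj {
    View view;
    matrix_info_t* info = nullptr;  // null when no info was supplied

    explicit matrix_obj(View v) : view(v) {}
    matrix_obj(View v, matrix_info_t& i) : view(v), info(&i) {}

    bool has_info() const { return info != nullptr; }
};

// Fills the attached info, if any; a no-op for a bare view.
template <typename T, typename I, typename O>
void multiply_inspect(matrix_obj<csr_view<T, I, O>> m) {
    if (!m.has_info()) return;
    const O nnz = m.view.rowptr[m.view.nrows];
    m.info->avg_row_nnz =
        m.view.nrows > 0 ? static_cast<double>(nnz) / static_cast<double>(m.view.nrows) : 0.0;
    m.info->inspected = true;
}

// Computes y = A * x. Returns true when inspection data was available,
// which is where a real backend could pick an optimized path.
template <typename T, typename I, typename O>
bool multiply_execute(matrix_obj<csr_view<T, I, O>> m,
                      const std::vector<T>& x, std::vector<T>& y) {
    assert(x.size() == static_cast<std::size_t>(m.view.ncols));
    assert(y.size() == static_cast<std::size_t>(m.view.nrows));
    for (I i = 0; i < m.view.nrows; ++i) {
        T sum = T(0);
        for (O k = m.view.rowptr[i]; k < m.view.rowptr[i + 1]; ++k) {
            sum += m.view.values[k] * x[m.view.colind[k]];
        }
        y[i] = sum;
    }
    return m.has_info() && m.info->inspected;
}
```

With this shape, a single template per operation accepts both `matrix_obj{A, A_info}` and `matrix_obj{A}` through class template argument deduction, so the presence or absence of info does not add overloads.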

The benefit shows up when we look at the sparse * sparse operation: we could have an A_info and a B_info that contain useful (read-only, stateful) data about A and B for use while creating C, and then a separate multistage_info_t (called multiply_info_t in the snippet below) that is particular to the multi-stage operation (stateful + read/write data):

csr_view<T,I,O> A(...);
matrix_info_t A_info(...);

csr_view<T,I,O> B(...);
matrix_info_t B_info(...);

csr_view<T,I,O> C(...);

multiply_info_t mult_info; // C = A * B^T

multiply_inspect(matrix_obj{C}, matrix_obj{A, A_info}, transpose(matrix_obj{B, B_info}), desc, /*backend stuff*/); // fills A_info and/or B_info
multiply_execute_stage1(matrix_obj{C}, matrix_obj{A, A_info}, transpose(matrix_obj{B, B_info}), mult_info, /*backend stuff*/); // fills mult_info and C
multiply_execute_stage2(matrix_obj{C}, matrix_obj{A, A_info}, transpose(matrix_obj{B, B_info}), mult_info, /*backend stuff*/); // fills mult_info and C
multiply_execute_stage3(matrix_obj{C}, matrix_obj{A, A_info}, transpose(matrix_obj{B, B_info}), mult_info, /*backend stuff*/); // fills mult_info and C

mult_info would house the stateful + read/write optimizations that pertain to the multi-stage multiply process, while A_info and B_info would hold the stateful + read-only optimizations about A and/or B.
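
As a rough illustration of what "stateful + read/write" operation info could look like, the sketch below splits a sparse * sparse product into a counting stage that writes into the info object and a fill stage that reads it back. It is deliberately simplified and entirely hypothetical: it uses two stages instead of three, computes C = A * B without the transpose, leaves out the inspect step, the descriptor, and the backend arguments, and uses an owning csr_matrix struct invented for the example. The stage signatures here do not match the ones proposed above.

```cpp
#include <cassert>
#include <vector>

// Hypothetical owning CSR container used only for this illustration.
struct csr_matrix {
    int nrows = 0;
    int ncols = 0;
    std::vector<int> rowptr;     // nrows + 1 row offsets
    std::vector<int> colind;
    std::vector<double> values;
};

// Hypothetical read/write state carried across the stages of one multiply.
struct multiply_info_t {
    std::vector<int> row_nnz;    // nonzeros per row of C, written by stage 1
    bool counted = false;
};

// Stage 1 (symbolic): count the nonzeros in each row of C = A * B.
void multiply_execute_stage1(const csr_matrix& A, const csr_matrix& B,
                             multiply_info_t& info) {
    assert(A.ncols == B.nrows);
    info.row_nnz.assign(A.nrows, 0);
    std::vector<int> marker(B.ncols, -1);  // last row of C that touched column j
    for (int i = 0; i < A.nrows; ++i) {
        for (int ka = A.rowptr[i]; ka < A.rowptr[i + 1]; ++ka) {
            const int k = A.colind[ka];
            for (int kb = B.rowptr[k]; kb < B.rowptr[k + 1]; ++kb) {
                const int j = B.colind[kb];
                if (marker[j] != i) {
                    marker[j] = i;
                    ++info.row_nnz[i];
                }
            }
        }
    }
    info.counted = true;
}

// Stage 2 (numeric): size C from the counts stored in `info`, then fill it.
// Column indices within a row of C are left in discovery order, not sorted.
void multiply_execute_stage2(const csr_matrix& A, const csr_matrix& B,
                             const multiply_info_t& info, csr_matrix& C) {
    assert(info.counted);
    C.nrows = A.nrows;
    C.ncols = B.ncols;
    C.rowptr.assign(A.nrows + 1, 0);
    for (int i = 0; i < A.nrows; ++i) {
        C.rowptr[i + 1] = C.rowptr[i] + info.row_nnz[i];
    }
    C.colind.assign(C.rowptr[A.nrows], 0);
    C.values.assign(C.rowptr[A.nrows], 0.0);
    std::vector<int> pos(B.ncols, -1);     // slot of column j in the current row of C
    for (int i = 0; i < A.nrows; ++i) {
        int next = C.rowptr[i];
        for (int ka = A.rowptr[i]; ka < A.rowptr[i + 1]; ++ka) {
            const int k = A.colind[ka];
            const double a = A.values[ka];
            for (int kb = B.rowptr[k]; kb < B.rowptr[k + 1]; ++kb) {
                const int j = B.colind[kb];
                if (pos[j] < C.rowptr[i]) {  // column j not yet seen in row i
                    pos[j] = next;
                    C.colind[next] = j;
                    ++next;
                }
                C.values[pos[j]] += a * B.values[kb];
            }
        }
    }
}
```

The point of the split is that the info object, not the matrices, owns the intermediate state: the caller can inspect row_nnz between stages, for example to allocate storage for C itself.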

Does this idea make sense? Does anyone see any usage issues? Is it too ugly? I worry that we will have too many overloads if A and an optional A_info are passed as separate inputs. This approach also lets us distinguish between matrix inputs with their info and the operational info data.
