info type discussions for mem-based optimizations

https://github.com/SparseBLAS/spblas-reference/blob/093068008c1f3e031fb43133d1a66051aa04b80a/notes/spmv.hpp#L10-L24

I like this idea of having an info type that is directly associated with some matrix structure and is filled with zero or more inspection-based optimizations (which means it houses "stateful + read-only" optimizations). I wonder whether our multiply functions could take a hybrid `matrix_obj` object that consists of either a `matrix_view` alone or a `matrix_view` plus an associated `matrix_info_t`, used something like the following snippet:

```c++
csr_view<T,I,O> A(...);
matrix_info_t A_info(...);
multiply_inspect(matrix_obj{A, A_info}, descriptor, x, y, /*backend stuff*/);
multiply_execute(matrix_obj{A, A_info}, descriptor, x, y, /*backend stuff*/);
```

Or we might skip the inspection entirely, at the cost of lower performance:

```c++
csr_view<T,I,O> A(...);
multiply_execute(matrix_obj{A}, descriptor, x, y, /*backend stuff*/);
```
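To make the hybrid object concrete, here is a minimal C++17 sketch of how `matrix_obj` could carry a view with an optional info pointer, so that one overload of each function covers both the inspected and uninspected cases. Everything in it is an assumption for illustration: the fields of `csr_view` and `matrix_info_t`, the `inspected` flag, the `has_info()` helper, and the `bool` return value are hypothetical, and the descriptor and backend arguments are left out.

```c++
#include <cstddef>
#include <vector>

// Hypothetical stand-in: what inspection would store is not specified here.
struct matrix_info_t {
    bool inspected = false;  // set by multiply_inspect
};

// Hypothetical, simplified CSR view (non-owning).
template <class T>
struct csr_view {
    std::size_t nrows, ncols;
    const std::size_t* rowptr;
    const std::size_t* colind;
    const T* values;
};

// Hybrid object: a view plus an optional (possibly null) info pointer.
// Class template argument deduction makes matrix_obj{A} and
// matrix_obj{A, A_info} both work without spelling out the view type.
template <class View>
struct matrix_obj {
    View view;
    matrix_info_t* info = nullptr;

    matrix_obj(View v) : view(v) {}
    matrix_obj(View v, matrix_info_t& i) : view(v), info(&i) {}

    bool has_info() const { return info != nullptr; }
};

// Inspection fills the info object if one is attached; otherwise a no-op.
template <class View, class T>
void multiply_inspect(matrix_obj<View> A, const std::vector<T>&, std::vector<T>&) {
    if (A.has_info()) A.info->inspected = true;
}

// y = A * x. Returns whether an inspected info object was available, which
// is where a real backend would branch to an optimized kernel.
template <class View, class T>
bool multiply_execute(matrix_obj<View> A, const std::vector<T>& x, std::vector<T>& y) {
    const bool optimized = A.has_info() && A.info->inspected;
    const View& a = A.view;
    for (std::size_t i = 0; i < a.nrows; ++i) {
        T sum{};
        for (std::size_t k = a.rowptr[i]; k < a.rowptr[i + 1]; ++k)
            sum += a.values[k] * x[a.colind[k]];
        y[i] = sum;
    }
    return optimized;
}
```

With this shape, adding an info object to an input does not add an overload: the function signature stays the same and only the `matrix_obj` construction at the call site changes.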


The benefit shows up when we look at the sparse * sparse operation: we could have an `A_info` and a `B_info` that hold useful read-only stateful data about A and B while creating C, and then a separate `multistage_info_t` (written as `multiply_info_t` in the snippet below) that is particular to the multi-stage operation and holds stateful read/write data.

```c++
csr_view<T,I,O> A(...);
matrix_info_t A_info(...);

csr_view<T,I,O> B(...);
matrix_info_t B_info(...);

csr_view<T,I,O> C(...);

multiply_info_t mult_info; // C = A * B^T (note: `mult_info()` would declare a function)

multiply_inspect(matrix_obj{C}, matrix_obj{A, A_info}, transpose(matrix_obj{B, B_info}), desc, /*backend stuff*/); // fills A_info and/or B_info
multiply_execute_stage1(matrix_obj{C}, matrix_obj{A, A_info}, transpose(matrix_obj{B, B_info}), mult_info, /*backend stuff*/); // fills mult_info and C
multiply_execute_stage2(matrix_obj{C}, matrix_obj{A, A_info}, transpose(matrix_obj{B, B_info}), mult_info, /*backend stuff*/); // fills mult_info and C
multiply_execute_stage3(matrix_obj{C}, matrix_obj{A, A_info}, transpose(matrix_obj{B, B_info}), mult_info, /*backend stuff*/); // fills mult_info and C
```

`mult_info` would house the stateful read/write optimizations that belong to the multi-stage multiply process, while `A_info` and `B_info` would hold the stateful read-only optimizations about A and/or B.
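One way to express that split in the signatures is through constness: the execute stages take the matrix info by `const` reference and the operation info by mutable reference. The sketch below is only an illustration under assumptions. The fields `row_nnz`, `stage`, and `c_nnz_bound` are hypothetical, the functions take bare row pointers and info objects instead of `matrix_obj`, and stage 1 computes a deliberately crude upper bound on nnz(C) for C = A * B (each row of C has at most nnz(A row) * max row nnz of B entries).

```c++
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical per-matrix info: written once by inspect, read-only afterwards.
struct matrix_info_t {
    std::vector<std::size_t> row_nnz;
};

// Hypothetical per-operation info: read and written by every stage.
struct multiply_info_t {
    int stage = 0;
    std::size_t c_nnz_bound = 0;
};

// Inspection is the only place that mutates a matrix_info_t.
void multiply_inspect(matrix_info_t& info, const std::vector<std::size_t>& rowptr) {
    info.row_nnz.clear();
    for (std::size_t i = 0; i + 1 < rowptr.size(); ++i)
        info.row_nnz.push_back(rowptr[i + 1] - rowptr[i]);
}

// Stage 1: matrix infos are const (read-only), mult_info is mutable.
void multiply_execute_stage1(const matrix_info_t& A_info,
                             const matrix_info_t& B_info,
                             multiply_info_t& mult_info) {
    std::size_t max_b = 0;
    for (std::size_t n : B_info.row_nnz) max_b = std::max(max_b, n);

    std::size_t bound = 0;
    for (std::size_t n : A_info.row_nnz) bound += n * max_b;

    mult_info.c_nnz_bound = bound;  // later stages could size C from this
    mult_info.stage = 1;
}
```

A side effect of this layout is that the same `A_info` can be shared by several concurrent operations after inspection, since only the per-operation object is written during execution.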

Does this idea make sense? Does anyone see any usability issues? Is it too ugly? I worry that we will end up with too many overloads if A and an optional `A_info` are passed as separate inputs. Bundling them also lets us distinguish between matrix inputs with their info and the operation-level info data.





	operation_info_t info;

	device_policy policy;

	multiply_inspect(info, policy, a, x, y);
	multiply_inspect(info, policy, transposed(a), x, y);

	// Allocate more memory for y based on `info`

	while (/* ... */) {
		multiply_execute(info, policy, a, x, y);
		// do something with y, update x...
		multiply_execute(info, policy, transposed(a), y, x);
		// Maybe do some more stuff...
	}
