Feat transpose multiindex#90
Conversation
Documentation build overview
81 files changed ·
|
75ede5a to
e2fd7e8
Compare
|
Thanks for the PRs! I don't think I'll be able to review them this week but hopefully the next one. I am also happy to have users contributing as it has been a while since I used it regularly so I am maintaining and making releases but it is a bit frozen. |
|
No problem, I am currently integrating this package into my project and had to make a fork with several adjustments to make it work for my use case, but I'd prefer to be in sync with upsteam :) There is no hurry for me, just thought it would be nice to contribute. |
OriolAbril
left a comment
There was a problem hiding this comment.
Sorry for the extreme slowness. I added a comment on dim->coord pairing behaviour and suggested an extension to the test but all the other fixes should definitely go in without extra changes.
| if dims is None: | ||
| dims = _attempt_default_dims("matrix_transpose", da.dims) | ||
| dim1, dim2 = dims | ||
| return da.swap_dims({dim1: dim2, dim2: dim1}).transpose(..., *dims) |
There was a problem hiding this comment.
I do think that the function should work even when the dimensions to transpose have a multiindex, but I am not sure about the handling of coords. I would be very interested in having your feedback and usecases. I had only considered the possibility of using this function for square matrices where there is basically a repeated dimension so I had dim+dim2 or xyz+xyz_bis with same length and same coordinate values.
The function does already work on non multiindex but different length dimensions, and had I realized before, my intuition would have been to remove the coordinate values from the output in such cases. This is what sort does.
What do/would you use this changing of dim->coords pairings for?
For example, if I add coordiate values to one of the example datasets:
matrices = tutorial.generate_matrices_dataarray()
matrices = matrices.assign_coords({name: [f"{name}{i}" for i in range(matrices.sizes[name])] for name in matrices.dims})<xarray.DataArray (batch: 10, experiment: 3, dim: 4, dim2: 4)> Size: 4kB
0.4345 2.96 0.2627 0.4519 0.5507 0.1642 ... 0.2144 0.9432 0.5874 0.5385 2.987
Coordinates:
* batch (batch) <U6 240B 'batch0' 'batch1' ... 'batch8' 'batch9'
* experiment (experiment) <U11 132B 'experiment0' 'experiment1' 'experiment2'
* dim (dim) <U4 64B 'dim0' 'dim1' 'dim2' 'dim3'
* dim2 (dim2) <U5 80B 'dim20' 'dim21' 'dim22' 'dim23'
and after matrix_transpose (no multiindex, I don't think it is relevant in this particular case):
matrix_transpose(matrices, dims=["batch", "experiment"])<xarray.DataArray (dim: 4, dim2: 4, batch: 3, experiment: 10)> Size: 4kB
0.4345 0.4027 3.489 0.3505 0.2236 0.744 ... 0.1844 3.124 0.4926 0.255 2.987
Coordinates:
* dim (dim) <U4 64B 'dim0' 'dim1' 'dim2' 'dim3'
* dim2 (dim2) <U5 80B 'dim20' 'dim21' 'dim22' 'dim23'
* batch (batch) <U11 132B 'experiment0' 'experiment1' 'experiment2'
* experiment (experiment) <U6 240B 'batch0' 'batch1' ... 'batch8' 'batch9'
Having the batch dimension now hold the experiment coordinates seems strange, independently of them having a multiindex. I was also curious about the constraint of both multiindexes having the same levels, I guess this is so the levels can also be renamed, any thoughts on swapping only the top multiindex name and not the levels?
There was a problem hiding this comment.
Thanks for the review, and no worries on the timing! I've added a new test_supermatrix_multiindex_transpose that demonstrates the use case I had in mind.
The context where I ran into this is when constructing a supermatrix from smaller matrices. For example, given:
- A with dims
(i, j) - B with dims
(k, l)
We can formS = A ⊗ B^Tvia an outer productA * Bfollowed bystack(row=("i", "l"), col=("j", "k")). This gives us a single large square matrix with two multi-index dimensionsrowandcol. Once we have S, we naturally want to compute things like S^T or S^T S — which is where matrix_transpose comes in.
Later, I want to decompose the result of some computation back into the original (i, j) dims. Tbh, I have not considered what happens in non-square cases, just followed what the existing code seemed to handle. To my mind, the whole reason behind using this library is to keep using pure xarray syntax and keeping the clear declarative code in linalg scripts.
Manipulating dimensions like this is kind of wonky in xarray, and depending what a person wants to achieve, requires different sequences of renames/dropping vars etc. What I would expect of an xarray related linalg.transpose is to offer a transposition in the linear algebra sense, meaning that a vector from one space (with its base/coordinates) is swapped with another space (so essentially, their coordinates are swapped).
From ML standpoint, the batch/experiment swapping seems weird, but the transpose is for linear algebra and this choice of keeping coordinates (vector space basis) seems appropriate to me.
There was a problem hiding this comment.
As for non-square matrices, I would say the same linear algebra argument applies (moving between vector spaces). However, I did not deal with such cases much so I won't insist on it due to lack of familiarity.
There was a problem hiding this comment.
Another case for not dropping coordinates is that doing so is fairly straightforward in xarray and can be easily done explicitly (just chain a drop_vars), while restoring them after an implicit drop is more involved/cumbersome, since you need to extract them from the prior object and then assign again, so 2 steps vs 1.
There was a problem hiding this comment.
As for the final question, I do feel like swapping just the MultiIndex name would be confusing, since you would rename it but get an actually different beast, which breaks semantics.
My algebraic intuition, as outlined, is that we swap two spaces, which are composite spaces:
A=U ⊗ V and B = W ⊗ Y. Should we skip swapping the subspaces, you would swap the definitions of your spaces and get B = U ⊗ V and A = W ⊗ Y, which to my mind breaks the semantics. Even outside of algebra, a stacked coordinate would change its meaning and I don't think this function should meddle with semantics imposed by the user who defined the stacked coordinate and attributed some meaning to a particular way of stacking.
805e7ed to
7534528
Compare
7534528 to
ec0a803
Compare
|
Sorry for initially not replying in full to your comment, I had to ponder on it myself. Hope my response clarifies my reasoning 🙌 |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #90 +/- ##
==========================================
- Coverage 91.46% 91.33% -0.14%
==========================================
Files 6 6
Lines 844 854 +10
==========================================
+ Hits 772 780 +8
- Misses 72 74 +2 ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
OriolAbril
left a comment
There was a problem hiding this comment.
I doubt it will happen but if someone complains we can revisit. For square matrices with multiindex I was mostly convinced beforehand, for the rest I am also convinced enough.
|
Awesome, thank you! Are you planning to make a release in the near future? |
Yes, probably at some point next week. Hopefully after also properly reviewing the dask usage issue |
Thank you for the amazing package! I was happy to see the linalg operations magically working xarray style with minimal effort, even with stacked dimensions! Unfortunately that was not the case for
matrix_transpose, so here is a fix that makes it not error out when the dimensions being swapped both use aMultiIndex.Additionally, the method was the only one, where
dimswere acutally not optional, so I fixed this.Also fixed a small annoyance where
pinvandmatmulwere not members of the accessor.