Skip to content

Matrix compression for federated learning broadcasts#2524

Open
nirvanjhurreepriv wants to merge 16 commits into
apache:mainfrom
nirvanjhurreepriv:feature/compression
Open

Matrix compression for federated learning broadcasts#2524
nirvanjhurreepriv wants to merge 16 commits into
apache:mainfrom
nirvanjhurreepriv:feature/compression

Conversation

@nirvanjhurreepriv

Copy link
Copy Markdown

Authors (for submission)

  • Charansurya Udaysingh Jhurree, 488213
  • Deniz Durmusoglu, 530387
  • Yagiz Bartu Arslan, 529386

Summary

Implements matrix compression techniques to reduce communication overhead in SystemDS's federated learning backend.

Changes

  • New package: org.apache.sysds.runtime.compress
    • TopK Sparsification (up to 66x compression)
    • Probabilistic Quantization (4-16x compression, unbiased)
    • CompressionConfig, CompressionFactory, MatrixCompressor interface
  • Integration into FederationMap.broadcast() with safe fallback
  • 19 unit tests, all passing

Testing

mvn test -Dtest=TopKCompressorTest,ProbabilisticQuantizationCompressorTest

@ywcb00 ywcb00 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much for the PR. :)
I added some comments inline. Please have a look at them and try to make the corresponding modifications in the code.
Also, please make sure that the other GitHub workflows succeed, especially the Java Tests and the Coding Style workflows.
Thank you and all the best,
David

Comment on lines +5 to +6
*
* @author Nirvan C. UdaysinghJhurree

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Author information is already stored by git. No need to add it explicitly to the function header.

Comment on lines +5 to +6
*
* @author Nirvan C. Udaysingh Jhurree

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove author information.

Comment on lines +23 to +24
*
*

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove empty lines.

Comment on lines +9 to +10
*
*

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove empty lines.

Comment on lines +16 to +17
*
*

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove empty lines.

*
*
*/
public class ProbabilisticQuantizationCompressorTest {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can extend from the AutomatedTestBase class and leverage the utility functions in org.apache.sysds.test.TestUtils.

[This applies to both test classes that are added with this PR]


// Normalize to [0, 1]
double normalized = (value - min) / (max - min);
normalized = Math.max(0.0, Math.min(1.0, normalized)); // Clamp

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this line? Can the value be smaller than zero or larger than 1 after normalizing it to [0, 1]?

// Find bounding level indices
double scaled = normalized * (levels - 1);
int lowerIdx = (int) scaled;
int upperIdx = Math.min(lowerIdx + 1, levels - 1);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need the Math.min operation here?

Comment on lines +85 to +87
public org.apache.sysds.runtime.matrix.data.MatrixBlock decompress(CompressedMatrix compressed)
throws org.apache.sysds.runtime.compress.exceptions.DecompressionException {
return (org.apache.sysds.runtime.matrix.data.MatrixBlock) compressed.getCompressedData();

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Import the MatrixBlock and use it as MatrixBlock instead of writing the full path.

Comment on lines +52 to +54
case ONE_BIT_CS:
throw new UnsupportedOperationException(
"1-Bit Compressed Sensing not yet implemented");

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you still planning on implementing compressed sensing?
If yes, here is a reference that might be useful: https://doi.org/10.1109/CISS.2008.4558487
Otherwise, please remove this case.

@github-project-automation github-project-automation Bot moved this from In Progress to In Review in SystemDS PR Queue Jun 30, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: In Review

Development

Successfully merging this pull request may close these issues.

2 participants