Refactor RGF kernel for memory optimization and cleanup by AsymmetryChou · Pull Request #30 · deepmodeling/dpnegf

AsymmetryChou · 2026-06-18T12:29:58Z

This pull request focuses on improving memory management and efficiency in the recursive Green's function (RGF) calculations, particularly when running on CUDA devices. The main changes include more aggressive freeing of intermediate tensors during the RGF sweeps, the introduction of an automatic energy batch size selection based on available GPU memory, and user guidance for optimal CUDA allocator settings. These updates help reduce memory fragmentation and improve performance for large energy grids.

Memory management and efficiency improvements:

The RGF kernel (recursive_green_cal.py) now frees per-slot tensors (mat_d, mat_l, mat_u, gr_left) as soon as they are no longer needed, reducing GPU memory fragmentation and peak usage. gr_left is retained only when it is required for lesser/greater Green's function calculations.
The device property class (device_property.py) exposes a release_greenfuncs method to explicitly free Green's function storage and trigger CUDA cache cleanup between energy chunks.
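The per-slot freeing described above can be illustrated with a minimal sketch. This is not the code from recursive_green_cal.py: scalars stand in for the block tensors, only the left-connected forward sweep is shown, and the function name, the keep_gr_left flag, and the indexing convention are assumptions made for illustration. Only the slot names mat_d, mat_l, mat_u and gr_left come from the description.

```python
def left_connected_sweep(mat_d, mat_l, mat_u, keep_gr_left=False):
    """Forward RGF sweep over a block-tridiagonal A = E - H (1x1 blocks here).

    Assumed convention (hypothetical, not taken from the PR):
      mat_d[i] = A[i, i], mat_l[i] = A[i+1, i], mat_u[i] = A[i, i+1].
    The input lists are consumed in place: each slot is set to None
    as soon as the sweep no longer needs it.
    """
    n = len(mat_d)
    gr_left = [None] * n
    gr_left[0] = 1.0 / mat_d[0]
    mat_d[0] = None  # diagonal slot 0 is no longer needed
    for i in range(1, n):
        # Fold the already-solved left part onto slice i.
        sigma = mat_l[i - 1] * gr_left[i - 1] * mat_u[i - 1]
        gr_left[i] = 1.0 / (mat_d[i] - sigma)
        # Drop the references that were just consumed.
        mat_d[i] = None
        mat_l[i - 1] = None
        mat_u[i - 1] = None
        if not keep_gr_left:
            # Earlier left-connected blocks are only kept on request,
            # e.g. when a later lesser/greater calculation needs them.
            gr_left[i - 1] = None
    return gr_left


# Two-slice example: A = [[2, 1], [1, 3]]
d, l, u = [2.0, 3.0], [1.0], [1.0]
g = left_connected_sweep(d, l, u)
```

With real CUDA tensors in place of the scalars, dropping the last reference to a tensor returns its memory to PyTorch's caching allocator, so later slices can reuse it instead of raising the peak.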

Batch size and resource management:

Added an _auto_chunk_size method in the NEGF runner (NEGF.py) that picks an energy batch size from the available GPU memory, balancing performance against memory safety. negf_compute now falls back to this value when the user does not set a batch size.
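One way such a memory-based batch size could be derived is sketched below. This is an assumption-laden illustration, not the _auto_chunk_size implementation from NEGF.py: the helper names, the per-energy memory estimate, the number of working copies, and the 0.5 safety factor are all hypothetical. On a CUDA device the free-memory figure could come from torch.cuda.mem_get_info(), which returns free and total bytes; here it is passed in as a plain integer so the sketch runs without a GPU.

```python
def bytes_per_energy(block_sizes, itemsize=16, n_copies=3):
    """Rough memory footprint of one energy point (hypothetical estimate).

    Counts the diagonal blocks plus both off-diagonal couplings of a
    block-tridiagonal matrix, times an assumed number of working copies.
    itemsize=16 corresponds to complex128.
    """
    diag = sum(n * n for n in block_sizes)
    offdiag = sum(a * b for a, b in zip(block_sizes, block_sizes[1:]))
    return (diag + 2 * offdiag) * itemsize * n_copies


def auto_chunk_size(free_bytes, per_energy_bytes, n_energies, safety=0.5):
    """Largest batch that fits in a fraction of free memory, at least 1."""
    budget = int(free_bytes * safety)
    return max(1, min(budget // per_energy_bytes, n_energies))


def energy_chunks(n_energies, chunk):
    """Yield (start, stop) index pairs covering the energy grid."""
    for start in range(0, n_energies, chunk):
        yield start, min(start + chunk, n_energies)


per_e = bytes_per_energy([100, 100, 100])
chunk = auto_chunk_size(8 * 1024**3, per_e, n_energies=500)
ranges = list(energy_chunks(500, chunk))
```

Capping the result at the grid length and flooring it at 1 keeps the loop well defined both for small grids and for nearly full devices.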

User guidance for CUDA configuration:

The NEGF runner now warns users if the recommended expandable_segments CUDA allocator option is not set, which helps avoid memory fragmentation for long energy grids.
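A check of this kind might look like the sketch below. It assumes the option is read from the PYTORCH_CUDA_ALLOC_CONF environment variable, which PyTorch parses as comma-separated key:value pairs; the function names and the warning text are hypothetical and are not taken from the NEGF runner.

```python
import logging
import os

log = logging.getLogger(__name__)


def expandable_segments_enabled(environ=None):
    """Return True if expandable_segments:True appears in the allocator config."""
    environ = os.environ if environ is None else environ
    conf = environ.get("PYTORCH_CUDA_ALLOC_CONF", "")
    for item in conf.split(","):
        key, _, value = item.partition(":")
        if key.strip() == "expandable_segments":
            return value.strip().lower() == "true"
    return False


def warn_if_not_expandable(environ=None):
    """Log a warning when the option is missing; return True if a warning was issued."""
    if expandable_segments_enabled(environ):
        return False
    log.warning(
        "PYTORCH_CUDA_ALLOC_CONF does not enable expandable_segments; "
        "long energy grids may fragment GPU memory. Consider exporting "
        "PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True before launching."
    )
    return True
```

Allocator settings are typically read when PyTorch first initializes CUDA, so the variable is best exported in the shell before the run starts rather than set from inside the script.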

Minor optimizations and code cleanup:

Avoids unnecessary tensor copies for L/U blocks in the RGF kernel, saving memory.
Cleans up unused code in the current calculation and adjusts method signatures for clarity.

These changes collectively improve the scalability and robustness of NEGF calculations, especially for large systems and energy grids on CUDA-enabled hardware.

AsymmetryChou added 8 commits June 15, 2026 10:05

Refactor RGF kernel to reduce memory using

358a459

set need_gr_lc default to false

26955a2

add 'release_greenfuncs'

2d46fc9

remove dead slots

57a70c9

add _auto_chunk_size

5b883a7

remove unnecessary self-energy terms

245ebc8

add log warning for rgf_device

8dfb109

update batched rgf

7c1cc18

AsymmetryChou merged commit 560b47f into deepmodeling:main Jun 19, 2026
1 check passed

AsymmetryChou merged 8 commits into deepmodeling:main from AsymmetryChou:rgf_acc
