feat(pt): add embedding sub cli for pt models#5571
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Repository UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (5)
🚧 Files skipped from review as they are similar to previous changes (3)
📝 WalkthroughWalkthroughAdds a new ChangesEmbedding extraction feature
Sequence Diagram(s)sequenceDiagram
participant User
participant main_py as main.py / entrypoints/main.py
participant embedding_py as entrypoints/embedding.py
participant DeepEval as DeepEval (deep_eval.py)
participant PTDeepEval as PyTorch DeepEval (pt/infer/deep_eval.py)
participant Model as SeZMModel / DPModel
User->>main_py: dp embed -m model.pt -s ./data -o out.h5 --dtype fp32
main_py->>embedding_py: embedding(model, system, output, dtype="fp32")
loop per discovered system
embedding_py->>embedding_py: DeepmdData(system_path)
embedding_py->>DeepEval: eval_embedding(coords, cells, atype, dtype="fp32")
DeepEval->>DeepEval: _standard_input(coords, cells, atype, mixed_type)
DeepEval->>PTDeepEval: deep_eval.eval_embedding(coords, cells, atype, dtype="fp32")
PTDeepEval->>Model: model.forward_embedding(coord, atype, box)
Model-->>PTDeepEval: descriptor, atomic_feature, structural_feature
PTDeepEval-->>DeepEval: float32 numpy arrays
DeepEval-->>embedding_py: (descriptor, atomic_feature, structural_feature)
embedding_py->>embedding_py: write HDF5 group + gzip datasets + attributes
end
embedding_py-->>User: out.h5
Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 6
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@deepmd/entrypoints/embedding.py`:
- Around line 116-120: The datalist.read().splitlines() call in the datafile
parsing section keeps blank lines, which can result in empty strings being
passed to DeepmdData and causing failures. Filter out blank lines when reading
from the datafile by adding a filter condition that excludes empty or
whitespace-only entries from the all_sys list. This ensures only valid system
entries are processed while preserving the original structure of the code.
In `@deepmd/pt/infer/deep_eval.py`:
- Around line 1208-1260: The eval_embedding method directly calls
self.dp.model["Default"].forward_embedding(...) which bypasses the data modifier
that would normally be applied through ModelWrapper. Add a guard condition that
checks if self.modifier is not None before proceeding with the forward_embedding
call, and raise an appropriate error or fail fast if a modifier exists, since
modifier-aware embedding processing is not yet implemented for this code path.
In `@deepmd/pt/model/atomic_model/linear_atomic_model.py`:
- Around line 341-349: Add explicit `strict=False` parameter to both zip() calls
in the list comprehensions to satisfy Ruff B905. The first zip() call combines
self.get_model_rcuts() and self.get_model_nsels(), and the second zip() call
combines self.mixed_types_list, raw_nlists, and self.get_model_sels(). Update
both invocations to include the strict parameter to explicitly indicate that the
iterables may have different lengths.
In `@deepmd/pt/model/task/fitting.py`:
- Around line 855-868: The atomic_feature tensor allocation in the
return_atomic_feature block accesses self.neuron[-1] without checking if the
neuron list is empty. When neuron=[] is configured, this causes an IndexError.
Fix this by replacing the hardcoded self.neuron[-1] with a conditional check
that determines the correct feature dimension. If self.neuron is not empty, use
self.neuron[-1], otherwise determine the feature width from an alternative
source such as the output shape of ll.call_until_last(xx) or from the
descriptor's feature dimension to ensure the atomic_feature tensor is allocated
with the correct size regardless of neuron configuration.
In `@doc/inference/embedding.md`:
- Line 44: On line 44 in the embedding.md file, remove the shell prompt marker
`$ ` from the command to comply with markdownlint rule MD014. Change `$ dp embed
--help` to just `dp embed --help` to fix the lint violation while preserving the
command documentation.
In `@source/tests/pt/model/test_embedding.py`:
- Around line 260-271: The tolerance values (rtol=1e-10 and atol=1e-12) in both
np.testing.assert_allclose calls for eval_descriptor and eval_fitting_last_layer
are too strict for float32 comparisons across separate forward passes, causing
intermittent test failures. Increase both the relative tolerance (rtol) and
absolute tolerance (atol) parameters in both assert_allclose calls to more
reasonable values that account for floating-point precision variations while
still validating correctness.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: cf1c59a0-671a-4d28-8ec5-c902cadd310c
📒 Files selected for processing (30)
deepmd/entrypoints/embedding.pydeepmd/entrypoints/eval_desc.pydeepmd/entrypoints/main.pydeepmd/infer/deep_eval.pydeepmd/main.pydeepmd/pt/infer/deep_eval.pydeepmd/pt/model/atomic_model/base_atomic_model.pydeepmd/pt/model/atomic_model/dp_atomic_model.pydeepmd/pt/model/atomic_model/linear_atomic_model.pydeepmd/pt/model/atomic_model/sezm_atomic_model.pydeepmd/pt/model/descriptor/sezm_nn/dens.pydeepmd/pt/model/model/dp_model.pydeepmd/pt/model/model/make_model.pydeepmd/pt/model/model/sezm_model.pydeepmd/pt/model/task/dipole.pydeepmd/pt/model/task/ener.pydeepmd/pt/model/task/fitting.pydeepmd/pt/model/task/invar_fitting.pydeepmd/pt/model/task/polarizability.pydeepmd/pt/model/task/sezm_ener.pydoc/inference/embedding.mddoc/inference/index.rstdoc/inference/python.mddoc/model/dpa4.mddoc/test/test.mdsource/tests/infer/test_models.pysource/tests/pt/model/test_embedding.pysource/tests/pt/model/test_linear_atomic_model.pysource/tests/pt/test_eval_desc.pysource/tests/pt/test_oom_retry.py
💤 Files with no reviewable changes (4)
- source/tests/pt/test_oom_retry.py
- deepmd/pt/model/atomic_model/sezm_atomic_model.py
- deepmd/pt/model/model/dp_model.py
- deepmd/pt/model/descriptor/sezm_nn/dens.py
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #5571 +/- ##
==========================================
+ Coverage 82.14% 82.16% +0.01%
==========================================
Files 900 901 +1
Lines 104140 104297 +157
Branches 4472 4472
==========================================
+ Hits 85549 85696 +147
- Misses 17182 17190 +8
- Partials 1409 1411 +2 ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
njzjz-bot
left a comment
There was a problem hiding this comment.
Thanks for adding the embedding extraction path. I think this needs a few changes before merge, mainly around backend/API compatibility. The PR adds a PyTorch-only eval_embedding implementation, but the CLI/docs deprecate backend-generic descriptor APIs (dp eval-desc, DeepPot.eval_descriptor) in favor of dp embed / eval_embedding. That currently sends TensorFlow/Paddle users from a working descriptor path to an unsupported one. There is also no descriptor-only mode in the replacement command, so users who only need descriptors pay for fitting features and larger HDF5 output.
Reviewed by OpenClaw 2026.6.8 (844f405) (model: custom-chat-jinzhezeng-group/gpt-5.5).
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
deepmd/infer/deep_eval.py (1)
695-708:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winUpdate the dtype wording in
eval_embedding’s docstring.Line 707 still says all embeddings are returned as float32, but Lines 737-738 now allow
"fp64"and"native".Proposed docstring adjustment
- embeddings are returned as float32, which is ample for downstream - analysis. + embeddings are returned as float32 by default, which is ample for + downstream analysis. Pass ``dtype="fp64"`` or ``dtype="native"`` to + request a different output precision.Also applies to: 737-738
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@deepmd/infer/deep_eval.py` around lines 695 - 708, The docstring for the eval_embedding method on line 712-713 states that all three embeddings are returned as float32, but the implementation now supports configurable dtypes including "fp64" and "native" (as shown in the dtype parameter). Update the docstring statement that mentions float32 to reflect that the returned dtype depends on the dtype parameter passed to the function, removing the hardcoded reference to float32 and indicating that the embeddings are returned in the specified dtype format.deepmd/pt/infer/deep_eval.py (1)
1151-1164:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winUpdate the float32-only wording now that
dtypeis configurable.Line 1163 says all embeddings are returned as float32, but Lines 1192-1193 now document
"fp64"and"native"as valid output dtypes.Proposed docstring adjustment
- total energy. All three embeddings are returned as float32, which is - ample for downstream analysis. + total energy. All three embeddings are returned as float32 by default, + which is ample for downstream analysis. Pass ``dtype="fp64"`` or + ``dtype="native"`` to request a different output precision.Also applies to: 1192-1193
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@deepmd/pt/infer/deep_eval.py` around lines 1151 - 1164, The docstring for the evaluate_descriptor_atomic_feature_and_structural_feature method (or similar function name around line 1151-1164) contains outdated information stating that all three embeddings are always returned as float32. Since the function now includes a configurable dtype parameter that accepts "fp32", "fp64", and "native" options (as documented in lines 1192-1193), update the docstring to reflect that the output dtype depends on the dtype parameter passed to the function, rather than claiming they are exclusively float32. Remove or modify the sentence that incorrectly claims fixed float32 output and replace it with text indicating the dtype is configurable via the dtype parameter.
♻️ Duplicate comments (1)
source/tests/pt/model/test_embedding.py (1)
264-279:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winLoosen tolerances for float32 cross-call comparisons.
The tolerances
rtol=1e-10, atol=1e-12are tighter than float32 machine epsilon (~1e-7) and can cause flaky test failures across independent forward passes. Graph context shows float32 comparisons elsewhere in the codebase usertol=1e-4.Proposed fix
np.testing.assert_allclose( dp.eval_descriptor( self.coord_np, self.cell_np, self.atype_np, dtype="fp32" ), descriptor, - rtol=1e-10, - atol=1e-12, + rtol=1e-5, + atol=1e-6, ) np.testing.assert_allclose( dp.eval_fitting_last_layer( self.coord_np, self.cell_np, self.atype_np, dtype="fp32" ), atomic_feature, - rtol=1e-10, - atol=1e-12, + rtol=1e-5, + atol=1e-6, )🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@source/tests/pt/model/test_embedding.py` around lines 264 - 279, The tolerances rtol=1e-10, atol=1e-12 are too tight for float32 precision and cause flaky test failures. Update both assert_allclose calls for eval_descriptor and eval_fitting_last_layer to use rtol=1e-4 which aligns with float32 comparison standards used elsewhere in the codebase. Change the rtol parameter from 1e-10 to 1e-4 and the atol parameter from 1e-12 to a value appropriate for float32 (typically around 1e-6 or 1e-5) in both assertions.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@deepmd/infer/deep_eval.py`:
- Around line 695-708: The docstring for the eval_embedding method on line
712-713 states that all three embeddings are returned as float32, but the
implementation now supports configurable dtypes including "fp64" and "native"
(as shown in the dtype parameter). Update the docstring statement that mentions
float32 to reflect that the returned dtype depends on the dtype parameter passed
to the function, removing the hardcoded reference to float32 and indicating that
the embeddings are returned in the specified dtype format.
In `@deepmd/pt/infer/deep_eval.py`:
- Around line 1151-1164: The docstring for the
evaluate_descriptor_atomic_feature_and_structural_feature method (or similar
function name around line 1151-1164) contains outdated information stating that
all three embeddings are always returned as float32. Since the function now
includes a configurable dtype parameter that accepts "fp32", "fp64", and
"native" options (as documented in lines 1192-1193), update the docstring to
reflect that the output dtype depends on the dtype parameter passed to the
function, rather than claiming they are exclusively float32. Remove or modify
the sentence that incorrectly claims fixed float32 output and replace it with
text indicating the dtype is configurable via the dtype parameter.
---
Duplicate comments:
In `@source/tests/pt/model/test_embedding.py`:
- Around line 264-279: The tolerances rtol=1e-10, atol=1e-12 are too tight for
float32 precision and cause flaky test failures. Update both assert_allclose
calls for eval_descriptor and eval_fitting_last_layer to use rtol=1e-4 which
aligns with float32 comparison standards used elsewhere in the codebase. Change
the rtol parameter from 1e-10 to 1e-4 and the atol parameter from 1e-12 to a
value appropriate for float32 (typically around 1e-6 or 1e-5) in both
assertions.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: d11ec34e-9512-4db8-8896-57b97f9f5cc0
📒 Files selected for processing (9)
deepmd/entrypoints/embedding.pydeepmd/entrypoints/eval_desc.pydeepmd/infer/deep_eval.pydeepmd/main.pydeepmd/pt/infer/deep_eval.pydoc/inference/embedding.mddoc/inference/python.mddoc/test/test.mdsource/tests/pt/model/test_embedding.py
✅ Files skipped from review due to trivial changes (2)
- doc/test/test.md
- doc/inference/embedding.md
🚧 Files skipped from review as they are similar to previous changes (1)
- deepmd/entrypoints/embedding.py
iProzd
left a comment
There was a problem hiding this comment.
Thanks, this is a useful embedding extraction feature and the backend/docs compatibility issues look much better now.
I still think one back-compat issue should be fixed before merge. eval_descriptor() and eval_fitting_last_layer() are existing public APIs, but they now route through eval_embedding(), which requires the frozen TorchScript model to expose the newly exported forward_embedding() method. Models frozen before this PR cannot have that method, even though the old hook-based descriptor / fitting-last-layer path used to work for them.
The explicit NotImplementedError is fine for the new eval_embedding() API, but it is a regression for the existing descriptor APIs and for dp eval-desc. Could you keep a fallback to the old hook-based path when forward_embedding is unavailable, and add a regression test for a legacy-frozen / legacy-hook model path?
After that, LGTM.
|
I found one more backend-compatibility issue around the new
Please either keep This is separate from the already-requested legacy — OpenClaw 2026.6.8 (844f405), model: custom-chat-jinzhezeng-group/gpt-5.5 |
There was a problem hiding this comment.
Actionable comments posted: 7
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@deepmd/infer/deep_eval.py`:
- Around line 611-615: Remove the dtype parameter from being passed to
backend-specific method calls (eval_descriptor() and eval_fitting_last_layer()).
Instead of adding dtype to eval_kwargs that gets forwarded to the backend,
capture the returned NumPy array from these backend calls and perform the dtype
casting at the high-level wrapper level when dtype is not "native". This ensures
the backend-generic DeepEval wrapper handles dtype conversion independently
without assuming backend support for this parameter, fixing the TensorFlow
backend compatibility issue where eval_descriptor() and
eval_fitting_last_layer() do not accept a dtype argument.
- Around line 704-708: The docstring text in the deep_eval.py file claims that
all three embeddings are returned as float32, but this contradicts the actual
method behavior which accepts dtype parameters like "fp64" and "native". Update
the documentation text to accurately reflect that embeddings are returned
according to the dtype parameter specified in the method, rather than always
being float32.
In `@deepmd/main.py`:
- Around line 495-500: The `--dtype` argument with choices fp32, fp64, and
native is exposed unconditionally without backend-aware validation, causing
TensorFlow users to encounter runtime errors instead of early CLI feedback.
After argument parsing in the eval-desc command, add a validation check that
verifies the selected dtype is compatible with the active backend. If the
backend is TensorFlow and dtype is anything other than native, raise an error
with a clear message explaining that non-native dtypes are only supported on
PyTorch. This ensures the command fails early with helpful guidance rather than
at runtime.
In `@deepmd/pt/model/atomic_model/linear_atomic_model.py`:
- Around line 311-368: The forward_embedding method currently returns the
embedding from the first sub-model that supports embeddings without verifying
that exactly one sub-model has embedding capability, which creates ambiguity
when multiple sub-models support embeddings. Add a check in the
forward_embedding method to count how many sub-models in self.models have
embeddings (by iterating through and counting calls to model.has_embedding()),
and raise a NotImplementedError or ValueError if the count is greater than one,
ensuring only the DP+ZBL contract (exactly one embedding-capable sub-model) is
accepted. Keep the existing behavior of returning from the first
embedding-capable model, but only after verifying there is exactly one.
In `@deepmd/pt/model/model/sezm_model.py`:
- Around line 879-887: The file contains 36 F722 lint errors from Ruff related
to jaxtyping annotations across multiple methods. To fix this, add `# noqa:
F722` suppression comments to all type-annotated parameter lines in the affected
methods: forward_embedding, forward, forward_common, forward_lower,
build_neighbor_list, and build_extended_neighbor_list. Apply the suppression
consistently to each line containing jaxtyping annotations (Float, Int, Tensor
types) across all six methods to ensure Ruff passes validation before CI runs.
- Around line 921-925: Add a training mode guard in the `forward_embedding()`
method to prevent incorrect embedding cache reuse. After the existing mode check
that validates `self.get_active_mode() != "ener"`, add another guard that raises
NotImplementedError when `self.training` is True, since the code comment
indicates this is an eval-only graph and the compiled_embedding cache does not
account for training mode differences like the regular cache does (which keys by
bool(self.training)). This ensures training-mode descriptor graphs are not
cached and later reused incorrectly during evaluation.
In `@source/tests/pt/model/test_embedding.py`:
- Line 7: The file uses unittest.mock.patch.dict, unittest.mock.Mock, and
unittest.mock.patch.object throughout the test file but only imports the base
unittest module without explicitly importing the mock submodule. Add an explicit
import statement for unittest.mock alongside the existing unittest import at the
top of the file to ensure the mock submodule is available when these mock
utilities are used.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 36655b57-8fe8-49cf-8dda-b08f1e539402
📒 Files selected for processing (29)
deepmd/entrypoints/embedding.pydeepmd/entrypoints/eval_desc.pydeepmd/entrypoints/main.pydeepmd/infer/deep_eval.pydeepmd/main.pydeepmd/pt/infer/deep_eval.pydeepmd/pt/model/atomic_model/base_atomic_model.pydeepmd/pt/model/atomic_model/dp_atomic_model.pydeepmd/pt/model/atomic_model/linear_atomic_model.pydeepmd/pt/model/atomic_model/sezm_atomic_model.pydeepmd/pt/model/descriptor/sezm_nn/dens.pydeepmd/pt/model/model/dp_model.pydeepmd/pt/model/model/make_model.pydeepmd/pt/model/model/sezm_model.pydeepmd/pt/model/task/dipole.pydeepmd/pt/model/task/ener.pydeepmd/pt/model/task/fitting.pydeepmd/pt/model/task/invar_fitting.pydeepmd/pt/model/task/polarizability.pydeepmd/pt/model/task/sezm_ener.pydoc/inference/embedding.mddoc/inference/index.rstdoc/inference/python.mddoc/model/dpa4.mddoc/test/test.mdsource/tests/pt/model/test_embedding.pysource/tests/pt/model/test_linear_atomic_model.pysource/tests/pt/test_eval_desc.pysource/tests/pt/test_oom_retry.py
💤 Files with no reviewable changes (4)
- deepmd/pt/model/model/dp_model.py
- deepmd/pt/model/atomic_model/sezm_atomic_model.py
- source/tests/pt/test_oom_retry.py
- deepmd/pt/model/descriptor/sezm_nn/dens.py
✅ Files skipped from review due to trivial changes (6)
- doc/inference/index.rst
- doc/inference/python.md
- source/tests/pt/test_eval_desc.py
- doc/test/test.md
- doc/model/dpa4.md
- doc/inference/embedding.md
🚧 Files skipped from review as they are similar to previous changes (12)
- source/tests/pt/model/test_linear_atomic_model.py
- deepmd/entrypoints/main.py
- deepmd/pt/model/task/invar_fitting.py
- deepmd/entrypoints/eval_desc.py
- deepmd/pt/model/task/polarizability.py
- deepmd/pt/model/task/ener.py
- deepmd/pt/model/task/dipole.py
- deepmd/pt/model/atomic_model/base_atomic_model.py
- deepmd/pt/model/model/make_model.py
- deepmd/pt/model/task/sezm_ener.py
- deepmd/pt/infer/deep_eval.py
- deepmd/pt/model/task/fitting.py
|
Closes #5507 |
njzjz-bot
left a comment
There was a problem hiding this comment.
Thanks for the update. I re-reviewed the compatibility fixes and this looks good to me now.
The two blockers I had are addressed:
eval_descriptor()/eval_fitting_last_layer()now fall back to the legacy TorchScript hook path whenforward_embeddingis absent, so old frozen.pthmodels should keep working.dp eval-desc --dtypeis now handled by the backend-agnostic high-level wrapper, so non-PyTorch backends keep using their existingeval_descriptor()signatures and only the returned NumPy array is cast.
I also checked git diff --check, CI is green, and the touched Python files compile locally with py_compile in this checkout. I don't see remaining blockers from my side.
— OpenClaw 2026.6.8 (844f405), model: custom-chat-jinzhezeng-group/gpt-5.5
Fixed #5507
Summary by CodeRabbit
dp embedto exportdescriptor,atomic_feature, andstructural_featureinto a single HDF5 file with per-system groups.DeepPot.eval_embedding()with selectable output precision (fp32,fp64,native).--head/--model-branch).