add DPA-ADAPT toolkit for downstream property adaptation#5572
add DPA-ADAPT toolkit for downstream property adaptation#5572zhaiwenxi wants to merge 164 commits into
Conversation
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
feat: add DeePMD property tools
for more information, see https://pre-commit.ci
Add property tools
dpa_tools merge
…t, unify --target-key
…utput parsing - DPAFineTuner: extract _FrozenSklearnPipeline helper; keep public API unchanged - MFTFineTuner: defer _read_fitting_net_from_ckpt to first access - DPATrainer._parse_test_output: single anchored regex per metric, auto-detect format
…perty metrics - _load_labels: accept str | list[str], stack columns for multi-property - build_sklearn_head: n_outputs param, wrap RF/Ridge with MultiOutputRegressor - evaluate: per-property mae/rmse/r2 dict when target_key is a list - freeze/DPAPredictor: store and load target_key as-is (str or list) - CLI: --target-key homo,lumo parsed via _maybe_split_list - 6 new tests covering fit, evaluate, freeze/load round-trip
The old _load_descriptor_model, _validate_type_map, _remap_atom_types, _extract_features_cached, and _extract_features method bodies were left in place alongside the new thin delegators, causing CodeQL 'variable defined multiple times' warnings. Removed the old bodies; kept _extract_features_cached on DPAFineTuner directly so that test patches on DPAFineTuner._extract_features are honoured through the cache wrapper.
… method - Replace try/except ImportError in _unwrap_multioutput with direct import (sklearn is always available when dpa_tools is loaded) - Remove _FrozenSklearnPipeline.extract_features_cached (dead code; the caching wrapper lives on DPAFineTuner so test patches work)
The workflow still referenced the deleted deepmd_property_tools/ directory. Updated paths trigger to deepmd/dpa_tools/** and test command to source/tests/dpa_tools/. Added torch to lightweight dependencies.
numpy 2.3+ requires Python>=3.11, but the property_tools_tests workflow runs on Python 3.10. Pin numpy>=1.21,<2.2 to keep the lightweight dependency install working on older Python.
refactor: unify dpa_tools CLI/API and merge deepmd_property_tools
for more information, see https://pre-commit.ci
There was a problem hiding this comment.
CodeQL found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #5572 +/- ##
==========================================
- Coverage 82.14% 82.14% -0.01%
==========================================
Files 900 901 +1
Lines 104139 104139
Branches 4471 4473 +2
==========================================
- Hits 85550 85547 -3
- Misses 17178 17181 +3
Partials 1411 1411 ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
|
Linking the property-workflow issues that this PR covers under the updated DPA-ADAPT command surface:
Command-name update: this PR implements the workflow as standalone Authored by OpenClaw (model: custom-chat-jinzhezeng-group/gpt-5.5) |
njzjz-bot
left a comment
There was a problem hiding this comment.
Thanks for putting this together. I think this needs another revision before merge: there are a few correctness issues in the DPA-ADAPT code path, and the example/test material should be trimmed and made portable. I left inline comments on the specific blockers.
Authored by OpenClaw (model: custom-chat-jinzhezeng-group/gpt-5.5)
| self._sklearn._model = self._model | ||
| if self._device is not None: | ||
| self._sklearn._device = self._device | ||
| self._sklearn._checkpoint_type_map = self._checkpoint_type_map |
There was a problem hiding this comment.
This sync overwrites the pipeline's checkpoint type_map with the parent object's initial []. _FrozenSklearnPipeline.load_descriptor_model() sets self._checkpoint_type_map from the checkpoint, but the parent DPAFineTuner._checkpoint_type_map is never updated, so the next _ensure_sklearn() call clears it again. That disables unsupported-element validation and local-to-checkpoint atom-type remapping; for non-prefix type maps, descriptors can be computed with wrong atom-type indices. Please either sync the loaded value back to the parent or avoid overwriting the pipeline value after it is loaded.
Authored by OpenClaw (model: custom-chat-jinzhezeng-group/gpt-5.5)
| def _per_system_cache_path(system) -> Path: | ||
| """Return the cache path for a single system's descriptors.""" | ||
| fp = _system_fingerprint(system) | ||
| return _cache_dir() / f"{fp}.npy" |
There was a problem hiding this comment.
The per-system descriptor cache key only depends on the input system fingerprint, but ensure_per_system_cache() also takes pretrained, model_branch, and pooling. A cache file generated with one checkpoint/branch/pooling will be silently reused for another, which can train/evaluate on stale descriptors. Please include the resolved checkpoint identity/mtime, branch, and pooling in the per-system key; the bulk _cache_key() above should also include model_branch.
Authored by OpenClaw (model: custom-chat-jinzhezeng-group/gpt-5.5)
| stderr=subprocess.STDOUT, | ||
| text=True, | ||
| bufsize=1, | ||
| cwd=self.output_dir, |
There was a problem hiding this comment.
Running dp with cwd=self.output_dir breaks the default relative output_dir. Just above, input_json is built as ./dpa_output/mft_input.json (or similar); after changing cwd into ./dpa_output, the command now looks for ./dpa_output/dpa_output/mft_input.json. Relative train/aux paths embedded in the generated config have the same issue. Please use absolute paths in the config/command or run from the original working directory.
Authored by OpenClaw (model: custom-chat-jinzhezeng-group/gpt-5.5)
| # ADAPT example | ||
|
|
||
| This directory contains a small ready-to-run example for `dpa_adapt`. | ||
| The example uses 50 pre-processed QM9 molecules to fine-tune and evaluate a |
There was a problem hiding this comment.
This example currently commits 50 preprocessed QM9 systems (252 files, about 1.5 MB) under examples/dpa_adapt/data. That feels too large and noisy for a repository example, especially since prepare_data.py can regenerate data. Please reduce the checked-in dataset to the minimal number of tiny systems needed to demonstrate the commands (or keep only generated-on-demand data), and leave larger QM9 regeneration to the script/docs.
Authored by OpenClaw (model: custom-chat-jinzhezeng-group/gpt-5.5)
| import numpy as np | ||
|
|
||
| # ── paths ────────────────────────────────────────────────────────────────── | ||
| DEMO_DIR = Path("/home/ziren/aisi-intern/deepmd-kit/examples/dpa_adapt/data") |
There was a problem hiding this comment.
This test is not portable: it hard-codes a local /home/ziren/... checkout and, below, a local pretrained checkpoint path. It will fail for anyone else running the repository tests locally and should not be merged as-is. Please move this under source/tests/dpa_adapt/ and build paths from the repository root / tmp_path, with any real checkpoint-dependent coverage skipped or mocked.
Authored by OpenClaw (model: custom-chat-jinzhezeng-group/gpt-5.5)
| - **implements the Deep Potential series models**, which have been successfully applied to finite and extended systems, including organic molecules, metals, semiconductors, insulators, etc. | ||
| - **implements MPI and GPU supports**, making it highly efficient for high-performance parallel and distributed computing. | ||
| - **highly modularized**, easy to adapt to different descriptors for deep learning-based potential energy models. | ||
| - **fine-tunes pre-trained DPA models through a scikit-learn-style Python API**, via [`dpa_adapt`](dpa_adapt/README.md) — construct a `DPAFineTuner`, then `fit` and `predict` to adapt a large pre-trained model to your own property dataset, with no input files to write. |
There was a problem hiding this comment.
This link points to dpa_adapt/README.md, but this PR does not add that file (the README lives under doc/dpa_adapt/README.md). As written, the top-level README will contain a broken link. Please either add the package README or link to the documentation path that actually exists.
Authored by OpenClaw (model: custom-chat-jinzhezeng-group/gpt-5.5)
|
|
||
| # TODO: replace with dedicated DescriptorExtractor class after refactor. | ||
| # For now, DPAFineTuner is reused purely as a descriptor feature extractor. | ||
| self._extractor = DPAFineTuner( |
There was a problem hiding this comment.
The frozen-model predictor loads the saved type_map into self._type_map, but the descriptor extractor is constructed without that map. _extract_and_condition() validates against self._type_map, then _extract_features() uses the extractor's own empty/default type map state. For data without type_map.raw, this can compute descriptors with the wrong checkpoint atom-type indices. Please pass/sync the saved type map into the extractor before validation/extraction.
Authored by OpenClaw (model: custom-chat-jinzhezeng-group/gpt-5.5)
| t.train_data if isinstance(t.train_data, list) else [t.train_data] | ||
| ) | ||
|
|
||
| training = { |
There was a problem hiding this comment.
MFTFineTuner.fit(..., valid_data=...) stores valid_data, but the generated MFT config never emits a validation_data block for either branch. As a result, callers who provide validation data silently train without validation. Please either wire valid_data into the config or reject it explicitly.
Authored by OpenClaw (model: custom-chat-jinzhezeng-group/gpt-5.5)
| ) | ||
| # Paper default 0.5/0.5; aux_prob (default 0.5) controls the split, the | ||
| # downstream share is the complement. Legacy keeps downstream at 1.0. | ||
| downstream_prob = (1.0 - t.aux_prob) if is_property else 1.0 |
There was a problem hiding this comment.
aux_prob is not range-validated before using 1.0 - t.aux_prob. Values outside [0, 1] produce negative model sampling probabilities (for example aux_prob=1.2 gives downstream -0.2), which will fail later or train with invalid branch weights. Please validate this in the tuner constructor before building the DeepMD input.
Authored by OpenClaw (model: custom-chat-jinzhezeng-group/gpt-5.5)
| ) | ||
| return str(latest) | ||
|
|
||
| if self.fparam_dim > 0: |
There was a problem hiding this comment.
When fparam_dim > 0, this validates only the training systems. Validation systems can still be missing set.*/fparam.npy or have a different fparam width, so dp --pt train will fail later or validate with inconsistent feature dimensions. Please validate valid_systems with the same fparam_dim before writing/running the config.
Authored by OpenClaw (model: custom-chat-jinzhezeng-group/gpt-5.5)
| self._condition_manager = None | ||
| if self.fparam_dim > 0: | ||
| conditions = _read_fparam_from_systems(systems) | ||
| if conditions is not None: |
There was a problem hiding this comment.
For frozen sklearn training, requested fparams are silently ignored if _read_fparam_from_systems() returns None. If fparam_dim > 0, missing fparam data should be a hard error, not a fallback to a model without conditions. This also needs to ensure all systems have fparams with the expected width, otherwise a partial read can concatenate condition rows against the wrong descriptor rows.
Authored by OpenClaw (model: custom-chat-jinzhezeng-group/gpt-5.5)
| "fmt is not supported for mft evaluate(); " | ||
| "provide deepmd/npy system directories." | ||
| ) | ||
| result = self._ensure_mft().predict(data) |
There was a problem hiding this comment.
The public wrapper's MFT evaluate() always calls MFTFineTuner.predict(), but predict() explicitly rejects downstream_task_type='ener'. MFTFineTuner.evaluate() already supports the energy-mode path, so legacy energy-mode MFT evaluation is unreachable through DPAFineTuner.evaluate(). Please dispatch to MFTFineTuner.evaluate() for energy mode.
Authored by OpenClaw (model: custom-chat-jinzhezeng-group/gpt-5.5)
| "freeze() was called before fit(). Train the model with fit() first." | ||
| ) | ||
|
|
||
| bundle = { |
There was a problem hiding this comment.
After frozen_head, finetune, or mft training, _fitted is set to True, so this freeze() path is allowed even though no sklearn predictor/target metadata was fit. The resulting bundle has predictor=None (and default task metadata) and can be loaded by DPAPredictor only to fail or behave nonsensically. Please restrict this freeze format to the sklearn strategy, or implement separate serialization for the other strategies.
Authored by OpenClaw (model: custom-chat-jinzhezeng-group/gpt-5.5)
njzjz
left a comment
There was a problem hiding this comment.
I tested this PR locally on 7111f678f1df3d35679a2b7f49fbe3b686ceda41 with srun --gres=gpu:1 on an RTX 5090. After installing -e .[dpa-adapt] and replacing the CPU torch wheel with torch 2.12.1+cu129, CUDA was visible from the venv (torch.cuda.is_available() == True).
What passed:
python -m pytest source/tests/dpa_adapt/ -v --ignore=source/tests/dpa_adapt/test_trainer_dim_case_embd.py: 293 passed, 12 skipped.python -m pytest source/tests/dpa_adapt/test_backend_contract.py -v: 7 passed with CUDA torch.srun --gres=gpu:1 ... python examples/dpa_adapt/scripts/run_evaluate_frozen_sklearn.py: completed, MAE 1.1801 eV, RMSE 1.4642 eV, R2 -0.5223.srun --gres=gpu:1 ... python examples/dpa_adapt/scripts/run_evaluate_frozen_head.py: completed numerically, but exposed the issue in the inline comment: the spawneddp --pt traincame from/home/jzzeng/miniconda3/bin/dpinstead of the active venv'sdp.
Requesting changes because the dp subprocess resolution can silently run a different DeePMD-kit/torch environment from the one importing dpa_adapt, so the training/evaluation paths are not reliable in common symlinked-venv setups.
| from pathlib import Path as _Path | ||
|
|
||
| exe_name = "dp.exe" if _os.name == "nt" else "dp" | ||
| candidate = _Path(_sys.executable).resolve().parent / exe_name |
There was a problem hiding this comment.
This escapes the active virtualenv when sys.executable is a symlink. In my local venv, sys.executable is /home/jzzeng/codes/deepmd-kit/venv/bin/python, but Path(sys.executable).resolve().parent becomes /home/jzzeng/miniconda3/bin, so resolve_dp_command() returns /home/jzzeng/miniconda3/bin/dp even though shutil.which('dp') points at /home/jzzeng/codes/deepmd-kit/venv/bin/dp. The frozen_head example then printed Running: /home/jzzeng/miniconda3/bin/dp --pt train ..., i.e. it trained with a different DeePMD-kit/torch install (deepmd-kit 3.2.0b1.dev42, torch 2.10.0+cu128) than the PR venv (deepmd-kit 3.2.0b1.dev203, torch 2.12.1+cu129).
Please do not dereference the interpreter symlink here. Use the scripts directory for the active environment, e.g. Path(sys.executable).parent / exe_name or sysconfig.get_path('scripts'), before falling back to shutil.which('dp').
There was a problem hiding this comment.
f"{sys.executable} -m deepmd has the same effect
iProzd
left a comment
There was a problem hiding this comment.
The main blocker is that CodeQL is currently failing with 53 new alerts, including 30 errors. Most of the errors come from names exported in __all__ in dpa_adapt/__init__.py and dpa_adapt/data/__init__.py that are not defined in the module scope. Even if this is intended as a lazy-import design, the current CI configuration rejects it, so this needs to be fixed before merge.
I also agree with the existing functional review comments. In particular:
- The descriptor cache key is not complete. The bulk cache misses
model_branch, and the per-system cache key does not include checkpoint identity/mtime, branch, or pooling. This can silently reuse stale descriptors across different models/branches/pooling settings. resolve_dp_command()should not dereferencesys.executable; usingPath(sys.executable).resolve().parentcan escape the active virtualenv and rundpfrom a different DeePMD-kit/torch installation.- The MFT path still has several correctness issues:
aux_probis not range-validated,valid_datais accepted but not emitted into the generated config, and runningdpfromcwd=self.output_dircan break relative input/config/data paths. - fparam handling should be stricter: validation systems need the same
fparam_dimchecks as training systems, andfrozen_sklearnshould raise whenfparam_dim > 0but requiredfparam.npyfiles are missing or inconsistent. - The top-level README links to
dpa_adapt/README.md, but the PR adds the docs underdoc/dpa_adapt/README.md. - The root-level
test_data_utilities.py/tests/test_dpa_tools.pyfiles and the large checked-in example dataset should be cleaned up or moved/reduced to match the repository test/example conventions.
So I would suggest fixing the CI failure and the cache/type-map/path/fparam/MFT correctness issues first, then doing another review pass. The feature is promising, but this is still too large and too fragile to merge in its current state.
There was a problem hiding this comment.
This test workflow may be a duplicate of test_python.yml.
There was a problem hiding this comment.
This file should not be put here.
📝 WalkthroughWalkthroughIntroduces ChangesDPA-ADAPT Package
Sequence Diagram(s)sequenceDiagram
participant User
participant CLI as dpa-adapt CLI
participant DPAFineTuner
participant _FrozenSklearnPipeline
participant DescCache as Descriptor Cache
participant DPATrainer
participant dp as dp --pt train/test
User->>CLI: dpa-adapt fit --strategy frozen_sklearn
CLI->>DPAFineTuner: fit(train_data, target_key)
DPAFineTuner->>_FrozenSklearnPipeline: _fit_sklearn()
_FrozenSklearnPipeline->>DescCache: load_or_extract(systems, pretrained)
DescCache-->>_FrozenSklearnPipeline: descriptors np.ndarray
_FrozenSklearnPipeline->>_FrozenSklearnPipeline: StandardScaler + sklearn head fit
DPAFineTuner-->>CLI: fitted
User->>CLI: dpa-adapt fit --strategy frozen_head
CLI->>DPAFineTuner: fit(train_data, target_key)
DPAFineTuner->>DPATrainer: _fit_training()
DPATrainer->>dp: dp --pt train (freeze_backbone=True)
dp-->>DPATrainer: model.ckpt-N.pt
DPATrainer-->>DPAFineTuner: checkpoint path
User->>CLI: dpa-adapt predict --model frozen_model.pth
CLI->>DPAFineTuner: predict(test_data)
DPAFineTuner->>DescCache: load_or_extract(systems)
DescCache-->>DPAFineTuner: descriptors
DPAFineTuner-->>CLI: DotDict(predictions)
Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
Note
Due to the large number of review comments, Critical severity comments were prioritized as inline comments.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
source/install/uv_with_retry.sh (1)
11-19: 🩺 Stability & Availability | 🟡 Minor | ⚡ Quick winReset stderr per attempt to avoid stale-match retries.
tmpstderris appended across retries (tee -a), so a previous retryable error can make later non-retryable failures look retryable. Clear stderr each iteration (or stop appending) before matching.Suggested fix
while true; do - uv "$@" 2> >(tee -a "${tmpstderr}" >&2) + : > "${tmpstderr}" + uv "$@" 2> >(tee "${tmpstderr}" >&2) exit_code=$?Also applies to: 27-27
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@source/install/uv_with_retry.sh` around lines 11 - 19, The tmpstderr file is being appended to across retry attempts using tee -a, which causes old error messages from previous failed attempts to remain in the file. When the grep check runs to determine if the error is retryable, it will match against stale errors from previous retries instead of only the current attempt's stderr. Fix this by clearing the tmpstderr file at the beginning of each retry loop iteration (before running the uv command) or by changing tee -a to tee (without the -a flag) to overwrite the file instead of appending. This ensures only the current attempt's stderr is checked when matching against retryable error patterns.
🟠 Major comments (23)
dpa_adapt/data/smiles.py-612-620 (1)
612-620: 🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick winDirect-data ingestion silently drops samples on length mismatch.
zip(atoms, coordinates, targets)truncates to the shortest input. If one array is shorter, samples are silently discarded and records become inconsistent. Validate equal lengths before zipping (same asrecords_from_direct_data()).Proposed fix
atoms = data.get("atoms") coordinates = data.get("coordinates") targets = data.get("target", data.get("targets")) if atoms is None or coordinates is None or targets is None: raise ValueError("Direct data requires atoms, coordinates, and target") + if not (len(atoms) == len(coordinates) == len(targets)): + raise ValueError("atoms, coordinates, and target must have the same length") records = [ (list(s), np.asarray(c, dtype=np.float32), float(t), i) for i, (s, c, t) in enumerate(zip(atoms, coordinates, targets)) ]🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@dpa_adapt/data/smiles.py` around lines 612 - 620, The zip(atoms, coordinates, targets) call silently truncates to the shortest sequence if these arrays have different lengths, causing data loss without error notification. Before the list comprehension that creates records, add validation to ensure atoms, coordinates, and targets all have equal length. Raise a ValueError with a descriptive message if the lengths do not match, similar to the existing validation pattern used in records_from_direct_data().dpa_adapt/data/convert.py-674-684 (1)
674-684: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick winMulti-system label assignment breaks documented
(n_systems,)/(n_systems, dim)inputs.In multi-system mode, each
sub_valsis passed directly to_attach_single(). Scalars from(n_systems,)become 0-D and fail onvalues.shape[0]; vectors from(n_systems, dim)become(dim,)and fail frame-count checks for one-frame systems. Normalize each per-system value to include a frame axis before calling_attach_single().Proposed fix
- for sys_dir, sub_vals in zip(sys_dirs, values_arr): - _attach_single(sys_dir, head, sub_vals) + for sys_dir, sub_vals in zip(sys_dirs, values_arr): + sub_vals_arr = np.asarray(sub_vals) + if sub_vals_arr.ndim == 0: + sub_vals_arr = sub_vals_arr.reshape(1) + elif sub_vals_arr.ndim == 1: + sub_vals_arr = sub_vals_arr[np.newaxis, :] + _attach_single(sys_dir, head, sub_vals_arr)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@dpa_adapt/data/convert.py` around lines 674 - 684, The loop iterating through sys_dirs and values_arr with zip(sys_dirs, values_arr) extracts sub_vals that lose the frame axis dimension required by _attach_single(). When values is (n_systems,), each sub_vals becomes a scalar; when values is (n_systems, dim), each sub_vals becomes 1D shape (dim,), both causing failures. Before passing sub_vals to _attach_single(), normalize it to ensure it includes a frame axis by converting scalars to shape (1,) arrays and 1D arrays to shape (1, dim) using np.atleast_1d() or np.atleast_2d() as appropriate.dpa_adapt/data/formula.py-129-133 (1)
129-133: 📐 Maintainability & Code Quality | 🟠 Major
aseis undefined in type annotations and triggers RuffF821.The function signature at lines 129 and 133 references
ase.Atomswithout importingaseat module scope. The importfrom ase import Atoms as AseAtomsoccurs inside the function at line 158, after the annotations are processed, causing a linting failure that will block CI per the**/*.pyguidelines.Fix: Add a
TYPE_CHECKINGguard at the module level to importAtoms as AseAtoms, then replacease.AtomswithAseAtomsin the function signature.from __future__ import annotations +from typing import TYPE_CHECKING import csv import random import re from pathlib import Path import numpy as np +if TYPE_CHECKING: + from ase import Atoms as AseAtoms + # ... rest of file ... def random_doping( - base: ase.Atoms, + base: AseAtoms, fracs: dict[str, float], base_element: str, rng: random.Random, -) -> ase.Atoms: +) -> AseAtoms:🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@dpa_adapt/data/formula.py` around lines 129 - 133, The function signature uses `ase.Atoms` type annotation without importing `ase` at module scope, causing a Ruff F821 undefined name error. Add a TYPE_CHECKING import block at the module level with `from ase import Atoms as AseAtoms`, then replace both occurrences of `ase.Atoms` in the function signature (the parameter annotation for `base` and the return type annotation) with `AseAtoms` to resolve the linting error.Sources: Coding guidelines, Linters/SAST tools
dpa_adapt/finetuner.py-466-524 (1)
466-524: 🩺 Stability & Availability | 🟠 Major | ⚡ Quick winDisable descriptor hooks in a
finallyblock.Any exception after
_enable_hook()leaves the hook registered, so retries can accumulate hooks and capture stale/duplicated descriptor state. Wrap the extraction loop withtry/finallyand call_disable_hook()in thefinally.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@dpa_adapt/finetuner.py` around lines 466 - 524, The _disable_hook() call on the extractor object is not guaranteed to execute if an exception occurs during the feature extraction loop, which can leave hooks registered and cause issues on retries. Wrap the entire feature extraction loop (starting with the for system in systems loop through the all_features.append call) in a try block, and move the extractor._disable_hook() call to a finally block that executes after the try block, ensuring the hook is always disabled regardless of whether an exception occurs during processing.dpa_adapt/finetuner.py-789-790 (1)
789-790: 🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick winKey descriptor cache entries by
model_branch.
load_descriptor_model()can select different branches from the same checkpoint, but this cache key uses only systems, checkpoint, and pooling. Fitting two branches against the same data can reuse descriptors from the first branch.Proposed direction
- key = _cache_key(systems, self.pretrained, self.pooling) + key = _cache_key( + systems, + self.pretrained, + self.pooling, + model_branch=self.model_branch, + )If
_cache_keydoes not yet acceptmodel_branch, update that helper and the corresponding cache invalidation tests.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@dpa_adapt/finetuner.py` around lines 789 - 790, The cache key generation in line 789 does not include the model_branch parameter, which causes different branches from the same checkpoint to incorrectly share cached descriptors. Update the _cache_key function call to include the model_branch argument alongside systems, self.pretrained, and self.pooling, then modify the _cache_key helper function definition to accept the model_branch parameter and incorporate it into the cache key computation to ensure different branches produce distinct cache entries, and update any corresponding cache invalidation tests to account for this new parameter.dpa_adapt/finetuner.py-848-852 (1)
848-852: 🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick winDo not swallow unsupported-element validation errors.
The
except ValueErroralso catchesvalidate_type_map_subset()failures, so train data with unsupported elements can silently proceed with the checkpoint type map.Proposed fix
- try: - elements = read_data_type_map_union(systems) - validate_type_map_subset(elements, tm, label="train data") - except ValueError: - pass # no atom_names — deepmd uses raw atom indices + try: + elements = read_data_type_map_union(systems) + except ValueError: + pass # no atom_names — deepmd uses raw atom indices + else: + validate_type_map_subset(elements, tm, label="train data")🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@dpa_adapt/finetuner.py` around lines 848 - 852, The broad except ValueError clause is catching validation errors from validate_type_map_subset() that should not be silently ignored. Refactor the error handling to only catch ValueError from the read_data_type_map_union() call (which legitimately may have no atom_names), and move the validate_type_map_subset() call and its error handling outside this try-except block so that validation failures for unsupported elements are properly raised and not swallowed.dpa_adapt/mft.py-449-452 (1)
449-452: 🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick winInvalidate stale frozen heads after retraining.
Line 449 returns any existing
frozen_<head>.ptheven ifmodel.ckpt-<max_steps>.ptwas regenerated later, soevaluate()/predict()can report metrics from an older checkpoint.Proposed fix
head = self._downstream_head frozen_name = f"frozen_{head}.pth" - frozen_path = os.path.join(self.output_dir, frozen_name) - if os.path.exists(frozen_path): - return frozen_path - ckpt = os.path.join(self.output_dir, f"model.ckpt-{self.max_steps}.pt") if not os.path.isfile(ckpt): raise RuntimeError( f"Expected checkpoint {ckpt} not found; cannot freeze. " f"Did fit() complete successfully?" ) + frozen_path = os.path.join(self.output_dir, frozen_name) + if ( + os.path.exists(frozen_path) + and os.path.getmtime(frozen_path) >= os.path.getmtime(ckpt) + ): + return frozen_path🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@dpa_adapt/mft.py` around lines 449 - 452, The code returns an existing frozen head file without validating it corresponds to the current checkpoint, allowing stale frozen heads from previous training runs to be used by evaluate() and predict(). Add validation logic before the return statement that checks if the current checkpoint file (model.ckpt-{self.max_steps}.pt) exists and is newer than the frozen path by comparing modification timestamps, or delete the stale frozen path if a newer checkpoint has been generated, ensuring that only frozen heads corresponding to the latest checkpoint are returned..github/workflows/dpa_adapt_tests.yml-20-24 (1)
20-24: 🔒 Security & Privacy | 🟠 MajorPin action refs to commit SHAs and disable checkout credential persistence.
Lines 20 and 23 use tag-based action refs (
actions/checkout@v4andactions/setup-python@v5), which can be reassigned. Additionally, checkout does not disable credential persistence by default. Pin both actions to specific commit SHAs and addpersist-credentials: falseto the checkout step to mitigate supply-chain and credential exposure risks.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/workflows/dpa_adapt_tests.yml around lines 20 - 24, Replace the tag-based action references with pinned commit SHAs for security. Change the uses line for actions/checkout@v4 to use its full commit SHA (typically a 40-character hash) instead of the tag, and do the same for actions/setup-python@v5. Additionally, add a new line under the checkout action with persist-credentials: false to disable credential persistence and mitigate credential exposure risks.Source: Linters/SAST tools
pyproject.toml-306-306 (1)
306-306: 🔒 Security & Privacy | 🟠 MajorUse HTTPS for CUDA repo bootstrap in CI install path.
Line 306 uses HTTP to configure the CUDA package repository, which creates a supply-chain integrity risk by enabling MITM tampering during privileged package installation. The HTTPS endpoint is reachable and functional.
Suggested change
- """{ if [ -n "${CUDA_VERSION}" ] && [ "$(uname -m)" = "x86_64" ] ; then yum config-manager --add-repo http://developer.download.nvidia.com/compute/cuda/repos/rhel8/x86_64/cuda-rhel8.repo && yum install -y cuda-nvcc-${CUDA_VERSION/./-} cuda-cudart-devel-${CUDA_VERSION/./-}; fi }""", + """{ if [ -n "${CUDA_VERSION}" ] && [ "$(uname -m)" = "x86_64" ] ; then yum config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/rhel8/x86_64/cuda-rhel8.repo && yum install -y cuda-nvcc-${CUDA_VERSION/./-} cuda-cudart-devel-${CUDA_VERSION/./-}; fi }""",🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pyproject.toml` at line 306, The CUDA repository configuration URL in the bootstrap command uses HTTP protocol instead of HTTPS, creating a supply-chain security risk. In the yum config-manager command within the string, change the URL from http://developer.download.nvidia.com/compute/cuda/repos/rhel8/x86_64/cuda-rhel8.repo to use https instead of http to ensure secure and encrypted communication when downloading the CUDA repository configuration during package installation.examples/dpa_adapt/scripts/prepare_data.py-80-88 (1)
80-88: 🔒 Security & Privacy | 🟠 MajorAvoid extracting tar members by original archive path.
At line 88,
tar.extract(member, path=str(RAW_DIR))uses untrusted member paths. The basename check on lines 82–83 is insufficient; a crafted entry (e.g.,../../gdb9.sdf) would pass the filter but escapeRAW_DIRduring extraction.Use
tar.extractfile()with an explicitly constructed destination instead:Suggested fix
- with tarfile.open(TAR_PATH, "r:gz") as tar: - for member in tar.getmembers(): - name = Path(member.name).name - if name in ("gdb9.sdf", "gdb9.sdf.csv"): - if not (RAW_DIR / name).exists() or force: - print( - f" Extracting {name} ({member.size / 1024 / 1024:.1f} MB) ..." - ) - tar.extract(member, path=str(RAW_DIR)) + allowed = {"gdb9.sdf", "gdb9.sdf.csv"} + with tarfile.open(TAR_PATH, "r:gz") as tar: + for member in tar.getmembers(): + name = Path(member.name).name + if name not in allowed or not member.isfile(): + continue + target = RAW_DIR / name + if target.exists() and not force: + continue + print(f" Extracting {name} ({member.size / 1024 / 1024:.1f} MB) ...") + extracted = tar.extractfile(member) + if extracted is None: + raise RuntimeError(f"Failed to extract {name} from {TAR_PATH}") + target.write_bytes(extracted.read())🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@examples/dpa_adapt/scripts/prepare_data.py` around lines 80 - 88, The current tar.extract() call at line 88 is vulnerable to path traversal attacks because it uses the full member.name path even though only the basename is validated. Replace the tar.extract(member, path=str(RAW_DIR)) call with tar.extractfile(member) to read the file content, and explicitly construct the destination file path using only the safe basename (the name variable) rather than the untrusted member path. Write the extracted content to the destination file at RAW_DIR / name to ensure files are extracted only to the intended directory regardless of directory traversal sequences in the archive.test_data_utilities.py-12-17 (1)
12-17: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick winThis test can bypass the PR code under test.
Line 16 and Lines 61-65 force
site-packagesahead of the repository checkout, so this file may validate an installed package instead of the current branch changes.💡 Suggested fix
-# Ensure the *installed* deepmd-kit (with C extensions) is used instead of -# the source checkout when running from the project root. -_site_pkg = [p for p in sys.path if "site-packages" in p] -_other = [p for p in sys.path if "site-packages" not in p] -sys.path = _site_pkg + _other +# Keep repository checkout precedence so this test validates current branch code.code = ( - "import sys; " - "_sp = [p for p in sys.path if 'site-packages' in p]; " - "_ot = [p for p in sys.path if 'site-packages' not in p]; " - "sys.path = _sp + _ot; " - "from dpa_adapt.cli import main; " + "import sys; " + "from dpa_adapt.cli import main; " "sys.argv[:] = ['dpaad'] + " + repr(args) + "; " "main()" )Also applies to: 61-65
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test_data_utilities.py` around lines 12 - 17, The sys.path manipulation in the code is prioritizing site-packages ahead of the current repository, causing tests to validate the installed package instead of the PR changes. Fix this by reversing the order of path concatenation in the sys.path assignment - place _other (repository paths) before _site_pkg so that the local code under test takes precedence. Apply this same fix to both the affected sections mentioned in the comment at lines 12-17 and lines 61-65.test_data_utilities.py-143-164 (1)
143-164: 🗄️ Data Integrity & Integration | 🟠 Major | 🏗️ Heavy liftThis test mutates checked-in demo data in-place.
These lines write labels directly into
examples/dpa_adapt/data/.... That pollutes shared fixtures and can make subsequent tests order-dependent.💡 Suggested fix
+import shutil + DEMO_DIR = REPO_DIR / "examples" / "dpa_adapt" / "data" -TRAIN_DIR = DEMO_DIR / "train" -TEST_DIR = DEMO_DIR / "test" +WORK_DEMO_DIR = Path(tempfile.mkdtemp(prefix="dpa_adapt_demo_")) +shutil.copytree(DEMO_DIR, WORK_DEMO_DIR / "data") +TRAIN_DIR = WORK_DEMO_DIR / "data" / "train" +TEST_DIR = WORK_DEMO_DIR / "data" / "test"Also applies to: 278-280, 312-314, 345-345
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test_data_utilities.py` around lines 143 - 164, The test is mutating the shared demo data in-place by writing directly to TRAIN_DIR, which pollutes fixtures and creates test order dependencies. Replace all usages of TRAIN_DIR in the test_attach_labels and related test sections with a temporary directory fixture that isolates test writes from the checked-in demo data. Copy the necessary demo data from the original location into the temporary directory if the test requires pre-existing data, then perform all write operations against the temporary directory instead. This change should be applied consistently across all affected test sections including the attach_labels, cli_attach_labels, and cli_convert sections.source/tests/dpa_adapt/test_backend_contract.py-133-144 (1)
133-144: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick winDon’t mask contract regressions as environment skips.
Line 142 catches
Exceptionand always callspytest.skip(). That can hide a realbuild_model_from_config()regression as a skip instead of a test failure.💡 Suggested fix
`@pytest.fixture`(autouse=True) def _require_deepmd(self): """Skip if the deepmd model builder is not usable.""" try: from dpa_adapt._backend import ( build_model_from_config, ) build_model_from_config(_MINIMAL_DPA3_CONFIG) - except Exception as exc: + except (ImportError, ModuleNotFoundError, OSError) as exc: pytest.skip(f"deepmd build_model_from_config not functional: {exc}") + except Exception: + raise🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@source/tests/dpa_adapt/test_backend_contract.py` around lines 133 - 144, The _require_deepmd fixture currently catches all exceptions from both the import and the build_model_from_config call, converting them to skips. This masks real regressions in build_model_from_config() as environment skips. Modify the exception handling to only catch ImportError from the import statement within a separate try-except block, and allow exceptions from the build_model_from_config() call to propagate as test failures rather than skips. This way, only missing dependencies will result in skipped tests, while actual functionality regressions will cause test failures.source/tests/dpa_adapt/test_fparam.py-145-183 (1)
145-183: 🩺 Stability & Availability | 🟠 Major | ⚡ Quick winPatch
DPATrainerwhere it is used (dpa_adapt.finetuner), not only where defined.These tests become import-order dependent if
dpa_adapt.finetuneris already loaded. In that case,DPAFineTunerkeeps its already-boundDPATrainerreference and bypasses this patch.Suggested fix
-def test_finetuner_fparam_forwarded_to_trainer(): +def test_finetuner_fparam_forwarded_to_trainer(): """DPAFineTuner(fparam_dim=4, strategy='finetune') passes fparam_dim=4 to DPATrainer.""" - with patch("dpa_adapt.trainer.DPATrainer") as mock_trainer_cls: + with patch("dpa_adapt.finetuner.DPATrainer") as mock_trainer_cls: from dpa_adapt.finetuner import ( DPAFineTuner, ) @@ def test_finetuner_fparam_zero_not_forwarded(): """DPAFineTuner(fparam_dim=0) passes fparam_dim=0 (default, disabled).""" - with patch("dpa_adapt.trainer.DPATrainer") as mock_trainer_cls: + with patch("dpa_adapt.finetuner.DPATrainer") as mock_trainer_cls: from dpa_adapt.finetuner import ( DPAFineTuner, )🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@source/tests/dpa_adapt/test_fparam.py` around lines 145 - 183, The patch target in both test_finetuner_fparam_forwarded_to_trainer and test_finetuner_fparam_zero_not_forwarded functions is incorrect. Currently, the patch is applied to dpa_adapt.trainer.DPATrainer, but since DPAFineTuner has already imported DPATrainer into the dpa_adapt.finetuner module, the patch does not affect the actual usage. Change the patch target from "dpa_adapt.trainer.DPATrainer" to "dpa_adapt.finetuner.DPATrainer" in both test functions to ensure the mock is applied where DPATrainer is actually used, eliminating import-order dependency.tests/test_dpa_tools.py-19-19 (1)
19-19: 📐 Maintainability & Code Quality | 🟠 Major | ⚡ Quick winReplace ambiguous
×characters in docstrings.Ruff flags these as
RUF002; use plainxto avoid lint failures.As per coding guidelines,
**/*.py: Install linter and runruff check .before committing changes or the CI will fail.Also applies to: 68-68
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/test_dpa_tools.py` at line 19, Replace the multiplication symbol `×` with a plain `x` character in the docstring that reads "Write a minimal 2×2×1 NiO₂H₂ slab POSCAR (~12 atoms)." to use "2x2x1" instead. This same issue also appears in another docstring on line 68, so apply the same fix there as well. These Unicode characters trigger Ruff linter warnings (RUF002), so using plain ASCII characters will resolve the linting failures.Sources: Coding guidelines, Linters/SAST tools
source/tests/dpa_adapt/test_validate.py-183-188 (1)
183-188: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick winFix broken self-import path in
test_list_input_aggregates_across_systems.Importing
_make_set_dirfromtests.dpa_adapt.test_validateis brittle/incorrect for this file location and can fail withModuleNotFoundError. Use the local helper directly.Suggested fix
- from dpa_adapt.data.loader import ( - load_data, - ) - from tests.dpa_adapt.test_validate import ( - _make_set_dir, - ) - _make_set_dir(s2_root / "set.000") s2 = load_data(str(s2_root))[0]🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@source/tests/dpa_adapt/test_validate.py` around lines 183 - 188, The import statement is attempting to import _make_set_dir from tests.dpa_adapt.test_validate, which is a self-import since this code is already in the test_validate.py file. This brittle pattern can cause ModuleNotFoundError. Remove the import statement from tests.dpa_adapt.test_validate that imports _make_set_dir, and use the _make_set_dir helper directly in the code since it is already defined locally in the same file.tests/test_dpa_tools.py-55-55 (1)
55-55: 📐 Maintainability & Code Quality | 🟠 MajorAdd explicit
strict=tozip()to pass ruff linter checks.The
zip(formulas, values)call at line 55 violates B905 (requires explicitstrict=parameter). Since both sequences have matching lengths (3 items each), usestrict=Trueto make fixture mismatches fail fast.Suggested fix
- for f, v in zip(formulas, values): + for f, v in zip(formulas, values, strict=True):🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/test_dpa_tools.py` at line 55, The zip() call with formulas and values parameters is missing the explicit strict= parameter required by the ruff linter (B905 check). Add strict=True to the zip(formulas, values) call since both sequences have matching lengths and you want to ensure they remain aligned, causing the test to fail fast if there are any length mismatches in the future.Sources: Coding guidelines, Linters/SAST tools
source/tests/dpa_adapt/test_split_cv.py-31-31 (1)
31-31: 📐 Maintainability & Code Quality | 🟠 MajorMake
elementsexplicitly optional in the helper signature.This annotation violates
RUF013(PEP 484 prohibits implicitOptional).Suggested fix
- elements: list[str] = None, + elements: list[str] | None = None,As per coding guidelines,
**/*.pyfiles must passruff check .before committing changes or CI will fail.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@source/tests/dpa_adapt/test_split_cv.py` at line 31, The parameter `elements` in the helper function signature has a type annotation of `list[str]` with a default value of `None`, which violates PEP 484 by implicitly making it Optional without explicitly declaring it. Change the type annotation for the `elements` parameter to explicitly indicate it can be None by using either `list[str] | None` or `Optional[list[str]]` syntax, ensuring it matches the default value of `None` and passes ruff linting checks.Sources: Coding guidelines, Linters/SAST tools
source/tests/dpa_adapt/test_dataset.py-28-28 (1)
28-28: 📐 Maintainability & Code Quality | 🟠 MajorUse explicit optional typing for
elementsparameter.Line 28 triggers RUF013 (PEP 484 prohibits implicit
Optional). Changeelements: list[str] = Noneto use the union syntax withNone.Suggested fix
- elements: list[str] = None, + elements: list[str] | None = None,This is a CI-blocking lint violation under
ruff checkper the coding guidelines.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@source/tests/dpa_adapt/test_dataset.py` at line 28, The elements parameter uses implicit Optional typing by setting a default value of None without explicit union syntax, which violates PEP 484 and triggers RUF013. Change the type annotation for the elements parameter from list[str] = None to use explicit union syntax with None (i.e., list[str] | None = None). This makes the Optional type explicit in the type hint rather than relying on the default value to infer it.Sources: Coding guidelines, Linters/SAST tools
dpa_adapt/cv.py-162-162 (1)
162-162: 📐 Maintainability & Code Quality | 🟠 Major | ⚡ Quick winRuff violations in this file can block CI (
B905,RUF001).
zip()calls are missing explicitstrict=, and the runtime string uses the ambiguous×character. These are already flagged by static analysis.Suggested fix
- for system, grp in zip(systems, groups): + for system, grp in zip(systems, groups, strict=True): @@ - train = [s for s, g in zip(systems, grp) if g in train_formulas] - valid = [s for s, g in zip(systems, grp) if g in valid_formulas] - test = [s for s, g in zip(systems, grp) if g in test_formulas] + train = [s for s, g in zip(systems, grp, strict=True) if g in train_formulas] + valid = [s for s, g in zip(systems, grp, strict=True) if g in valid_formulas] + test = [s for s, g in zip(systems, grp, strict=True) if g in test_formulas] @@ - train = [s for s, g in zip(systems, groups) if g in train_groups] - valid = [s for s, g in zip(systems, groups) if g in valid_groups] - test = [s for s, g in zip(systems, groups) if g in test_groups] + train = [s for s, g in zip(systems, groups, strict=True) if g in train_groups] + valid = [s for s, g in zip(systems, groups, strict=True) if g in valid_groups] + test = [s for s, g in zip(systems, groups, strict=True) if g in test_groups] @@ - return f"{n_splits} × {per_run}" + return f"{n_splits} x {per_run}"As per coding guidelines,
**/*.pymust passruff check .before commit or CI fails.Also applies to: 237-239, 278-280, 585-585
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@dpa_adapt/cv.py` at line 162, Fix Ruff violations B905 and RUF001 in the file. For the B905 violation, add the explicit parameter `strict=True` to all `zip()` calls throughout the file, including the `zip(systems, grp)` call shown in the diff and the other occurrences at lines 237-239, 278-280, and 585. For the RUF001 violation, locate and replace the ambiguous `×` character used in any runtime strings with a standard ASCII alternative like `x` or a more appropriate character. Ensure all changes pass `ruff check .` before committing.Sources: Coding guidelines, Linters/SAST tools
dpa_adapt/cv.py-388-391 (1)
388-391: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick winReject
cvvalues larger than the number of independent groups.When
cv > n_groups,fold_sizebecomes zero and fold assignments become degenerate, so folds can be silently skipped and metrics come back empty instead of failing fast.Suggested fix
if cv == "holdout": n_splits = 1 elif isinstance(cv, int) and cv >= 2: n_splits = cv else: raise ValueError(f"cv must be 'holdout' or an int >= 2; got {cv!r}.") + + if n_splits > n_groups: + raise ValueError( + f"cv={n_splits} exceeds number of independent groups ({n_groups})." + )Also applies to: 453-459
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@dpa_adapt/cv.py` around lines 388 - 391, The cv parameter validation needs to reject values larger than the number of independent groups to prevent degenerate fold assignments that silently skip folds. In the cv validation block where you check isinstance(cv, int) and cv >= 2, add an additional check after confirming cv is a valid integer to verify that cv does not exceed n_groups. Raise a ValueError with a descriptive message if cv is greater than n_groups. This validation should be applied consistently across all locations where cv is validated, as indicated by the reference to lines 453-459 in addition to the current location.dpa_adapt/predictor.py-241-244 (1)
241-244: 🩺 Stability & Availability | 🟠 Major | ⚡ Quick winGuard committee prediction paths when
fit()has not been called yet.With
n_committee > 1, both prediction paths readself.estimators_unconditionally. If users callpredict()beforefit(), this crashes withAttributeErrorinstead of a clear usage error.Suggested fix
class DPAPredictor: + def _require_committee_fitted(self) -> None: + if self.n_committee > 1 and not hasattr(self, "estimators_"): + raise RuntimeError( + "n_committee > 1 requires calling fit(...) first " + "to train committee members." + ) + def predict(self, data, fmt=None, return_uncertainty=False) -> DotDict: @@ if self.n_committee > 1: + self._require_committee_fitted() preds = np.array([e.predict(features) for e in self.estimators_]) preds = preds.reshape(self.n_committee, -1, self._task_dim) return DotDict({"predictions": np.mean(preds, axis=0)}) @@ def _predict_with_uncertainty(self, features): @@ if self.n_committee > 1: + self._require_committee_fitted() preds = np.array([e.predict(features) for e in self.estimators_]) preds = preds.reshape(self.n_committee, -1, self._task_dim) return DotDict(Also applies to: 277-284
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@dpa_adapt/predictor.py` around lines 241 - 244, The prediction code unconditionally accesses self.estimators_ when n_committee > 1, causing an AttributeError if predict() is called before fit(). Add a guard check before accessing self.estimators_ to verify the model has been fitted, raising a clear usage error (such as NotFittedError or a descriptive ValueError) instead of letting the AttributeError propagate. Apply this check to both the committee prediction path shown in the diff and the other prediction path also mentioned in the comment (lines 277-284).dpa_adapt/cli.py-240-242 (1)
240-242: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick win
evaluateCLI output breaks for multi-property models.
DPAPredictor.evaluate()returns dict metrics for multi-property targets, but this handler always formats metrics as scalars (:.6f), which raisesTypeError.Suggested fix
- print(f"MAE : {metrics.mae:.6f}") - print(f"RMSE : {metrics.rmse:.6f}") - print(f"R² : {metrics.r2:.6f}") + if isinstance(metrics.mae, dict): + for key in metrics.mae: + print( + f"{key}: MAE={metrics.mae[key]:.6f} " + f"RMSE={metrics.rmse[key]:.6f} R²={metrics.r2[key]:.6f}" + ) + else: + print(f"MAE : {metrics.mae:.6f}") + print(f"RMSE : {metrics.rmse:.6f}") + print(f"R² : {metrics.r2:.6f}")🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@dpa_adapt/cli.py` around lines 240 - 242, The print statements formatting mae, rmse, and r2 metrics as scalars with the .6f format specifier will fail when DPAPredictor.evaluate() returns dictionary metrics for multi-property targets. Add type checking before the print statements to determine if metrics is a dict or scalar, and handle both cases: for scalar metrics use the current formatting approach, and for dict metrics iterate through each property and print the metrics for each property separately with appropriate formatting.
🧹 Nitpick comments (1)
source/tests/dpa_adapt/test_finetuner_strategies.py (1)
120-139: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winRemove the duplicated
_make_system_dirsdefinition.Line [120] redefines
_make_system_dirswith the same implementation as Line [98], which shadows the first function and adds avoidable maintenance risk.♻️ Suggested cleanup
-def _make_system_dirs(tmp_path, formulas=("CompA", "CompB"), n=3): - """Create minimal system dirs with type_map.raw, set.000/coord.npy, - and set.000/overpotential.npy. - """ - import numpy as np - - systems = [] - for formula in formulas: - for i in range(n): - sysdir = tmp_path / formula / str(i) - sysdir.mkdir(parents=True) - (sysdir / "type_map.raw").write_text("H\nO\n") - (sysdir / "type.raw").write_text("0\n1\n") - sdir = sysdir / "set.000" - sdir.mkdir() - np.save(sdir / "coord.npy", np.zeros((2, 6))) - np.save(sdir / "box.npy", np.tile(np.eye(3).ravel(), (2, 1))) - np.save(sdir / "overpotential.npy", np.ones((2, 1))) - systems.append(str(sysdir)) - return systems🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@source/tests/dpa_adapt/test_finetuner_strategies.py` around lines 120 - 139, The function _make_system_dirs is defined twice in the file with identical implementations, creating code duplication and maintenance risk. Remove the second definition of _make_system_dirs (the one shown in the diff) and keep only the first occurrence to eliminate the shadowing and ensure a single source of truth for this helper function.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 84b92280-8e86-4fad-bfa1-49b89168895a
📒 Files selected for processing (107)
.github/workflows/dpa_adapt_tests.yml.github/workflows/test_python.yml.gitignoreREADME.mddeepmd/__about__.pydeepmd/entrypoints/test.pydoc/dpa_adapt/README.mddoc/dpa_adapt/input_formats.mddoc/index.rstdpa_adapt/__init__.pydpa_adapt/_backend.pydpa_adapt/cli.pydpa_adapt/conditions.pydpa_adapt/config/__init__.pydpa_adapt/config/manager.pydpa_adapt/cv.pydpa_adapt/data/__init__.pydpa_adapt/data/convert.pydpa_adapt/data/dataset.pydpa_adapt/data/desc_cache.pydpa_adapt/data/errors.pydpa_adapt/data/formula.pydpa_adapt/data/loader.pydpa_adapt/data/smiles.pydpa_adapt/data/type_map.pydpa_adapt/data/validate.pydpa_adapt/finetuner.pydpa_adapt/main.pydpa_adapt/mft.pydpa_adapt/predictor.pydpa_adapt/trainer.pydpa_adapt/utils/__init__.pydpa_adapt/utils/dotdict.pydpa_adapt/utils/sklearn_heads.pyexamples/dpa_adapt/README.mdexamples/dpa_adapt/data/test/sys_0000/set.000/box.npyexamples/dpa_adapt/data/test/sys_0000/set.000/coord.npyexamples/dpa_adapt/data/test/sys_0000/set.000/gap.npyexamples/dpa_adapt/data/test/sys_0000/type.rawexamples/dpa_adapt/data/test/sys_0000/type_map.rawexamples/dpa_adapt/data/test/sys_0001/set.000/box.npyexamples/dpa_adapt/data/test/sys_0001/set.000/coord.npyexamples/dpa_adapt/data/test/sys_0001/set.000/gap.npyexamples/dpa_adapt/data/test/sys_0001/type.rawexamples/dpa_adapt/data/test/sys_0001/type_map.rawexamples/dpa_adapt/data/test/sys_0002/set.000/box.npyexamples/dpa_adapt/data/test/sys_0002/set.000/coord.npyexamples/dpa_adapt/data/test/sys_0002/set.000/gap.npyexamples/dpa_adapt/data/test/sys_0002/type.rawexamples/dpa_adapt/data/test/sys_0002/type_map.rawexamples/dpa_adapt/data/test_labels.npyexamples/dpa_adapt/data/train/sys_0000/set.000/box.npyexamples/dpa_adapt/data/train/sys_0000/set.000/coord.npyexamples/dpa_adapt/data/train/sys_0000/set.000/gap.npyexamples/dpa_adapt/data/train/sys_0000/type.rawexamples/dpa_adapt/data/train/sys_0000/type_map.rawexamples/dpa_adapt/data/train/sys_0001/set.000/box.npyexamples/dpa_adapt/data/train/sys_0001/set.000/coord.npyexamples/dpa_adapt/data/train/sys_0001/set.000/gap.npyexamples/dpa_adapt/data/train/sys_0001/type.rawexamples/dpa_adapt/data/train/sys_0001/type_map.rawexamples/dpa_adapt/data/train/sys_0002/set.000/box.npyexamples/dpa_adapt/data/train/sys_0002/set.000/coord.npyexamples/dpa_adapt/data/train/sys_0002/set.000/gap.npyexamples/dpa_adapt/data/train/sys_0002/type.rawexamples/dpa_adapt/data/train/sys_0002/type_map.rawexamples/dpa_adapt/data/train/sys_0003/set.000/box.npyexamples/dpa_adapt/data/train/sys_0003/set.000/coord.npyexamples/dpa_adapt/data/train/sys_0003/set.000/gap.npyexamples/dpa_adapt/data/train/sys_0003/type.rawexamples/dpa_adapt/data/train/sys_0003/type_map.rawexamples/dpa_adapt/data/train/sys_0004/set.000/box.npyexamples/dpa_adapt/data/train/sys_0004/set.000/coord.npyexamples/dpa_adapt/data/train/sys_0004/set.000/gap.npyexamples/dpa_adapt/data/train/sys_0004/type.rawexamples/dpa_adapt/data/train/sys_0004/type_map.rawexamples/dpa_adapt/data/train_labels.npyexamples/dpa_adapt/scripts/prepare_data.pyexamples/dpa_adapt/scripts/run_evaluate_frozen_head.pyexamples/dpa_adapt/scripts/run_evaluate_frozen_sklearn.pypyproject.tomlsource/install/uv_with_retry.shsource/tests/dpa_adapt/__init__.pysource/tests/dpa_adapt/test_auto_convert.pysource/tests/dpa_adapt/test_backend_contract.pysource/tests/dpa_adapt/test_cache.pysource/tests/dpa_adapt/test_cli_smoke.pysource/tests/dpa_adapt/test_conditions.pysource/tests/dpa_adapt/test_config_merge.pysource/tests/dpa_adapt/test_convert.pysource/tests/dpa_adapt/test_dataset.pysource/tests/dpa_adapt/test_finetuner_strategies.pysource/tests/dpa_adapt/test_fparam.pysource/tests/dpa_adapt/test_loader.pysource/tests/dpa_adapt/test_mft_config.pysource/tests/dpa_adapt/test_mft_evaluate.pysource/tests/dpa_adapt/test_mft_property_task.pysource/tests/dpa_adapt/test_paper_alignment.pysource/tests/dpa_adapt/test_predictor.pysource/tests/dpa_adapt/test_smiles_data.pysource/tests/dpa_adapt/test_split_cv.pysource/tests/dpa_adapt/test_trainer.pysource/tests/dpa_adapt/test_trainer_dim_case_embd.pysource/tests/dpa_adapt/test_type_map.pysource/tests/dpa_adapt/test_validate.pytest_data_utilities.pytests/test_dpa_tools.py
| train = _maybe_split_list(args.train_data) or [args.train_data] | ||
| valid = _maybe_split_list(args.valid_data) if args.valid_data else None | ||
| type_map = _maybe_split_list(args.type_map) |
There was a problem hiding this comment.
🩺 Stability & Availability | 🔴 Critical
🧩 Analysis chain
🏁 Script executed:
cd /tmp && find . -name "cli.py" -type f 2>/dev/null | head -5Repository: deepmodeling/deepmd-kit
Length of output: 161
🏁 Script executed:
find . -name "cli.py" -type fRepository: deepmodeling/deepmd-kit
Length of output: 180
🏁 Script executed:
cat -n dpa_adapt/cli.py | sed -n '95,110p'Repository: deepmodeling/deepmd-kit
Length of output: 846
🏁 Script executed:
cat -n dpa_adapt/cli.py | sed -n '140,150p'Repository: deepmodeling/deepmd-kit
Length of output: 586
🏁 Script executed:
ast-grep outline dpa_adapt/cli.py --view compactRepository: deepmodeling/deepmd-kit
Length of output: 312
🏁 Script executed:
ast-grep outline dpa_adapt/cli.py --view expandedRepository: deepmodeling/deepmd-kit
Length of output: 1073
🏁 Script executed:
grep -n "_maybe_split_list" dpa_adapt/cli.pyRepository: deepmodeling/deepmd-kit
Length of output: 687
🏁 Script executed:
cat -n dpa_adapt/cli.py | sed -n '1,50p'Repository: deepmodeling/deepmd-kit
Length of output: 1792
🏁 Script executed:
cat -n dpa_adapt/cli.py | sed -n '78,84p'Repository: deepmodeling/deepmd-kit
Length of output: 436
🏁 Script executed:
cat -n dpa_adapt/cli.py | sed -n '354,450p'Repository: deepmodeling/deepmd-kit
Length of output: 4203
🏁 Script executed:
cat -n dpa_adapt/cli.py | sed -n '450,550p'Repository: deepmodeling/deepmd-kit
Length of output: 4584
fit command crashes because list args are passed to _maybe_split_list() expecting strings.
--train-data, --valid-data, and --aux-data use nargs="+", so argparse returns list[str]. However, the code passes these directly to _maybe_split_list(), which calls .split() only on string inputs, causing AttributeError.
Suggested fix
- train = _maybe_split_list(args.train_data) or [args.train_data]
- valid = _maybe_split_list(args.valid_data) if args.valid_data else None
+ train = list(args.train_data)
+ valid = list(args.valid_data) if args.valid_data else None
@@
- aux_data = (
- _maybe_split_list(args.aux_data) or [args.aux_data] if args.aux_data else None
- )
+ aux_data = list(args.aux_data) if args.aux_data else NoneAlso applies to: 145-146
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@dpa_adapt/cli.py` around lines 101 - 103, The fit command is crashing because
arguments from argparse with nargs="+" return lists, but are passed directly to
_maybe_split_list() which expects strings and tries to call .split() on them.
Fix this by checking if the input to _maybe_split_list() is already a list
(which happens when argparse returns list[str] from nargs="+"), and if so,
return it as-is without calling .split(). Otherwise, split the string as normal.
Apply this fix to the _maybe_split_list() function so it handles both string and
list inputs gracefully, which will resolve the AttributeError for
args.train_data, args.valid_data, and args.type_map at lines 101-103, and also
at lines 145-146.
| log_level = _get_ll(parsed_args.log_level) | ||
| _set_log_handles(log_level, parsed_args.log_path) | ||
|
|
||
| if parsed_args.command is None: | ||
| parser.print_help() | ||
| return | ||
|
|
There was a problem hiding this comment.
🩺 Stability & Availability | 🔴 Critical
🧩 Analysis chain
🏁 Script executed:
# First, let's look at the file structure and locate the relevant code
fd -t f "cli.py" dpa_adapt/Repository: deepmodeling/deepmd-kit
Length of output: 178
🏁 Script executed:
# Now let's examine the main() function and the argument parser setup
head -750 dpa_adapt/cli.py | tail -100Repository: deepmodeling/deepmd-kit
Length of output: 3470
🏁 Script executed:
# Let's also search for the parser and subparser definitions to understand the argument structure
rg "add_parser|add_argument|log_level|log_path" dpa_adapt/cli.py -B 2 -A 2Repository: deepmodeling/deepmd-kit
Length of output: 11038
🏁 Script executed:
# Let's look at the get_parser function more closely to see how parser_log is used
rg "parser_log|subparsers|add_parser" dpa_adapt/cli.py -B 1 -A 1 | head -100Repository: deepmodeling/deepmd-kit
Length of output: 2586
🏁 Script executed:
# Let's get a clearer picture - check the main parser setup and how parents are used
sed -n '600,750p' dpa_adapt/cli.pyRepository: deepmodeling/deepmd-kit
Length of output: 4741
🏁 Script executed:
# Test if log_level and log_path are actually set on the main parser when no subcommand
python3 << 'PY'
import sys
sys.path.insert(0, '.')
from dpa_adapt.cli import get_parser
parser = get_parser()
# Try parsing with no subcommand
parsed = parser.parse_args([])
# Check if log_level and log_path exist
print(f"Has log_level: {hasattr(parsed, 'log_level')}")
print(f"Has log_path: {hasattr(parsed, 'log_path')}")
print(f"Has command: {hasattr(parsed, 'command')}")
if hasattr(parsed, 'command'):
print(f"command value: {parsed.command}")
# Print all attributes
print(f"All attributes: {vars(parsed)}")
PYRepository: deepmodeling/deepmd-kit
Length of output: 274
Move logging setup after checking for missing subcommand.
When dpa-adapt is invoked without a subcommand, parsed_args only contains command=None and lacks the log_level and log_path attributes. Attempting to access these attributes before the command is None check causes an AttributeError, preventing the help message from being displayed.
Suggested fix
- # Set up logging
- log_level = _get_ll(parsed_args.log_level)
- _set_log_handles(log_level, parsed_args.log_path)
-
if parsed_args.command is None:
parser.print_help()
return
+
+ # Set up logging (subcommand parsers provide these options)
+ log_level = _get_ll(getattr(parsed_args, "log_level", "INFO"))
+ _set_log_handles(log_level, getattr(parsed_args, "log_path", None))📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| log_level = _get_ll(parsed_args.log_level) | |
| _set_log_handles(log_level, parsed_args.log_path) | |
| if parsed_args.command is None: | |
| parser.print_help() | |
| return | |
| if parsed_args.command is None: | |
| parser.print_help() | |
| return | |
| # Set up logging (subcommand parsers provide these options) | |
| log_level = _get_ll(getattr(parsed_args, "log_level", "INFO")) | |
| _set_log_handles(log_level, getattr(parsed_args, "log_path", None)) |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@dpa_adapt/cli.py` around lines 697 - 703, The logging setup using _get_ll()
and _set_log_handles() is being called before checking if parsed_args.command is
None, which causes an AttributeError when no subcommand is provided since
parsed_args lacks log_level and log_path attributes in that case. Move the
logging setup calls (_get_ll() and _set_log_handles()) to occur after the check
for parsed_args.command is None, so that help is printed and the function
returns before attempting to access the missing logging attributes.
| dim=-1, | ||
| ) | ||
| feat = torch.nan_to_num(feat, nan=0.0, posinf=0.0, neginf=0.0) | ||
| all_features.append(feat.cpu().numpy()) |
There was a problem hiding this comment.
🩺 Stability & Availability | 🔴 Critical
🧩 Analysis chain
🏁 Script executed:
# First, let's explore the file structure and read the relevant section
ast-grep outline dpa_adapt/finetuner.py | head -50Repository: deepmodeling/deepmd-kit
Length of output: 951
🏁 Script executed:
# Read the file to understand the context around line 522
cat -n dpa_adapt/finetuner.py | sed -n '510,535p'Repository: deepmodeling/deepmd-kit
Length of output: 1232
🏁 Script executed:
# Search for coord_t creation and requires_grad usage
rg -n "coord_t.*requires_grad" dpa_adapt/finetuner.pyRepository: deepmodeling/deepmd-kit
Length of output: 161
🏁 Script executed:
# Search for all requires_grad usage in the file
rg -n "requires_grad" dpa_adapt/finetuner.pyRepository: deepmodeling/deepmd-kit
Length of output: 200
🏁 Script executed:
# Check ruff linting issues in the file
ruff check dpa_adapt/finetuner.pyRepository: deepmodeling/deepmd-kit
Length of output: 48554
🏁 Script executed:
# Read the full function context around line 522
cat -n dpa_adapt/finetuner.py | sed -n '460,530p'Repository: deepmodeling/deepmd-kit
Length of output: 3443
🏁 Script executed:
# Search for coord_t creation and descriptor hook setup
rg -n "coord_t" dpa_adapt/finetuner.py -B 2 -A 2Repository: deepmodeling/deepmd-kit
Length of output: 677
🏁 Script executed:
# Check if there's any gradient manipulation in the descriptor extraction
rg -n "descriptor.*hook\|_enable_hook\|_disable_hook" dpa_adapt/finetuner.py -B 2 -A 2Repository: deepmodeling/deepmd-kit
Length of output: 161
Detach descriptor tensors before converting to NumPy.
Line 522 will crash at runtime because coord_t requires grad (line 491), so all derived tensors including feat require grad. PyTorch forbids calling .numpy() on gradient-enabled tensors.
Proposed fix
- all_features.append(feat.cpu().numpy())
+ all_features.append(feat.detach().cpu().numpy())🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@dpa_adapt/finetuner.py` at line 522, The line appending to all_features is
attempting to convert a gradient-enabled tensor to NumPy, which PyTorch forbids.
Since `feat` is derived from `coord_t` which requires grad (line 491), it
inherits gradient tracking. To fix this, detach the tensor from the computation
graph before converting it to NumPy by calling .detach() on `feat` before
.cpu().numpy() in the all_features.append call.
Summary
This PR adds DPA-ADAPT, a toolkit for adapting pretrained DPA models to downstream atomistic property prediction tasks.
The new package provides a scikit-learn-style Python API and standalone CLI for fine-tuning, descriptor extraction, prediction, evaluation, cross-validation, and data preparation, without requiring users to manually write DeePMD-kit training input files.
Main changes
dpa_adaptPython package.dpa-adaptdpaadfrozen_sklearn: frozen DPA descriptors with scikit-learn regressorsfrozen_head: train a property head on top of a frozen DPA backbonefinetune: end-to-end DPA fine-tuningmft: multi-task fine-tuning with auxiliary energy/force trainingfparam.npydoc/dpa_adapt/.examples/dpa_adapt/.dpa-adaptoptional dependencies inpyproject.toml.source/tests/dpa_adapt/.Co-authored-by: zirenjin <zirenjin@umich.edu>
Summary by CodeRabbit
Release Notes
New Features
dpa-adapt/dpaad) and Python APIDocumentation