Skip to content

Unit system overhaul#217

Open
henrikjacobsenfys wants to merge 38 commits into
developfrom
unit-system-overhaul
Open

Unit system overhaul#217
henrikjacobsenfys wants to merge 38 commits into
developfrom
unit-system-overhaul

Conversation

@henrikjacobsenfys

@henrikjacobsenfys henrikjacobsenfys commented Jun 30, 2026

Copy link
Copy Markdown
Member

PR: Unit system overhaul

Branch: unit-system-overhauldevelop


Summary

The single unit attribute on all model objects is replaced by two separate attributes, x_unit (independent variable / energy axis) and y_unit (model output / intensity). This distinction was needed to correctly handle parameter units, area normalisation, and unit conversion in ParameterAnalysis. Several pre-existing bugs were found and fixed during the work.


Unit system changes

EasyDynamicsModelBase

The single unit parameter and property is replaced by x_unit (default 'meV') and y_unit (default 'dimensionless'). Both properties are read-only via setters that raise AttributeError; the corresponding mutation methods are convert_x_unit and convert_y_unit. The backing fields _unit_x_unit / _y_unit are updated everywhere in the hierarchy.

ModelComponent and all concrete components

x_unit and y_unit are threaded through Lorentzian, Gaussian, DeltaFunction, Exponential, Voigt, DampedHarmonicOscillator, Polynomial, and ExpressionComponent. Each component gains convert_x_unit and convert_y_unit methods with rollback on failure.

The evaluate(x) method on all components gains an output keyword ('numpy' by default, 'scipp' to return a sc.Variable with y_unit attached).

_prepare_x_for_evaluate now returns a (x_vals, detected_unit, dim) tuple instead of just the raw values. A new helper _resolve_param_value converts each parameter into the unit detected from x at call time, so components handle mixed units transparently without mutating any stored parameters.

Area parameter unit is now x_unit * y_unit

Previously area was stored in the same unit as the energy axis. The CreateParametersMixin._create_area_parameter helper now derives the area unit as str(sc.Unit(x_unit) * sc.Unit(y_unit)). For the default y_unit='dimensionless' this is unchanged, but it makes the area semantics correct when a physical y_unit is set.

ComponentCollection

convert_unit is split into convert_x_unit and convert_y_unit, both with atomic rollback. The evaluate and evaluate_component methods gain the same output keyword. to_dict / from_dict are updated accordingly ('unit' key → 'x_unit' / 'y_unit').

ModelBase, SampleModel, InstrumentModel, ResolutionModel, BackgroundModel

All accept x_unit / y_unit constructor arguments and propagate convert_x_unit / convert_y_unit down to their ComponentCollection children.

Convolution stack (ConvolutionBase, NumericalConvolutionBase, Convolution, AnalyticalConvolution, NumericalConvolution)

unitx_unit / y_unit throughout. ConvolutionBase.convert_x_unit is a transaction that converts the energy array, energy offset, and both component collections atomically (with rollback). A new ConvolutionBase.convert_y_unit propagates the new unit to the sample components only (resolution is dimensionless by construction). Convolution.convert_y_unit additionally propagates to its internal analytical and numerical sub-convolvers.

DiffusionModelBase and diffusion models

unitx_unit / y_unit signature throughout DiffusionModelBase, DeltaLorentz, BrownianTranslationalDiffusion, and JumpTranslationalDiffusion.

Analysis1d

References to self.unit (the old single unit) replaced by self.x_unit wherever the energy axis is meant.

ParameterAnalysis

A new private method _get_unit_conversions(binding, pname) computes (x_factor, y_factor) scale factors that convert the stored Q-coordinate and parameter units into the model's declared x_unit / y_unit. The factors are applied before passing data to the model callable and to the fit, and are also applied when building the model overlay curves. DiffusionModelBase bindings always return (1.0, 1.0) because their callables take raw Q values.

utils.py

Two new public helpers: verify_Q_index(Q_index, Q) (shared validation extracted from Experiment and AnalysisBase); energy_to_scipp(energy, unit) (converts a raw numpy array into a sc.Variable consistently).

Experiment

_extract_x_y_weights_only_finite is made public (extract_x_y_weights_only_finite). Two new public methods are added: get_finite_energy_mask(Q_index) returns a boolean sc.Variable selecting finite points; get_masked_binned_data(Q_index) returns the single-Q data slice with non-finite points removed.


Bug fixes

1. ModelComponent.x_unit / y_unit returning 'None' as a string

What broke: x_unit and y_unit properties on ModelComponent were reading the old backing field _unit directly via str(self._unit). After the rename to _x_unit / _y_unit the field was None, so the property returned the literal string 'None' instead of None.

Fix: The properties now guard with str(self._x_unit) if self._x_unit is not None else None.

Regression test: test_y_unit_default, test_y_unit_setter_raises added for ModelComponent and every concrete component class.


2. ParameterAnalysis.fit and calculate_model_dataset ignoring unit mismatches

What broke: When fitting a Q-dependent model (e.g. a Lorentzian declared with x_unit='1/angstrom') against extracted parameters stored in different units (e.g. Q in '1/m'), the raw stored values were passed directly to the model callable without any conversion. This silently produced numerically wrong fits by several orders of magnitude.

Fix: _get_unit_conversions computes the correct x_factor and y_factor using sc.to_unit; both methods apply the factors before fitting and before building overlays.

Regression tests: test_fit_converts_x_unit, test_fit_converts_y_unit, test_calculate_model_dataset_converts_x_unit, test_calculate_model_dataset_converts_y_unit, test_fit_incompatible_x_unit_raises, test_fit_incompatible_y_unit_raises in test_parameter_analysis.py.


3. ParameterAnalysis._get_xyweight_from_dataset crashing on NaN variances

What broke: When only a subset of Q values has a fitted value for a given parameter, parameters_to_dataset fills np.nan for the missing rows. The old weight calculation saw np.nan variance and raised ValueError('Non-finite variances found'), making it impossible to fit Q-dependent models over partial Q ranges.

Fix: NaN variances are now identified separately (nan_mask) and filtered silently. Only non-NaN non-finite or non-positive variances raise; only the valid rows are returned as (q_values[valid_mask], values[valid_mask], 1/sqrt(variances[valid_mask])).

Regression tests: test_get_xyweight_from_dataset_nan_variance_filters_row, test_get_xyweight_from_dataset_all_nan_variances_raises in test_parameter_analysis.py.


4. FitBinding._get_parameter_names returning wrong names when 'delta' was in the mode list

What broke: For DiffusionModelBase bindings, if 'delta' appeared anywhere in the mode list, the method returned '{name} area' for every mode in the list — including non-delta modes like 'lorentzian'. This caused the fitter to look for a parameter named e.g. 'D area' instead of 'D lorentzian', making diffusion fitting silently broken.

Fix: The special if 'delta' in modes branch is removed. The method now simply returns [f'{name} {mode}' for mode in modes] for all diffusion models.

Regression test: test_build_diffusion_callable_delta_mode in test_fit_binding.py.


5. FitBinding treating a bare string modes as a character iterator

What broke: FitBinding(parameter_name='D', model=model, modes='lorentzian') — passing a single string instead of a list — caused the mode list to be iterated character by character (['l', 'o', 'r', ...]) rather than treated as a single mode.

Fix: A guard added in __init__: if isinstance(modes, str): modes = [modes].

Regression test: test_init_with_string_modes in test_fit_binding.py.


6. Energy offset unit mismatch raising an error for compatible units

What broke: Analysis1d._apply_energy_offset checked energy.unit != energy_offset.unit and, if different, attempted energy_offset.convert_unit(str(energy.unit)) — which permanently mutated the stored energy_offset parameter. If units were compatible but not identical (e.g. energy grid in meV, offset parameter in eV), this silently changed the parameter's unit, and in the convolution layer the same energy_offset.value was subtracted without conversion.

Fix: Both Analysis1d._apply_energy_offset and ConvolutionBase._energy_with_offset now use sc.to_unit(energy_offset.full_value, energy.unit).value — a non-mutating conversion — so the stored parameter is never changed.

Regression tests: test_energy_with_offset_unit_conversion in test_convolution_base.py; test_convolution_with_energy_offset_in_different_unit in test_numerical_convolution.py.


7. Analysis1d.calculate mutating the stored convolver

What broke: calculate(energy=...) (the public plotting path) overwrote self._convolver with a convolver built for the plotting energy grid, then immediately set self._convolver_is_dirty = True. The dirty flag was intended to force a rebuild on the next fit(), but it also meant the convolver was always rebuilt on the next fit() call even if nothing had changed — adding unnecessary cost. Worse, if fit() was called very quickly after calculate(), the state was confusing.

Fix: calculate now creates a local convolver variable for its own use and passes it directly to _calculate, leaving self._convolver completely untouched.


8. Analysis._ensure_analysis_list_current clearing the dirty flag when Q is None

What broke: The logic was:

if self._analysis_list_is_dirty:
    if self.Q is not None:
        self._create_analysis_list()
    self._analysis_list_is_dirty = False   # ← ran even when rebuild was skipped

If Q was not yet set, the dirty flag was cleared without rebuilding. A subsequent analysis.Q = ... assignment would then not trigger a rebuild because the flag was already clean.

Fix: Collapsed to if self._analysis_list_is_dirty and self.Q is not None: ... self._analysis_list_is_dirty = False, so the flag is only cleared when the rebuild actually runs.

Regression test: test_ensure_analysis_list_current_stays_dirty_when_Q_is_none in test_analysis.py.


9. Analysis1d._on_Q_index_changed crashing when Q index was set to None

What broke: During Analysis1d.__init__, Q_index is initialised to None. The _on_Q_index_changed callback immediately tried to call self.experiment.get_masked_energy(Q_index=None), which raised because None is not a valid index.

Fix: Early return added: if self._Q_index is None: self._masked_energy = None; self._convolver_is_dirty = True; return.


10. Analysis1d._on_experiment_changed leaving _masked_energy stale

What broke: Swapping the experiment on an already-initialised Analysis1d (i.e. after Q_index was set) called the base class handler but did not re-fetch the masked energy for the current Q index. Subsequent calls to _calculate used the energy grid from the old experiment.

Fix: A check is added that re-fetches _masked_energy when both _Q_index and the new experiment are non-None.


11. binned_data['Q', idx] bypassing data masks in residuals and plots

What broke: Analysis1d._create_residuals_array and data_and_model_to_datagroup used self.experiment.binned_data['Q', self.Q_index] directly, giving the full data slice including NaN / masked points. The model prediction, however, was evaluated only on the finite energy grid (via get_masked_energy). When NaN points existed, the two arrays had different lengths and the subtraction crashed.

Fix: Both sites now call self.experiment.get_masked_binned_data(Q_index=self.Q_index), which applies the same finite-energy mask as the model path.

Regression tests: test_create_residuals_array_with_nan_data_does_not_crash, test_data_and_model_to_datagroup_with_nan_excludes_nan_from_data in test_analysis1d.py. test_fit_all_Q_simultaneously_with_nan_data in test_analysis.py. New Experiment methods are covered by test_get_masked_binned_data_with_data, test_get_finite_energy_mask_with_data in test_experiment.py.


12. Shared ConvolutionSettings across Q indices in multi-Q Analysis

What broke: All Analysis1d child objects created by Analysis._create_analysis_list received the same ConvolutionSettings instance. convolution_plan_is_valid is a flag on that object. When any one Q slice invalidated the plan (e.g. by updating energy), the flag was cleared for every other Q slice too, forcing unnecessary full plan rebuilds.

Fix: Each Analysis1d now receives its own copy(self.convolution_settings) so the validity flag is tracked independently per Q index.


13. NumericalConvolution detailed balance factor missing the even-length offset

What broke: NumericalConvolution.convolution applied the energy-even-length offset when evaluating the sample model values, but the detailed_balance_factor was computed on energy_dense - energy_offset (without the even-length offset). This meant the detailed balance correction used a slightly shifted energy grid relative to the sample model, introducing a small asymmetry artefact on even-length energy arrays.

Fix: The detailed balance factor is now computed on energy_dense - energy_even_length_offset - offset_value, consistent with how the sample model is evaluated.

Regression test: test_detailed_balance_energy_includes_even_length_offset in test_numerical_convolution.py.


14. NumericalConvolutionBase.energy setter not updating _energy_grid

What broke: Setting a new energy array via the energy property setter invalidated convolution_plan_is_valid but did not rebuild _energy_grid. Any calculation performed before the next full plan build would use a stale energy grid.

Fix: self._energy_grid = self._create_energy_grid() is called inside the setter.


15. analysis.parameters_to_dataset empty-list names check

What broke: if not names: is True for both None and an empty list []. Calling parameters_to_dataset(names=[]) was intended to return an empty dataset but instead returned the full dataset (all parameters).

Fix: Changed to if names is None: so an empty list correctly produces an empty output.


16. analysis.parameters_to_dataset permanently mutating parameters during unit conversion

What broke: When two parameters with the same name but different units were encountered, the code called p.convert_unit(target_unit) on the live Parameter object to normalise units before inserting into the dataset. This permanently changed the unit of the parameter in the model, so subsequent fits or evaluations used the wrong unit.

Fix: p = copy(p) before calling p.convert_unit(...), so the stored parameter is never mutated.


henrikjacobsenfys and others added 30 commits June 22, 2026 22:30
Incorporated docstring examples from develop while preserving the unit-system-overhaul
API (x_unit/y_unit split). Kept HEAD repr format and API throughout; updated
ExpressionComponent class docstring example from unit= to x_unit=.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Develop's cleanly-merged changes introduced self.unit (old API) in __repr__ methods
of ResolutionModel, Convolution, NumericalConvolution, AnalyticalConvolution, and
BackgroundModel. Updated all to use self.x_unit. Added missing __init__ docstrings
to ResolutionModel and SampleModel. Updated test assertions to match HEAD repr format.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
henrikjacobsenfys and others added 3 commits June 30, 2026 16:57
Covers previously untested branches added in the unit-system overhaul:
- FitBinding: string modes in __init__, delta mode in _build_diffusion_callable
- ConvolutionBase: convert_x/y_unit when components are None
- ParameterAnalysis: no-variance and all-NaN-variance paths in
  _get_xyweight_from_dataset; None x_unit and None y_unit in
  _get_unit_conversions
- Experiment: load_hdf5 TypeError for non-DataArray return value;
  get_finite_energy_mask and get_masked_binned_data with and without data
- Analysis / Analysis1d: __repr__ methods

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…backing-field reads with property calls

When _x_unit or _y_unit was None, ModelComponent.x_unit returned the string
'None', making unit-None checks always evaluate as truthy. Fixed the properties
to return None when the backing field is None, then replaced all ~90 direct
self._x_unit/self._y_unit reads across 13 source files with the property so the
fix propagates everywhere. Also corrects the InstrumentModel.x_unit annotation
(adds | None) and upgrades two SimpleNamespace test mocks to real Polynomial
objects now that the property behaves correctly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@henrikjacobsenfys henrikjacobsenfys added [scope] enhancement Adds/improves features (major.MINOR.patch) [priority] medium Normal/default priority labels Jun 30, 2026
@codecov

codecov Bot commented Jun 30, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 93.46734% with 39 lines in your changes missing coverage. Please review.
✅ Project coverage is 97.94%. Comparing base (ee07ccf) to head (97fece1).

Files with missing lines Patch % Lines
src/easydynamics/analysis/analysis1d.py 63.15% 4 Missing and 3 partials ⚠️
src/easydynamics/sample_model/model_base.py 79.41% 5 Missing and 2 partials ⚠️
src/easydynamics/convolution/convolution_base.py 87.50% 1 Missing and 4 partials ⚠️
...ynamics/sample_model/components/model_component.py 93.33% 5 Missing ⚠️
src/easydynamics/convolution/convolution.py 57.14% 1 Missing and 2 partials ⚠️
...cs/sample_model/components/expression_component.py 87.50% 1 Missing and 1 partial ⚠️
src/easydynamics/sample_model/components/mixins.py 60.00% 1 Missing and 1 partial ⚠️
...easydynamics/sample_model/components/polynomial.py 95.74% 1 Missing and 1 partial ⚠️
src/easydynamics/sample_model/sample_model.py 75.00% 1 Missing and 1 partial ⚠️
src/easydynamics/utils/utils.py 90.00% 2 Missing ⚠️
... and 2 more
Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff             @@
##           develop     #217      +/-   ##
===========================================
+ Coverage    97.89%   97.94%   +0.04%     
===========================================
  Files           52       52              
  Lines         3653     3934     +281     
  Branches       650      680      +30     
===========================================
+ Hits          3576     3853     +277     
- Misses          45       47       +2     
- Partials        32       34       +2     
Flag Coverage Δ
unittests 97.94% <93.46%> (+0.04%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
src/easydynamics/analysis/analysis.py 96.39% <100.00%> (+1.35%) ⬆️
src/easydynamics/analysis/analysis_base.py 100.00% <100.00%> (ø)
src/easydynamics/analysis/fit_binding.py 100.00% <100.00%> (+5.12%) ⬆️
src/easydynamics/analysis/parameter_analysis.py 99.53% <100.00%> (+1.16%) ⬆️
...asydynamics/base_classes/easydynamics_modelbase.py 100.00% <100.00%> (ø)
...easydynamics/convolution/analytical_convolution.py 98.76% <ø> (ø)
.../easydynamics/convolution/numerical_convolution.py 100.00% <100.00%> (+4.16%) ⬆️
...dynamics/convolution/numerical_convolution_base.py 97.39% <100.00%> (+0.02%) ⬆️
src/easydynamics/experiment/experiment.py 98.90% <100.00%> (+1.24%) ⬆️
src/easydynamics/sample_model/background_model.py 100.00% <ø> (ø)
... and 24 more

... and 1 file with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

[priority] medium Normal/default priority [scope] enhancement Adds/improves features (major.MINOR.patch)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Allow suppressing warnings about negative values in polynomial Make unit handling more robust

1 participant