Skip to content

Data loading robustness: 5-column files crash, deprecated sc.concatenate, silent ORSO fallback, unvalidated error column #376

Description

@rozyczko

Summary

Several robustness gaps in src/easyreflectometry/data/measurement.py and orso_utils.py:

  1. 5+-column files crash. _load_txt (measurement.py:96-100) unpacks exactly 4 names in the num_columns >= 4 branch; a 5-column file (q, R, sR, sQ, lambda — common) raises "too many values to unpack" instead of taking the first four.
  2. sc.concatenate is the old scipp API. merge_datagroups calls sc.concatenate([a, b]) (measurement.py:131, 137); current scipp uses sc.concat(list, dim). No test covers the merge path, so this is likely broken at runtime.
  3. Silent format fallback. load (measurement.py:24-27) catches (IndexError, ValueError) from the ORSO loader and falls back to the bare-text loader — a malformed .ort loses its header (and resolution-column semantics, metadata) without any warning. Also load_orso_data (orso_utils.py:230) assumes 4 columns, so a legal 3-column .ort (sQz optional in the ORSO spec) takes this silent fallback path.
  4. Error-column semantics unvalidated. _load_txt treats column 3 as sigma and squares it (measurement.py:108) — correct for the common convention but undocumented; a file carrying variance is silently mis-scaled by sigma^4.
  5. ORSO SLD units read heuristically. _get_sld_values (orso_utils.py:166-204) assumes file SLDs are absolute A^-2 and multiplies by 1e6, ignoring the unit field orsopy provides; the >1e-2 magnitude warning is a guard, but reading the declared unit would be correct rather than heuristic.

Suggested fix

Take the first four columns explicitly (data[:, :4].T), move to sc.concat with a test, warn on ORSO→txt fallback, document/validate the error-column convention, and honor orsopy's unit metadata.

Found during deep code review (DEEP_ANALYSIS.md §4.9).

Metadata

Metadata

Assignees

No one assigned

    Labels

    [priority] mediumNormal/default priority[scope] bugBug report or fix (major.minor.PATCH)
    No fields configured for issues without a type.

    Projects

    Status
    Bugs

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions