Physics-constrained symbolic regression that discovers correction terms, not equations from scratch. The same logic that led from Newton to Einstein, from Rayleigh-Jeans to Planck.

Science rarely discovers from a blank slate; it corrects. ADCD automates the step between anomaly and theory correction: given a classical law and data that disagrees with it, it searches for the minimal physically valid correction term $\Delta$, passing every candidate through dimensional, asymptotic, and complexity gates before a single parameter is fit.
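To make the idea of gating before fitting concrete, here is a minimal sketch of a screening cascade. It is an illustration under stated assumptions, not ADCD's implementation: the `Candidate` record, the `screen` function, and the `max_nodes=15` threshold are all hypothetical, dimensions are supplied by hand rather than inferred from an AST, and the asymptotic gate is omitted.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical candidate record (illustrative, not ADCD's internal type)."""
    expr: str
    dims: dict            # base-unit exponents of the whole expression
    func_arg_dims: list   # dimensions of each exp/log/sin/tanh argument
    n_nodes: int          # AST node count used as a complexity proxy


def nonzero(dims):
    """Drop zero exponents so {'m': 0} compares equal to {}."""
    return {unit: power for unit, power in dims.items() if power != 0}


def screen(candidate, target_dims, max_nodes=15):
    """Return (passed, reason). The cheapest checks run first."""
    if candidate.n_nodes > max_nodes:                    # complexity gate
        return False, "complexity"
    if nonzero(candidate.dims) != nonzero(target_dims):  # dimensional homogeneity
        return False, "dimension"
    if any(nonzero(arg) for arg in candidate.func_arg_dims):
        return False, "transcendental"                   # exp/log need pure numbers
    return True, "ok"


ENERGY = {"kg": 1, "m": 2, "s": -2}
VELOCITY = {"m": 1, "s": -1}

good = Candidate("theta0 * m * v**4 / c**2", ENERGY, [], n_nodes=9)
wrong_units = Candidate("theta0 * m * v", {"kg": 1, "m": 1, "s": -1}, [], n_nodes=5)
bad_exp = Candidate("theta0 * m * v**2 * exp(v)", ENERGY, [VELOCITY], n_nodes=8)

for cand in (good, wrong_units, bad_exp):
    print(cand.expr, "->", screen(cand, ENERGY))
```

Only the first candidate would reach a parameter fit; the other two are rejected without any optimizer call, which is the point of putting the gates first.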
- Correction-First Paradigm: Starts from a known classical law, not a blank slate. Focuses the search space on the discrepancy $\Delta$ between theory and experiment.
- Cascaded Physics Gates: AST complexity, dimensional homogeneity, transcendental guardrails, and asymptotic consistency (ARC) gates screen out unphysical candidates before parameter fitting runs.
- JAX-Traced L-BFGS-B Optimizer: Differentiable, parameter-scaled fitting with multi-restart log-uniform initialization.
- BIC Model Selection: Ranks models by the Bayesian Information Criterion (BIC), favoring simpler physical theories over overly complex numerical fits.
- Residual Feature Intelligence: Extracts mathematical features (monotonicity, curvature, oscillation, decay) from residuals to bias proposal templates.
- Multivariable Discovery (Phase 2): Buckingham Π group decomposition, per-variable Sequential ARC, and variance-factorization separability detection for multi-input physical laws.
- Real-World Validated: Identifies the correct structural class on Mercury's perihelion (GR), the Lamb shift (QED), muon g-2 (Schwinger), and blackbody radiation (Planck).
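
As a rough illustration of how BIC ranking penalizes extra parameters, the sketch below uses the common least-squares form BIC = n ln(RSS/n) + k ln(n). This assumes Gaussian residuals; the exact expression used in ADCD's `metrics.py` may differ, and `bic_least_squares` is a hypothetical helper name.

```python
import math


def bic_least_squares(residuals, n_params):
    """BIC for a least-squares fit under a Gaussian error assumption."""
    n = len(residuals)
    rss = sum(r * r for r in residuals)
    return n * math.log(rss / n) + n_params * math.log(n)


# Toy data: a 0.5 * x^2 correction plus a small deterministic wiggle.
xs = [1.0 + 0.1 * i for i in range(40)]
ys = [0.5 * x * x + 0.01 * math.sin(7 * x) for x in xs]

# Closed-form least-squares fit of the one-parameter model theta * x^2.
theta = sum(x * x * y for x, y in zip(xs, ys)) / sum(x ** 4 for x in xs)
residuals = [y - theta * x * x for x, y in zip(xs, ys)]

# Suppose a 5-parameter model reached exactly the same residuals:
# BIC still prefers the 1-parameter model, by 4 * ln(n).
bic_simple = bic_least_squares(residuals, n_params=1)
bic_complex = bic_least_squares(residuals, n_params=5)
print(f"theta = {theta:.4f}")
print(f"BIC(1 param) = {bic_simple:.2f}, BIC(5 params) = {bic_complex:.2f}")
```

Lower BIC is better, so at equal fit quality the model with fewer free parameters wins.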

Install the stable package from PyPI:

    pip install adcd

Or install from source:

    git clone https://github.com/apiprdt/PhysicsPaper.git
    cd PhysicsPaper
    pip install -e ".[dev]"

Verify your installation:

    pytest tests/

Running ADCD on the predefined physics benchmarks takes only a few lines:

    import adcd

    # 1. Load a predefined benchmark scenario (e.g. Relativistic Kinetic Energy)
    scenarios = adcd.get_all_scenarios()
    scenario = scenarios[0]

    # 2. Run discovery in a single call
    result = adcd.discover_correction(scenario, max_iterations=5, proposer="mock")

    # 3. View the best fit
    print(f"Discovered correction: {result.best_expr}")      # θ₀ * (v/c)**2
    print(f"LaTeX representation: {result.export_latex()}")  # \theta_0 \left(\frac{v}{c}\right)^2
    print(f"Parameters: {result.best_theta}")
    print(f"BIC Score: {result.best_bic:.2f}")

    # 4. Plot residuals
    result.plot_residuals()

For custom datasets, use the adcd.fit function:

    import numpy as np
    import adcd

    # Your custom data
    x = np.linspace(1.0, 5.0, 100)
    X = {"x": x}
    y_classical = 2.0 * x
    y_observed = 2.0 * x + 0.5 * x**2  # True correction is 0.5 * x^2

    # Run ADCD
    result = adcd.fit(
        X=X,
        y_obs=y_observed,
        y_classical=y_classical,
        limit_variable="x",
        limit_direction="0",
        correction_mode="additive",
    )
    result.summary()

Headline (primary claim): a mean structural recovery of 80.4% (±7.4%) across sixteen independent seeds (95% bootstrap CI [76.7%, 84.0%]). The reference seed=42 below is disclosed explicitly as the highest-performing seed (94.4%); the mean, not the peak, is the claim. The full per-seed × per-noise breakdown ships in results/seed_distribution.json.
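
For readers unfamiliar with bootstrap intervals, the following is a minimal sketch of a percentile bootstrap CI for a mean over per-seed scores. The scores are placeholder values chosen for illustration only; they are not ADCD results, and the resampling settings (`n_resamples`, `seed`) are assumptions rather than the project's actual configuration.

```python
import random
import statistics


def bootstrap_ci_mean(values, n_resamples=10_000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`."""
    rng = random.Random(seed)
    n = len(values)
    # Resample with replacement, record the mean of each resample, then sort.
    means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_resamples)
    )
    lo_idx = int((1 - level) / 2 * n_resamples)
    hi_idx = int((1 + level) / 2 * n_resamples) - 1
    return means[lo_idx], means[hi_idx]


# Placeholder per-seed recovery rates for 16 seeds (NOT real ADCD numbers).
placeholder_scores = [0.70, 0.75, 0.80, 0.85] * 4

mean_score = statistics.fmean(placeholder_scores)
ci_low, ci_high = bootstrap_ci_mean(placeholder_scores)
print(f"mean = {mean_score:.3f}, 95% CI = [{ci_low:.3f}, {ci_high:.3f}]")
```

The interval describes uncertainty in the mean across seeds, which is why it is narrower than the spread between the best and worst individual seeds.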
| Noise level | ADCD mean (16 seeds) | ADCD worst seed | ADCD best (seed=42) |
|---|---|---|---|
| 0% | 86.8% (±9.8%) | 66.7% (6/9) | 100% (9/9) |
| 1% | 81.2% (±14.6%) | 44.4% (4/9) | 100% (9/9) |
| 5% | 77.1% (±10.0%) | 66.7% (6/9) | 88.9% (8/9) |
| 10% | 76.4% (±12.3%) | 55.6% (5/9) | 88.9% (8/9) |
| Overall | 80.4% (±7.4%) | 69.4% (25/36) | 94.4% (34/36) |
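
The Overall row can be cross-checked against the per-noise rows with a few lines of arithmetic. The sketch below only re-derives numbers already present in the table; the variable names are illustrative.

```python
# Values copied from the table above.
mean_by_noise = {"0%": 86.8, "1%": 81.2, "5%": 77.1, "10%": 76.4}
best_seed_hits = {"0%": 9, "1%": 9, "5%": 8, "10%": 8}
worst_hits_per_noise = {"0%": 6, "1%": 4, "5%": 6, "10%": 5}
SCENARIOS = 9

total_runs = SCENARIOS * len(mean_by_noise)              # 36 runs per seed
overall_mean = sum(mean_by_noise.values()) / len(mean_by_noise)
best_seed_pct = 100 * sum(best_seed_hits.values()) / total_runs

# Summing the per-noise worst cases gives a floor, not the worst seed itself:
# each per-noise minimum may come from a different seed.
floor_hits = sum(worst_hits_per_noise.values())

print(f"overall mean   = {overall_mean:.1f}%")
print(f"best seed      = {best_seed_pct:.1f}%")
print(f"floor on worst = {floor_hits}/{total_runs}")
```

The floor of 21/36 sits below the reported worst seed (25/36), which is consistent with the per-noise worst cases belonging to different seeds.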
| Scenario | Tier | 0% Noise | 1% Noise | 5% Noise | 10% Noise |
|---|---|---|---|---|---|
| Relativistic KE | Textbook | β | β | β | β |
| Yukawa Gravity | Textbook | β | β | β | β |
| Anharmonic Spring | Textbook | β | β | β | β |
| Screened Coulomb | Cross-Domain | β | β | β | β |
| Net Radiation | Cross-Domain | β | β | β | β |
| Nonlinear Drag | Cross-Domain | β | β | β | β |
| Mystery-A (tanh²) | Synthetic | β | β | β | β |
| Mystery-B (sinc) | Synthetic | β | β | β | β |
| Mystery-C (log-quotient) | Synthetic | β | β | β | β |
| Overall | | 100% | 100% | 88.9% | 88.9% |
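The gap holds across seeds in aggregate: over all four noise levels combined, ADCD's worst of 16 seeds (69.4%, 25/36) beats both PySR fair and PySR with doubled budget (15/36, 41.7% each). Per noise level the ordering is less uniform: at 1% noise the worst-seed figure (44.4%) falls below PySR fair (55.6%) and only ties PySR generous, and at 10% noise it ties PySR fair (55.6%).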
| Method | 0% noise | 1% noise | 5% noise | 10% noise |
|---|---|---|---|---|
| ADCD (ours, seed=42) | 9/9 (100%) | 9/9 (100%) | 8/9 (88.9%) | 8/9 (88.9%) |
| ADCD multi-seed mean | 86.8% | 81.2% | 77.1% | 76.4% |
| ADCD worst of 16 seeds | 66.7% | 44.4% | 66.7% | 55.6% |
| PySR fair (100 iter, 60s) | 4/9 (44.4%) | 5/9 (55.6%) | 1/9 (11.1%) | 5/9 (55.6%) |
| PySR generous (2× budget) | 4/9 (44.4%) | 4/9 (44.4%) | 5/9 (55.6%) | 2/9 (22.2%) |
At 5% noise the gap is +66.0 points (ADCD multi-seed mean 77.1% vs PySR fair 11.1%). Doubling PySR's budget (generous, 55.6%) does not close it, and that doubled-budget figure still sits below ADCD's worst of 16 seeds at this noise level (66.7%). PySR was run once per (scenario, noise) pair, whereas ADCD was run across 16 seeds. PySR's recovery is non-monotonic in noise; ADCD's multi-seed mean declines steadily from 86.8% to 76.4%.
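
The gap and monotonicity statements above follow directly from the comparison table. This short sketch recomputes them from the tabulated values; nothing here is new data, and the helper names are illustrative.

```python
# Values copied from the comparison table above (9 scenarios per noise level).
noise_levels = ["0%", "1%", "5%", "10%"]
adcd_mean_pct = [86.8, 81.2, 77.1, 76.4]
pysr_fair_hits = [4, 5, 1, 5]
pysr_generous_hits = [4, 4, 5, 2]
SCENARIOS = 9


def pct(hits):
    """Convert a hit count out of SCENARIOS into a percentage."""
    return 100 * hits / SCENARIOS


def is_non_increasing(values):
    """True if each value is no larger than the one before it."""
    return all(a >= b for a, b in zip(values, values[1:]))


pysr_fair_pct = [pct(h) for h in pysr_fair_hits]
gaps = [round(a - p, 1) for a, p in zip(adcd_mean_pct, pysr_fair_pct)]

for level, gap in zip(noise_levels, gaps):
    print(f"{level:>3} noise: ADCD mean minus PySR fair = {gap:+.1f} points")

print("ADCD mean non-increasing in noise:", is_non_increasing(adcd_mean_pct))
print("PySR fair non-increasing in noise:", is_non_increasing(pysr_fair_pct))
print("PySR totals (fair, generous):", sum(pysr_fair_hits), sum(pysr_generous_hits))
```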
| Scenario | Variables | ADCD Solved | Notes |
|---|---|---|---|
| Yukawa Mass-Ratio | m, M, r, r₀ | β | Π groups: m/M, r/r₀ |
| Turbulent Drag | v, ρ, A, C_D | β | Separable multiplicative |
| Coupled Oscillator | k, m, Ω, ω₀ | β | Mixed functional form |
| Van der Waals MV | a, b, P, V, T | β | Requires 3rd Π group |
| Overall | | 2/4 (50%) | Baseline: 0/4 |
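
The Π groups listed for the Yukawa Mass-Ratio scenario can be reproduced with a small null-space computation on the dimension matrix. This is a generic Buckingham Π sketch, not the code in `buckingham_pi.py`; it assumes m and M carry mass dimension and r and r₀ carry length dimension, and the `nullspace` helper is written here for illustration.

```python
from fractions import Fraction


def nullspace(matrix):
    """Rational null-space basis of `matrix` (list of rows) via Gauss-Jordan."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    n_cols = len(rows[0])
    pivots = []
    r = 0
    for c in range(n_cols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot in this column: it is a free variable
        rows[r], rows[pivot] = rows[pivot], rows[r]
        rows[r] = [v / rows[r][c] for v in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
        if r == len(rows):
            break
    basis = []
    for free in (c for c in range(n_cols) if c not in pivots):
        vec = [Fraction(0)] * n_cols
        vec[free] = Fraction(1)
        for row_idx, p in enumerate(pivots):
            vec[p] = -rows[row_idx][free]
        basis.append(vec)
    return basis


names = ["m", "M", "r", "r0"]
# Rows are base dimensions (mass, length); columns follow `names`.
dimension_matrix = [
    [1, 1, 0, 0],  # mass exponents
    [0, 0, 1, 1],  # length exponents
]

groups = nullspace(dimension_matrix)
for exponents in groups:
    terms = [f"{n}^{e}" for n, e in zip(names, exponents) if e != 0]
    print(" * ".join(terms))
```

Each basis vector gives the exponents of one dimensionless product. The sketch returns M/m and r₀/r, the reciprocals of the table's m/M and r/r₀; any power of a Π group is an equally valid Π group, so the two choices span the same set.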
Validation on historical anomalies using physical constants from JPL DE440, NIST, and CODATA:
| Physical Scenario | Discovered Correction | Converged | Class Match | NMSE |
|---|---|---|---|---|
| Mercury Perihelion (GR) | θ₀·vc² | β | β polynomial | 1.11e-05 |
| Hydrogen Lamb Shift (QED) | θ₀(n/θ₁)^(-θ₂) | β | β power_law | 1.82e-18 |
| Muon g-2 (Schwinger) | θ₀(α/π)^θ₁ | β | β polynomial | 7.94e-07 |
| Blackbody (Planck) | -1 + e^(-f/θ₀) | β | β exponential | 2.59e-02 |
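
To see why the muon g-2 template has the right structure, note that Schwinger's leading-order QED term, a = α/(2π), is the special case θ₀ = 1/2, θ₁ = 1 of θ₀(α/π)^θ₁. The sketch below evaluates that special case; it does not reproduce ADCD's fitted parameters, which are not listed here. The value of α is the CODATA 2018 fine-structure constant, and `correction_template` is an illustrative name.

```python
import math

ALPHA = 7.2973525693e-3  # fine-structure constant, CODATA 2018


def correction_template(x, theta0, theta1):
    """The structural form theta0 * x**theta1 from the table row."""
    return theta0 * x ** theta1


# Schwinger's term corresponds to theta0 = 1/2, theta1 = 1 with x = alpha / pi.
schwinger = correction_template(ALPHA / math.pi, theta0=0.5, theta1=1.0)
print(f"alpha / (2 pi) = {schwinger:.7f}")
```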

    adcd-v3.0.0/
    ├── src/adcd/                        # Installable package
    │   ├── __init__.py                  # Public API (fit, discover_correction)
    │   ├── anomaly_scenarios.py         # 9 standard + 3 blind + 4 multivariable scenarios
    │   ├── arc_scorer.py                # Asymptotic consistency gate (ARC)
    │   ├── buckingham_pi.py             # [Phase 2] Buckingham Π group engine
    │   ├── coarse_evaluator.py          # Coarse numerical pre-filter
    │   ├── correction_orchestrator.py   # Main multi-iteration discovery loop
    │   ├── dimensional_checker.py       # Dimensional homogeneity + transcendental gate
    │   ├── jax_optimizer.py             # JAX L-BFGS-B optimizer
    │   ├── llm_proposer.py              # Mock + Gemini + OpenAI proposers
    │   ├── metrics.py                   # NMSE, BIC, structural classification
    │   ├── multivar_orchestrator.py     # [Phase 2] Multivariable correction pipeline
    │   ├── pipeline.py                  # Stage 1 filter cascade
    │   ├── real_data_loader.py          # Real-world data loading (JPL, NIST, CODATA)
    │   ├── residual_factorizer_v2.py    # [Phase 2] Variance-decomposition separability
    │   ├── result.py                    # CorrectionResult object
    │   └── sequential_arc.py            # [Phase 2] Per-variable Sequential ARC checker
    ├── tests/                           # Unit + integration tests
    ├── paper/                           # LaTeX source (main.tex) + figures
    ├── data/                            # Input datasets (SPARC, cosmic chronometers, growth rate)
    ├── scripts/                         # Table generation and verification scripts
    ├── run_correction_discovery.py      # Benchmark runner
    └── README.md                        # This file

If you use ADCD in your research, please cite:

    @software{erdita2026adcd,
      author    = {Erdita, Muhammad Afif},
      title     = {{Anomaly-Driven Correction Discovery (ADCD): Physics-Constrained
                   Symbolic Regression for Evolutionary Scientific Discovery}},
      year      = {2026},
      publisher = {Zenodo},
      version   = {3.0.0},
      doi       = {10.5281/zenodo.20534940},
      url       = {https://doi.org/10.5281/zenodo.20534940}
    }

Every quantitative claim in this project is reproducible from committed scripts. No number is hand-typed.

    # Regenerate the 9-scenario benchmark (seed=42)
    python run_correction_discovery.py

    # Multi-seed study (16 seeds × 9 scenarios × 4 noise levels)
    python run_reproducibility.py

    # Build the per-seed × per-noise anti-cherry-pick artifact
    python scripts/generate_seed_distribution.py  # → results/seed_distribution.json

    # Guard: fails loudly if any headline number drifts
    python scripts/verify_paper_claims.py

    # SPARC MOND robustness study
    python -m adcd.experiments.sparc_robustness

The full test suite must pass before any release:

    pytest tests/ -q

This project is licensed under the MIT License.
