Skip to content

Migrate core MLflow integration from file store to SQLite#8894

Open
Lawson-Darrow wants to merge 5 commits into
Project-MONAI:devfrom
Lawson-Darrow:mlflow-sqlite-migration
Open

Migrate core MLflow integration from file store to SQLite#8894
Lawson-Darrow wants to merge 5 commits into
Project-MONAI:devfrom
Lawson-Darrow:mlflow-sqlite-migration

Conversation

@Lawson-Darrow
Copy link
Copy Markdown

Closes #8891. Implements Option 2 from the issue discussion.

MLflow 3.13+ raises for the filesystem (file store) tracking backend unless MLFLOW_ALLOW_FILE_STORE=true is set. This migrates MONAI core MLflow usage off the file store to the recommended SQLite backend.

  • MLFlowHandler defaults to sqlite:///<cwd>/mlruns.db when no tracking_uri is given and MLFLOW_TRACKING_URI is unset. For any local SQLite tracking_uri it derives an mlruns artifact directory next to the database file, mirroring the previous layout. Remote and other backends are left to MLflow to decide.
  • Adds an optional artifact_location argument to MLFlowHandler to override the derived location. Any explicit tracking_uri (including a file:// URI) or env var is passed through unchanged.
  • DEFAULT_MLFLOW_SETTINGS now uses a SQLite tracking_uri under <output_dir>/mlruns.db, keeping artifacts under <output_dir>/mlruns.
  • Adds monai.utils.path_to_sqlite_uri to build a cross-platform SQLite URI.
  • Migrates the bundle-run and federated-learning tests off file-store tracking URIs and cleans up the derived artifact directory.

The model zoo bundles that pass an explicit file URI (e.g. vista3d) are left for a separate PR as discussed in the issue.

Unit tests cover the SQLite default, explicit tracking_uri passthrough, explicit and derived artifact_location, remote tracking_uri (no local artifact root injected), and a basic run flow.

MLflow 3.13+ raises for the filesystem (file store) tracking backend unless
MLFLOW_ALLOW_FILE_STORE=true is set. Migrate MONAI core MLflow usage off the
file store to the recommended SQLite backend.

- MLFlowHandler defaults to sqlite:///<cwd>/mlruns.db when no tracking_uri is
  given and MLFLOW_TRACKING_URI is unset. For any local SQLite tracking_uri it
  derives an mlruns artifact directory next to the database file, mirroring the
  old layout. Remote and other backends are left to MLflow to decide.
- Add an optional artifact_location argument to MLFlowHandler to override the
  derived location. Any explicit tracking_uri or env var is passed through.
- DEFAULT_MLFLOW_SETTINGS uses a sqlite tracking_uri under <output_dir>/mlruns.db.
- Add monai.utils.path_to_sqlite_uri to build a cross-platform SQLite URI.
- Migrate the bundle-run and federated-learning tests off file-store tracking
  URIs and clean up the derived artifact directory.

The model zoo bundles that pass an explicit file URI are left for a separate PR.

Closes Project-MONAI#8891.

Signed-off-by: Lawson Darrow <lawson.darrow@gmail.com>
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Jun 2, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: df42c42b-7e0a-49e3-9f49-fac4b78a6da5

📥 Commits

Reviewing files that changed from the base of the PR and between 6171526 and e6c76e0.

📒 Files selected for processing (1)
  • tests/fl/monai_algo/test_fl_monai_algo.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/fl/monai_algo/test_fl_monai_algo.py

📝 Walkthrough

Walkthrough

This PR replaces filesystem MLflow defaults with SQLite-backed tracking URIs (sqlite:///.../mlruns.db), adds monai.utils.path_to_sqlite_uri and re-exports it, updates bundle docs/defaults to the new DB/artifact layout, extends MLFlowHandler with an artifact_location argument, derives artifact_location for local SQLite DBs and passes it to mlflow.create_experiment, adds defensive disposal of MLflow SQLite engines on close, and updates integration and unit tests to use SQLite URIs and adjusted cleanup/assertions.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 30.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed Title accurately summarizes the main change: migrating MLflow from file store to SQLite backend across the codebase.
Description check ✅ Passed Description covers the problem, solution, changes made, and mentions linked issue #8891. It follows the repository template structure with context and implementation details.
Linked Issues check ✅ Passed Changes address issue #8891 by defaulting MLFlowHandler to SQLite backend, allowing explicit filesystem URIs, preserving env var passthrough, and providing artifact location control to fix MLflow 3.13+ exceptions.
Out of Scope Changes check ✅ Passed All changes directly relate to the SQLite migration objective. Test updates, new utility functions, documentation updates, and handler modifications are all within scope.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
tests/fl/monai_algo/test_fl_monai_algo.py (1)

205-234: ⚡ Quick win

Cleanup is inline, so a failed assertion pollutes the repo's testing_data.

The .db/mlruns artifacts are written into _data_dir (a tracked dir). Cleanup only runs after the assertTrue/assertGreater checks, so any earlier failure leaves SQLite files in the working tree.

Register cleanup up-front via self.addCleanup(...) so it runs even on failure. Note SQLite may emit *.db-wal/*.db-shm sidecars; the mlflow_* glob in test_evaluate covers them, but the explicit os.remove(mlflow_override.db) in test_train (Line 206) does not.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/fl/monai_algo/test_fl_monai_algo.py` around lines 205 - 234, The tests
clean up mlflow artifacts inline which can leave files in _data_dir on assertion
failures; change both test_train and test_evaluate to register cleanup actions
up-front using self.addCleanup so removal always runs, and in those cleanups
remove all mlflow-related files using glob patterns (e.g., "mlflow_*",
"mlflow_override.db", plus sidecars "*.db-wal" and "*.db-shm") and remove the
"mlruns" directory if present; locate cleanup registration near the start of
test_evaluate and test_train (references: test_evaluate, test_train, _data_dir)
and replace explicit os.remove/os.path.isdir removal at the end with these
guaranteed addCleanup handlers.
monai/utils/misc.py (1)

731-743: ⚡ Quick win

Complete the public helper docstring.

This new exported helper omits a Google-style Returns: section. Please document the exact URI format returned here, since the slash-count examples are part of the contract. As per coding guidelines, "Docstrings should be present for all definition which describe each variable, return value, and raised exception in the appropriate section of the Google-style of docstrings."

Possible fix
 def path_to_sqlite_uri(path: PathLike) -> str:
     """
     Convert a database file path to a SQLite connection URI, e.g. for use as an MLflow
     ``tracking_uri``. If not an absolute path, it is converted to an absolute path first.
@@
     Args:
         path: input database file path, can be a string or `Path` object.
+
+    Returns:
+        A SQLite connection URI using an absolute POSIX-form path.
 
     """
     return f"sqlite:///{Path(path).absolute().as_posix()}"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@monai/utils/misc.py` around lines 731 - 743, The docstring for
path_to_sqlite_uri is missing a Google-style Returns: section and exception
notes; update the public helper's docstring to include a Returns: that precisely
states the returned URI string format (e.g., on Windows
"sqlite:///C:/path/db.sqlite" and on POSIX "sqlite:////path/db.sqlite"), explain
that the function returns an absolute-path SQLite URI using POSIX forward
slashes, and add a Raises: section listing any obvious exceptions (e.g.,
TypeError if path is an invalid type or any exceptions propagated by
Path(path).absolute()); keep this documentation adjacent to the existing
description in the path_to_sqlite_uri function.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@monai/handlers/mlflow_handler.py`:
- Around line 187-197: The artifact_location derivation only runs when the local
tracking_uri arg is set; update the logic in mlflow_handler (around the
tracking_uri handling and the artifact_location assignment) to consider the
effective tracking URI coming from either the tracking_uri parameter or
os.environ["MLFLOW_TRACKING_URI"]: compute a single effective_tracking_uri
(e.g., tracking_uri or env var) and then, if self.artifact_location is None and
effective_tracking_uri startswith "sqlite:///", derive db_path via
Path(effective_tracking_uri[len("sqlite:///"):]) and set self.artifact_location
= path_to_uri(db_path.parent / "mlruns") using the existing
path_to_sqlite_uri/path_to_uri helpers so both explicit arg and env-provided
SQLite URIs result in the same sibling mlruns artifact location.

---

Nitpick comments:
In `@monai/utils/misc.py`:
- Around line 731-743: The docstring for path_to_sqlite_uri is missing a
Google-style Returns: section and exception notes; update the public helper's
docstring to include a Returns: that precisely states the returned URI string
format (e.g., on Windows "sqlite:///C:/path/db.sqlite" and on POSIX
"sqlite:////path/db.sqlite"), explain that the function returns an absolute-path
SQLite URI using POSIX forward slashes, and add a Raises: section listing any
obvious exceptions (e.g., TypeError if path is an invalid type or any exceptions
propagated by Path(path).absolute()); keep this documentation adjacent to the
existing description in the path_to_sqlite_uri function.

In `@tests/fl/monai_algo/test_fl_monai_algo.py`:
- Around line 205-234: The tests clean up mlflow artifacts inline which can
leave files in _data_dir on assertion failures; change both test_train and
test_evaluate to register cleanup actions up-front using self.addCleanup so
removal always runs, and in those cleanups remove all mlflow-related files using
glob patterns (e.g., "mlflow_*", "mlflow_override.db", plus sidecars "*.db-wal"
and "*.db-shm") and remove the "mlruns" directory if present; locate cleanup
registration near the start of test_evaluate and test_train (references:
test_evaluate, test_train, _data_dir) and replace explicit
os.remove/os.path.isdir removal at the end with these guaranteed addCleanup
handlers.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 8395b304-ef2f-4c4e-8469-2ef1368131be

📥 Commits

Reviewing files that changed from the base of the PR and between 013bda0 and 7017510.

📒 Files selected for processing (8)
  • monai/bundle/scripts.py
  • monai/bundle/utils.py
  • monai/handlers/mlflow_handler.py
  • monai/utils/__init__.py
  • monai/utils/misc.py
  • tests/fl/monai_algo/test_fl_monai_algo.py
  • tests/handlers/test_handler_mlflow.py
  • tests/integration/test_integration_bundle_run.py

Comment on lines +187 to +197
# Only fall back to the SQLite default when the caller gave no tracking_uri and the
# `MLFLOW_TRACKING_URI` environment variable is unset, so that env-var configuration keeps
# working. When it is set, `tracking_uri` stays None and MLflow resolves the env var.
if not tracking_uri and not os.environ.get("MLFLOW_TRACKING_URI"):
tracking_uri = path_to_sqlite_uri(os.path.join(os.getcwd(), "mlruns.db"))
# For a local SQLite backend, keep run artifacts in an `mlruns` directory next to the
# database file (mirroring the previous file-store layout) unless the caller set
# `artifact_location`. Other backends (e.g. a remote server) are left to MLflow to decide.
if self.artifact_location is None and tracking_uri and tracking_uri.startswith("sqlite:///"):
db_path = Path(tracking_uri[len("sqlite:///") :])
self.artifact_location = path_to_uri(db_path.parent / "mlruns")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

🧩 Analysis chain

🌐 Web query:

In MLflow 3.x, if MLFLOW_TRACKING_URIpoints to a local SQLite database andcreate_experiment()is called withoutartifact_location, what default artifact location is used?

💡 Result:

In MLflow 3.x, if MLFLOW_TRACKING_URI points to a local SQLite database and create_experiment() is called without an explicit artifact_location, the default artifact location is a path inside the local file system, typically the ./mlruns directory [1][2][3]. When using a local SQLite backend, MLflow relies on the local file system for artifact storage unless a specific --default-artifact-root is configured on the tracking server [2][4]. If no artifact_location is provided during the creation of an experiment, MLflow defaults to a local path [1][3]. It is important to note that this local path is relative to the directory where the code is executing [1][2]. In production or multi-node environments, this default can lead to issues because the client and server may resolve the path to different physical locations on their respective disks [1][3]. To avoid this, it is recommended to explicitly set an artifact_location (such as an S3 bucket or other remote storage) when creating an experiment, or to configure the --default-artifact-root when starting the MLflow tracking server [1][2][5][3].

Citations:


Derive artifact_location consistently for both arg and MLFLOW_TRACKING_URI-provided SQLite

  • Current code derives sibling mlruns only when tracking_uri is explicitly passed; when MLFLOW_TRACKING_URI=sqlite:///..., tracking_uri stays None, so artifact_location remains unset and MLflow uses its default local artifact root (typically ./mlruns, relative to the process CWD). This causes different artifact placement for equivalent local SQLite setups (config split-brain).
Possible fix
         self.artifact_location = artifact_location
+        effective_tracking_uri = tracking_uri or os.environ.get("MLFLOW_TRACKING_URI")
         # Only fall back to the SQLite default when the caller gave no tracking_uri and the
         # `MLFLOW_TRACKING_URI` environment variable is unset, so that env-var configuration keeps
         # working. When it is set, `tracking_uri` stays None and MLflow resolves the env var.
-        if not tracking_uri and not os.environ.get("MLFLOW_TRACKING_URI"):
-            tracking_uri = path_to_sqlite_uri(os.path.join(os.getcwd(), "mlruns.db"))
+        if not effective_tracking_uri:
+            tracking_uri = effective_tracking_uri = path_to_sqlite_uri(os.path.join(os.getcwd(), "mlruns.db"))
         # For a local SQLite backend, keep run artifacts in an `mlruns` directory next to the
         # database file (mirroring the previous file-store layout) unless the caller set
         # `artifact_location`. Other backends (e.g. a remote server) are left to MLflow to decide.
-        if self.artifact_location is None and tracking_uri and tracking_uri.startswith("sqlite:///"):
-            db_path = Path(tracking_uri[len("sqlite:///") :])
+        if self.artifact_location is None and effective_tracking_uri and effective_tracking_uri.startswith("sqlite:///"):
+            db_path = Path(effective_tracking_uri[len("sqlite:///") :])
             self.artifact_location = path_to_uri(db_path.parent / "mlruns")
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# Only fall back to the SQLite default when the caller gave no tracking_uri and the
# `MLFLOW_TRACKING_URI` environment variable is unset, so that env-var configuration keeps
# working. When it is set, `tracking_uri` stays None and MLflow resolves the env var.
if not tracking_uri and not os.environ.get("MLFLOW_TRACKING_URI"):
tracking_uri = path_to_sqlite_uri(os.path.join(os.getcwd(), "mlruns.db"))
# For a local SQLite backend, keep run artifacts in an `mlruns` directory next to the
# database file (mirroring the previous file-store layout) unless the caller set
# `artifact_location`. Other backends (e.g. a remote server) are left to MLflow to decide.
if self.artifact_location is None and tracking_uri and tracking_uri.startswith("sqlite:///"):
db_path = Path(tracking_uri[len("sqlite:///") :])
self.artifact_location = path_to_uri(db_path.parent / "mlruns")
effective_tracking_uri = tracking_uri or os.environ.get("MLFLOW_TRACKING_URI")
# Only fall back to the SQLite default when the caller gave no tracking_uri and the
# `MLFLOW_TRACKING_URI` environment variable is unset, so that env-var configuration keeps
# working. When it is set, `tracking_uri` stays None and MLflow resolves the env var.
if not effective_tracking_uri:
tracking_uri = effective_tracking_uri = path_to_sqlite_uri(os.path.join(os.getcwd(), "mlruns.db"))
# For a local SQLite backend, keep run artifacts in an `mlruns` directory next to the
# database file (mirroring the previous file-store layout) unless the caller set
# `artifact_location`. Other backends (e.g. a remote server) are left to MLflow to decide.
if self.artifact_location is None and effective_tracking_uri and effective_tracking_uri.startswith("sqlite:///"):
db_path = Path(effective_tracking_uri[len("sqlite:///") :])
self.artifact_location = path_to_uri(db_path.parent / "mlruns")
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@monai/handlers/mlflow_handler.py` around lines 187 - 197, The
artifact_location derivation only runs when the local tracking_uri arg is set;
update the logic in mlflow_handler (around the tracking_uri handling and the
artifact_location assignment) to consider the effective tracking URI coming from
either the tracking_uri parameter or os.environ["MLFLOW_TRACKING_URI"]: compute
a single effective_tracking_uri (e.g., tracking_uri or env var) and then, if
self.artifact_location is None and effective_tracking_uri startswith
"sqlite:///", derive db_path via
Path(effective_tracking_uri[len("sqlite:///"):]) and set self.artifact_location
= path_to_uri(db_path.parent / "mlruns") using the existing
path_to_sqlite_uri/path_to_uri helpers so both explicit arg and env-provided
SQLite URIs result in the same sibling mlruns artifact location.

With close_on_complete=True the handler closes the run on the engine COMPLETED
event, which sets cur_run to None, so reading handler.cur_run.info afterwards
raised AttributeError. Use close_on_complete=False and close explicitly after
the metric check, matching test_metrics_track.

Signed-off-by: Lawson Darrow <lawson.darrow@gmail.com>
On Windows the open SQLite connection kept the database file locked, so deleting
it during test cleanup raised PermissionError (WinError 32). close() now disposes
MLflow's SQLAlchemy engine for local SQLite tracking backends (no-op otherwise),
which is also a genuine resource-cleanup improvement. The construct-only handler
tests now call close() in a finally so their temp databases can be removed.

Signed-off-by: Lawson Darrow <lawson.darrow@gmail.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
monai/handlers/mlflow_handler.py (1)

186-197: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Handle env-provided local SQLite URIs the same way.

If MLFLOW_TRACKING_URI is set to sqlite:///..., Line 190 correctly skips the fallback, but Lines 195-197 also skip sibling mlruns derivation because they only inspect the tracking_uri arg. create_experiment() then gets artifact_location=None, so env-based local SQLite uses MLflow’s default artifact root instead of <db dir>/mlruns. A small effective_tracking_uri = tracking_uri or os.environ.get("MLFLOW_TRACKING_URI") fix here would keep the two config paths consistent and is worth a unit test too.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@monai/handlers/mlflow_handler.py` around lines 186 - 197, The
artifact-location derivation skips env-provided local SQLite URIs because the
code only checks the local argument tracking_uri; compute an
effective_tracking_uri = tracking_uri or os.environ.get("MLFLOW_TRACKING_URI")
(after the existing fallback logic that may set tracking_uri via
path_to_sqlite_uri) and use effective_tracking_uri for the sqlite:/// startswith
check and to derive db_path/artifact_location (referencing
self.artifact_location, tracking_uri, MLFLOW_TRACKING_URI, and
path_to_sqlite_uri); this ensures env-set sqlite URIs produce a sibling `mlruns`
artifact folder like the explicit tracking_uri path would.
🧹 Nitpick comments (1)
monai/handlers/mlflow_handler.py (1)

383-419: ⚡ Quick win

Complete the new docstrings.

_dispose_sqlite_store() and close() need Google-style Returns/Raises sections. Both methods now have non-trivial side effects and can propagate MLflow/engine disposal failures, so the contract should be explicit. As per coding guidelines, "Docstrings should be present for all definition which describe each variable, return value, and raised exception in the appropriate section of the Google-style of docstrings."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@monai/handlers/mlflow_handler.py` around lines 383 - 419, Add Google-style
Returns and Raises sections to the docstrings of _dispose_sqlite_store and
close: for _dispose_sqlite_store describe that it returns None and may raise
exceptions from MLflow internals or SQLAlchemy engine disposal (e.g.,
AttributeError, RuntimeError, sqlalchemy.exc.SQLAlchemyError) when reaching into
client internals or calling engine.dispose(); for close describe it returns
None, may raise MLflow exceptions when calling set_terminated or any exceptions
from _dispose_sqlite_store (document that exceptions from MLflow client or
engine disposal can propagate), and note side effects (terminates current run
and releases local SQLite resources). Ensure the docstrings for both include a
short Description, a Returns: "None", and a Raises: listing the possible
exception types and the conditions under which they are raised, referencing the
methods _dispose_sqlite_store and client.set_terminated in the text.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Duplicate comments:
In `@monai/handlers/mlflow_handler.py`:
- Around line 186-197: The artifact-location derivation skips env-provided local
SQLite URIs because the code only checks the local argument tracking_uri;
compute an effective_tracking_uri = tracking_uri or
os.environ.get("MLFLOW_TRACKING_URI") (after the existing fallback logic that
may set tracking_uri via path_to_sqlite_uri) and use effective_tracking_uri for
the sqlite:/// startswith check and to derive db_path/artifact_location
(referencing self.artifact_location, tracking_uri, MLFLOW_TRACKING_URI, and
path_to_sqlite_uri); this ensures env-set sqlite URIs produce a sibling `mlruns`
artifact folder like the explicit tracking_uri path would.

---

Nitpick comments:
In `@monai/handlers/mlflow_handler.py`:
- Around line 383-419: Add Google-style Returns and Raises sections to the
docstrings of _dispose_sqlite_store and close: for _dispose_sqlite_store
describe that it returns None and may raise exceptions from MLflow internals or
SQLAlchemy engine disposal (e.g., AttributeError, RuntimeError,
sqlalchemy.exc.SQLAlchemyError) when reaching into client internals or calling
engine.dispose(); for close describe it returns None, may raise MLflow
exceptions when calling set_terminated or any exceptions from
_dispose_sqlite_store (document that exceptions from MLflow client or engine
disposal can propagate), and note side effects (terminates current run and
releases local SQLite resources). Ensure the docstrings for both include a short
Description, a Returns: "None", and a Raises: listing the possible exception
types and the conditions under which they are raised, referencing the methods
_dispose_sqlite_store and client.set_terminated in the text.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 66636e0e-e9f1-4f28-920b-cb8b8aa3c127

📥 Commits

Reviewing files that changed from the base of the PR and between e594111 and c81c40a.

📒 Files selected for processing (2)
  • monai/handlers/mlflow_handler.py
  • tests/handlers/test_handler_mlflow.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/handlers/test_handler_mlflow.py

The MLflow file-store to SQLite migration leaves an open SQLAlchemy SQLite
connection, which on Windows locks the .db file. The FL workflow can finish
without closing the tracking handler that holds mlflow_1.db, so
test_fl_monai_algo failed on full-dep (windows-latest) when os.remove'ing the
mlflow_*.db files.

Dispose the test's mlflow*.db SQLite engines before removing the files.

Signed-off-by: Lawson Darrow <lawson.darrow@gmail.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/fl/monai_algo/test_fl_monai_algo.py`:
- Around line 182-209: The broad except in _dispose_sqlite_engines() hides
unexpected errors when probing SQLAlchemy Engine.url; narrow the failure mode by
checking attributes or catching AttributeError only: verify hasattr(url,
"get_backend_name") and hasattr(url, "database") before calling
url.get_backend_name() and accessing url.database (or replace the broad except
Exception with except AttributeError) so only missing-attribute issues are
swallowed while other errors surface; reference Engine, url.get_backend_name,
and url.database in your change.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 264a7262-eca1-4fdf-acdd-f5f950eebd2d

📥 Commits

Reviewing files that changed from the base of the PR and between c81c40a and 6171526.

📒 Files selected for processing (1)
  • tests/fl/monai_algo/test_fl_monai_algo.py

Comment on lines +182 to +209
def _dispose_sqlite_engines():
"""Dispose MLflow's open SQLAlchemy SQLite engines so the test ``.db`` files can be removed.

MLflow keeps a SQLite connection open for the lifetime of its client; on Windows that
locks the database file and breaks cleanup. ``MLFlowHandler.close()`` releases it, but a
workflow may finish without closing every handler, so dispose defensively here before
deleting the files. Scoped to the test's ``mlflow*.db`` backends so unrelated (e.g.
in-memory) sqlite engines elsewhere in the process are left untouched.
"""
import gc

try:
from sqlalchemy.engine import Engine
except ImportError:
return
gc.collect()
for obj in gc.get_objects():
if not isinstance(obj, Engine):
continue
try:
url = obj.url
db = url.database if url.get_backend_name() == "sqlite" else None
# the test backends are all files named ``mlflow*.db``; match those only so
# unrelated (e.g. in-memory) sqlite engines in the process are left untouched.
if db and os.path.basename(db).startswith("mlflow"):
obj.dispose()
except Exception:
pass
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Confirm SQLAlchemy is pinned/available and inspect the URL API.
fd -t f 'requirements.*\.txt|setup\.cfg|pyproject\.toml' | xargs rg -n -i 'sqlalchemy'

Repository: Project-MONAI/MONAI

Length of output: 47


🌐 Web query:

Does SQLAlchemy URL expose get_backend_name() and a database attribute in 1.4 and 2.0?

💡 Result:

Yes, the SQLAlchemy URL object exposes both the get_backend_name method and the database attribute in both SQLAlchemy 1.4 and 2.0 [1][2]. Both were formalized as part of the URL object's overhaul in version 1.4, where the URL object was redesigned to be an immutable namedtuple-like structure [3][4]. The get_backend_name method returns the database backend name (the portion of the drivername to the left of the plus sign) [1][2], and the database attribute provides access to the database name specified in the URL [1][2].

Citations:


🏁 Script executed:

#!/bin/bash
# Find where SQLAlchemy is declared/pinned in this repo.
rg -n --hidden --no-ignore-vcs -S "sqlalchemy" pyproject.toml setup.cfg setup.py requirements*.txt */requirements*.txt 2>/dev/null || true

Repository: Project-MONAI/MONAI

Length of output: 45


🌐 Web query:

sqlalchemy version pins MONAI supported versions (sqlalchemy) fl monai_algo?

💡 Result:

MONAI does not appear to pin SQLAlchemy versions. What I found in MONAI packaging/dependency metadata: - MONAI’s declared install requirements in setup.cfg list torch and numpy, plus many optional dependencies under extras; SQLAlchemy is not present in the shown install_requires or extras list [1][2]. - The MONAI PyPI page describes dependency versioning for major releases and points to requirements*.txt files, but the dependency table excerpt shown there does not include SQLAlchemy [3]. So there is no explicit “sqlalchemy version pin” I can cite from MONAI’s published dependency metadata for the monai/fl client algorithm component (monai_algo.py) either—monai_algo.py itself is code for FL client behavior and does not specify SQLAlchemy dependencies [4]. If you meant “MONAI FL (monai_algo) indirectly depends on SQLAlchemy via another library”, tell me which MONAI version you’re using (e.g., 1.3.0/1.4.0/1.5.x) and your install extras (e.g., monai[all]) and I can trace the transitive dependency chain from that exact lockfile/requirements output.

Citations:


🏁 Script executed:

#!/bin/bash
set -euo pipefail
# List likely dependency declaration files
fd -t f -i 'requirements*.txt|setup.cfg|pyproject.toml|setup.py' . || true

# Search for sqlalchemy/sqlalchemy in those files
for f in $(fd -t f -i 'requirements*.txt|setup.cfg|pyproject.toml|setup.py' .); do
  rg -n -i 'sqlalchemy' "$f" || true
done

Repository: Project-MONAI/MONAI

Length of output: 146


🏁 Script executed:

#!/bin/bash
set -euo pipefail
# Confirm there aren't other dependency declaration files referencing sqlalchemy.
rg -n -i --hidden --no-ignore-vcs "sqlalchemy" \
  -g 'requirements*.txt' -g 'setup.cfg' -g 'pyproject.toml' -g 'setup.py' -g 'Pipfile*' -g '*.yml' -g '*.yaml' \
  . | head -n 50

Repository: Project-MONAI/MONAI

Length of output: 45


Narrow the failure mode of _dispose_sqlite_engines() (SQLAlchemy URL API)

  • SQLAlchemy’s URL object includes both get_backend_name() and database in the 1.4/2.0 API, so the current attribute access matches upstream.
  • MONAI repo doesn’t pin SQLAlchemy in the searched dependency metadata; to avoid silently neutering cleanup on any transitive/version drift, narrow the except Exception: to expected errors (e.g., AttributeError) or gate with hasattr(url, "get_backend_name") / hasattr(url, "database") so failures aren’t fully swallowed.
🧰 Tools
🪛 Ruff (0.15.15)

[error] 208-209: try-except-pass detected, consider logging the exception

(S110)


[warning] 208-208: Do not catch blind exception: Exception

(BLE001)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/fl/monai_algo/test_fl_monai_algo.py` around lines 182 - 209, The broad
except in _dispose_sqlite_engines() hides unexpected errors when probing
SQLAlchemy Engine.url; narrow the failure mode by checking attributes or
catching AttributeError only: verify hasattr(url, "get_backend_name") and
hasattr(url, "database") before calling url.get_backend_name() and accessing
url.database (or replace the broad except Exception with except AttributeError)
so only missing-attribute issues are swallowed while other errors surface;
reference Engine, url.get_backend_name, and url.database in your change.

gc.get_objects() can include dead weakref proxies, whose isinstance() raises
ReferenceError. The check sat outside the try/except, so it escaped and errored
the FL tests on Linux (full-dep ubuntu-latest). Move it inside the guard.

Signed-off-by: Lawson Darrow <lawson.darrow@gmail.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

MLFlow 3.13 Raises Exception When Using Filesystem Backend

1 participant