Skip to content

implement namespace isolation for multi-embodiment blueprints with du…#2519

Open
Karan24Soni wants to merge 9 commits into
dimensionalOS:mainfrom
Karan24Soni:main
Open

implement namespace isolation for multi-embodiment blueprints with du…#2519
Karan24Soni wants to merge 9 commits into
dimensionalOS:mainfrom
Karan24Soni:main

Conversation

@Karan24Soni

@Karan24Soni Karan24Soni commented Jun 17, 2026

Copy link
Copy Markdown

Problem

so basically if you tried to run multiple robots in the same coordinator using namespaces, it just bricked. the worker booted up fine and listened on the namespaced topic (like robot1/ModuleA), but WorkerManagerPython completely ignored the namespace when building the RPCClient proxy. the proxy hardcoded its routing to just the base class name ModuleA.

because of this, the proxy would sit there for 120 seconds screaming into the void waiting for a response while the worker was listening somewhere else. totally blocked multi-embodiment setups where we need duplicate modules running side-by-side.

Solution

forced the proxy to actually respect the namespace so it routes correctly without touching the underlying rpc spec.

  • injected __dimos_rpc_name__ into the kwargs during initial deployment and inside _restart_module.
  • patched deploy and deploy_parallel in the coordinator so the exact second the proxy is handed back from the worker manager, we manually override proxy.remote_name = kwargs["__dimos_rpc_name__"].
  • now the proxy dynamically builds its method calls pointing exactly to the namespaced worker. no shared topic collisions, no timeouts.

you can now load autoconnect(...).namespace("robot1") and robot2 with the exact same module classes and it just works.

How to Test

added a brand new regression test to guarantee nobody breaks namespace isolation again:
PYTHONPATH=. python -m pytest dimos/core/tests/test_blueprint_namespaces.py -q --tb=short -x

and verify the reload/restart hooks are still totally green:
PYTHONPATH=. python -m pytest dimos/core/coordination/test_module_coordinator.py -q --tb=short -k "restart or reload or unload" -n auto -x

python -c 'from dimos.porcelain.dimos import Dimos; from dimos.robot.unitree.go2.blueprints.smart.go2_smart import go2_smart_blueprint; app = Dimos(); app.run(go2_smart_blueprint(replay=False).namespace("robot1"), go2_smart_blueprint(replay=False).namespace("robot2")); import time; time.sleep(1000)'

Run this in your terminal to see the multi-quadruped namespace isolation working in the simulator.

@greptile-apps

greptile-apps Bot commented Jun 17, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR fixes namespace isolation for multi-embodiment blueprints by injecting __dimos_rpc_name__ into module kwargs and patching proxy.remote_name immediately after deployment, so the RPC proxy routes to the correct namespaced worker topic instead of the bare class name. Several issues from the previous review round have been addressed (transport registry now uses 3-tuple keys with namespace, strict=True restored in deploy_parallel, _verify_no_conflicts_with_existing updated for the new key schema, get_module aligned with list_module_names).

  • BlueprintAtom gains a namespace field; Blueprint.namespace() stamps all atoms at once; _eliminate_duplicates now deduplicates on (namespace, module) so identical classes under different namespaces are kept.
  • _deploy_all_modules and _restart_module both inject framework keys into the deploy kwargs and strip them before creating new BlueprintAtom entries, preventing accumulation across restarts.
  • The new regression test (test_blueprint_namespaces.py) is structurally broken because it requests the dynamic_coordinator fixture, which is defined only inside test_module_coordinator.py rather than a conftest.py.

Confidence Score: 3/5

The namespace routing fix itself is sound, but the new regression test that is supposed to guard it is broken and will never run.

The core RPC proxy patching and transport registry changes look correct, and several previously flagged issues are properly addressed. However, the new regression test test_blueprint_namespaces.py requests a dynamic_coordinator fixture that is scoped to a different test file and invisible to pytest — the test will always fail with fixture not found, so the intended safety net is absent. There are also remaining inconsistencies such as _all_name_types ignoring namespace, and the public unload_module/restart_module API still does not accept a namespace parameter for namespaced modules.

dimos/core/tests/test_blueprint_namespaces.py — the dynamic_coordinator fixture must be moved to a conftest.py before this test will actually run.

Important Files Changed

Filename Overview
dimos/core/coordination/module_coordinator.py Core coordinator updated for namespace isolation: 3-tuple transport registry keys, __dimos_rpc_name__/__dimos_namespace__ injection in deploy paths, proxy remote_name patching, and updated restart_module_by_class_name with namespace param. Several previously flagged issues are now fixed (strict=True restored, _verify_no_conflicts_with_existing updated). _all_name_types still omits namespace from its output, creating an inconsistency with _verify_no_name_conflicts.
dimos/core/coordination/blueprints.py Added namespace field to BlueprintAtom and namespace() method to Blueprint. _eliminate_duplicates now keys on (namespace, module). Logic is correct for the multi-embodiment use case.
dimos/core/module.py ModuleBase.__init__ pops __dimos_rpc_name__ and __dimos_namespace__ from config_args before passing to super, then passes the rpc name to serve_module_rpc. Clean integration point; no issues found.
dimos/core/tests/test_blueprint_namespaces.py New regression test for namespace isolation — but it references the dynamic_coordinator fixture which is only defined inside test_module_coordinator.py, not in any conftest.py. Pytest cannot discover it across files, so this test will always fail with fixture not found.
dimos/porcelain/local_module_source.py get_module now correctly matches on the fully-qualified namespace/ClassName name, making it consistent with list_module_names. Previous mismatch bug is fixed.
dimos/core/coordination/test_module_coordinator.py Existing tests pass; the dynamic_coordinator fixture remains local to this file rather than being promoted to a conftest.py, which prevents the new test in dimos/core/tests/ from using it.

Sequence Diagram

%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
    participant App
    participant ModuleCoordinator
    participant WorkerManagerPython
    participant ModuleWorker

    App->>ModuleCoordinator: load_blueprint(autoconnect(bp1.namespace("robot1"), bp2.namespace("robot2")))
    ModuleCoordinator->>ModuleCoordinator: _deploy_all_modules() inject __dimos_namespace__ + __dimos_rpc_name__
    ModuleCoordinator->>WorkerManagerPython: deploy_parallel(module_specs)
    WorkerManagerPython->>ModuleWorker: spawn ModuleA(kwargs)
    ModuleWorker->>ModuleWorker: "pop __dimos_rpc_name__, serve_module_rpc(name="robot1/ModuleA")"
    WorkerManagerPython-->>ModuleCoordinator: "proxy (remote_name="ModuleA" by default)"
    ModuleCoordinator->>ModuleCoordinator: "proxy.remote_name = "robot1/ModuleA""
    ModuleCoordinator->>ModuleCoordinator: "_connect_streams() key=(namespace, stream_name, type) topic="/robot1/data1""
    ModuleCoordinator-->>App: coordinator ready
    App->>ModuleCoordinator: proxy.some_rpc_call()
    ModuleCoordinator->>ModuleWorker: RPC to topic robot1/ModuleA
    ModuleWorker-->>ModuleCoordinator: response
Loading
%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%%
sequenceDiagram
    participant App
    participant ModuleCoordinator
    participant WorkerManagerPython
    participant ModuleWorker

    App->>ModuleCoordinator: load_blueprint(autoconnect(bp1.namespace("robot1"), bp2.namespace("robot2")))
    ModuleCoordinator->>ModuleCoordinator: _deploy_all_modules() inject __dimos_namespace__ + __dimos_rpc_name__
    ModuleCoordinator->>WorkerManagerPython: deploy_parallel(module_specs)
    WorkerManagerPython->>ModuleWorker: spawn ModuleA(kwargs)
    ModuleWorker->>ModuleWorker: "pop __dimos_rpc_name__, serve_module_rpc(name="robot1/ModuleA")"
    WorkerManagerPython-->>ModuleCoordinator: "proxy (remote_name="ModuleA" by default)"
    ModuleCoordinator->>ModuleCoordinator: "proxy.remote_name = "robot1/ModuleA""
    ModuleCoordinator->>ModuleCoordinator: "_connect_streams() key=(namespace, stream_name, type) topic="/robot1/data1""
    ModuleCoordinator-->>App: coordinator ready
    App->>ModuleCoordinator: proxy.some_rpc_call()
    ModuleCoordinator->>ModuleWorker: RPC to topic robot1/ModuleA
    ModuleWorker-->>ModuleCoordinator: response
Loading

Reviews (8): Last reviewed commit: "Merge branch 'main' into main" | Re-trigger Greptile

Comment on lines +290 to +296
for namespace, remapped_name, stream_type in streams.keys():
map_key = (remapped_name, stream_type)
if map_key in self._transport_registry:
transport = self._transport_registry[map_key]
else:
transport = _get_transport_for(blueprint, remapped_name, stream_type)
self._transport_registry[key] = transport
for module, original_name in streams[key]:
instance = self.get_instance(module) # type: ignore[assignment]
transport = _get_transport_for(blueprint, remapped_name, stream_type, namespace)
self._transport_registry[map_key] = transport

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Stream transport isolation broken for same-named streams across namespaces

_transport_registry is keyed on (remapped_name, stream_type) without the namespace, so when two namespaced blueprints share a stream name and type (e.g., both robot1 and robot2 declare a camera: Out[Image] stream), the second namespace processed finds the first namespace's transport in the registry and reuses it. Both robots end up wired to the same underlying data channel, defeating the namespace isolation this PR is meant to provide.

Concrete failure: deploy autoconnect(...).namespace("robot1") and autoconnect(...).namespace("robot2") where both modules declare a camera output — robot2's camera transport will be robot1's topic (/robot1/camera instead of /robot2/camera), and messages will be shared.

Suggested change
for namespace, remapped_name, stream_type in streams.keys():
map_key = (remapped_name, stream_type)
if map_key in self._transport_registry:
transport = self._transport_registry[map_key]
else:
transport = _get_transport_for(blueprint, remapped_name, stream_type)
self._transport_registry[key] = transport
for module, original_name in streams[key]:
instance = self.get_instance(module) # type: ignore[assignment]
transport = _get_transport_for(blueprint, remapped_name, stream_type, namespace)
self._transport_registry[map_key] = transport
for namespace, remapped_name, stream_type in streams.keys():
map_key = (namespace, remapped_name, stream_type)
if map_key in self._transport_registry:
transport = self._transport_registry[map_key]
else:
transport = _get_transport_for(blueprint, remapped_name, stream_type, namespace)
self._transport_registry[map_key] = transport

Comment thread dimos/core/coordination/module_coordinator.py
Comment on lines +547 to 558
kwargs["__dimos_namespace__"] = key.namespace
kwargs["__dimos_rpc_name__"] = f"{key.namespace}/{new_class.__name__}" if key.namespace else new_class.__name__

python_wm = cast("WorkerManagerPython", self._managers["python"])
new_proxy = python_wm.deploy_fresh(new_class, self._global_config, kwargs)
self._deployed_modules[new_class] = new_proxy
if hasattr(new_proxy, "remote_name"):
new_proxy.remote_name = kwargs["__dimos_rpc_name__"]

self._deployed_modules[new_key] = new_proxy

new_bp = new_class.blueprint(**kwargs)
new_atom = new_bp.active_blueprints[0]

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Internal framework keys leaked into BlueprintAtom.kwargs via blueprint() call

kwargs is augmented with __dimos_namespace__ and __dimos_rpc_name__ before being passed to new_class.blueprint(**kwargs). Those keys end up stored inside the new BlueprintAtom.kwargs. On the next restart, old_atom.kwargs already carries them, so they accumulate. While ModuleBase.__init__ pops them before passing to super().__init__, any code path that inspects BlueprintAtom.kwargs to reconstruct user-visible configuration will see framework-internal noise. Consider stripping these keys before the blueprint() call and re-injecting them when needed.

Comment on lines +472 to +476
reload_source: bool = True,
) -> None:
with self._modules_lock:
for cls in self._deployed_modules:
if cls.__name__ == class_name:
self._restart_module(cls, reload_source=reload_source)
for key in self._deployed_modules:
if key.module.__name__ == class_name:

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 restart_module_by_class_name is non-deterministic when a class appears in multiple namespaces

The method iterates _deployed_modules and restarts the first key whose module.__name__ matches. Dict iteration order is insertion order in Python 3.7+, so if ModuleA is deployed as both robot1/ModuleA and robot2/ModuleA, the method will always restart whichever was inserted first with no way for the caller to choose the other. There is also no public API to restart a namespaced module by explicit (namespace, class_name) pair.

Comment thread dimos/core/coordination/blueprints.py Outdated
Comment on lines 44 to 48
def get_module(self, name: str) -> Any:
for cls, proxy in self._coordinator._deployed_modules.items():
if cls.__name__ == name:
for key, proxy in self._coordinator._deployed_modules.items():
if key.module.__name__ == name:
return proxy
raise KeyError(name)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 get_module is inconsistent with list_module_names for namespaced deployments

list_module_names() now returns qualified names like "robot1/ModuleA", but get_module still matches only on key.module.__name__ (the bare class name "ModuleA"). Any caller that iterates list_module_names() and then calls get_module() with the returned name will always get a KeyError for namespaced modules. Additionally, when multiple namespaces share the same class (e.g., both robot1/ModuleA and robot2/ModuleA are deployed), the method returns whichever appears first in dict iteration order, silently returning the wrong proxy for any subsequent namespace.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant