Use event instead of termination callback#40767
Conversation
There was a problem hiding this comment.
Pull request overview
This PR reworks how WSLC session termination is exposed to consumers by replacing the session-creation-time push callback (ITerminationCallback) with a pull model: a one-off termination event plus a cached termination reason that can be queried by any holder of an IWSLCSession (including when opening existing sessions).
Changes:
- Adds termination event + termination reason APIs to the runtime/session and service VM layers, and updates session state tracking to use the terminated event.
- Updates the public SDK surface to expose
WslcGetSessionTerminationEvent/WslcGetSessionTerminationReason, and removes termination-callback plumbing (code, exports, and MSI proxy/stub registration). - Updates WSLC runtime + SDK tests to validate event signaling, handle lifetime, and reason availability semantics.
Reviewed changes
Copilot reviewed 16 out of 16 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| test/windows/WSLCTests.cpp | Replaces callback-based termination test with event + reason polling/verification. |
| test/windows/WslcSdkTests.cpp | Updates SDK tests to wait on the returned termination event and query termination reason. |
| src/windows/wslcsession/WSLCVirtualMachine.h | Adds a helper to fetch cached termination reason/details from the underlying VM. |
| src/windows/wslcsession/WSLCSession.h | Adds terminated event + cached termination reason/details; removes m_terminated. |
| src/windows/wslcsession/WSLCSession.cpp | Implements GetTerminationEvent/GetTerminationReason, populates cached reason during teardown, and signals terminated event at completion. |
| src/windows/WslcSDK/WslcsdkPrivate.h | Shrinks internal session options; removes stored termination callback COM pointer. |
| src/windows/WslcSDK/wslcsdk.h | Updates public SDK API: removes termination callback API, adds termination event + reason APIs, adjusts opaque options size. |
| src/windows/WslcSDK/wslcsdk.def | Updates exports to remove callback setter and add termination event/reason getters. |
| src/windows/WslcSDK/wslcsdk.cpp | Removes callback wiring during session creation; implements WslcGetSessionTerminationEvent and WslcGetSessionTerminationReason. |
| src/windows/WslcSDK/TerminationCallback.h | Removed (callback mechanism deleted). |
| src/windows/WslcSDK/TerminationCallback.cpp | Removed (callback mechanism deleted). |
| src/windows/WslcSDK/CMakeLists.txt | Removes TerminationCallback sources/headers from the SDK build. |
| src/windows/service/inc/wslc.idl | Removes ITerminationCallback and adds new termination event/reason methods to session + VM interfaces. |
| src/windows/service/exe/HcsVirtualMachine.h | Adds cached termination reason/details members and declares GetTerminationReason. |
| src/windows/service/exe/HcsVirtualMachine.cpp | Caches termination reason/details on exit before signaling the exit event; implements GetTerminationReason. |
| msipackage/package.wix.in | Removes COM registration for ITerminationCallback interface. |
OneBlue
left a comment
There was a problem hiding this comment.
Looks like there are some build errors, but overall design LGTM
| // Returns the cached termination reason and details. The values are only meaningful | ||
| // after the termination event has been signaled; before that the reason is | ||
| // WSLCVirtualMachineTerminationReasonUnknown and Details is an empty string. | ||
| HRESULT GetTerminationReason([out] WSLCVirtualMachineTerminationReason* Reason, [out] LPWSTR* Details); |
There was a problem hiding this comment.
Long term I think we should parse the Details into fields so the caller doesn't have to deal with HCS's json format.
But I think this is fine for now, since this is already a major improvement on the previous design
There was a problem hiding this comment.
I started making the change, but it ends up adding a bunch of new structs and touches a lot of files, so, I think will do this in a follow up PR. I'm confident I can get it in before private preview.
PR #40767 (terminateEvent) removed ITerminationCallback / WSLCSessionSettings.TerminationCallback from the IDL but left the WSLCVirtualMachineFactory (introduced later by factory PR #40770, which its branch merged from master) still referencing them, so the branch did not compile. Drop the dead member and assignments; the VM now caches the reason for WSLCSession to pull. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Kevin's PR #40767 (event-model termination) moved m_vmExitEvent.SetEvent() in OnExit() to after a new std::lock_guard(m_lock) that caches the termination reason. ~HcsVirtualMachine already holds m_lock across the 5s exit-event wait and HcsCloseComputeSystem (which drains in-flight HCS callbacks). An in-flight OnExit() therefore blocks acquiring m_lock, so it never signals the exit event nor drains, and the close never completes: a hard deadlock that StuckVmTermination reliably reproduces. Drop the broad lock from the dtor. By the time the compute system is closed no further callbacks can run, so the remaining teardown is safe unguarded. Flag to Kevin to fold into #40767. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
PR #40767 (terminateEvent) removed ITerminationCallback / WSLCSessionSettings.TerminationCallback from the IDL but left the WSLCVirtualMachineFactory (introduced later by factory PR #40770, which its branch merged from master) still referencing them, so the branch did not compile. Drop the dead member and assignments; the VM now caches the reason for WSLCSession to pull. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Kevin's PR #40767 (event-model termination) moved m_vmExitEvent.SetEvent() in OnExit() to after a new std::lock_guard(m_lock) that caches the termination reason. ~HcsVirtualMachine already holds m_lock across the 5s exit-event wait and HcsCloseComputeSystem (which drains in-flight HCS callbacks). An in-flight OnExit() therefore blocks acquiring m_lock, so it never signals the exit event nor drains, and the close never completes: a hard deadlock that StuckVmTermination reliably reproduces. Drop the broad lock from the dtor. By the time the compute system is closed no further callbacks can run, so the remaining teardown is safe unguarded. Flag to Kevin to fold into #40767. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
| // Returns the cached termination reason and details. The values are only meaningful after the | ||
| // termination event has been signaled; before that the reason is | ||
| // WSLCVirtualMachineTerminationReasonUnknown and Details is an empty string. |
| interface IWSLCSession : IUnknown | ||
| { | ||
| HRESULT GetId([out] ULONG* Id); | ||
| HRESULT GetState([out] WSLCSessionState* State); | ||
|
|
||
| // Returns a one-off event that is signaled when the session terminates, whether due to an | ||
| // explicit Terminate() call or an unexpected VM exit. The returned handle is owned by the | ||
| // caller and remains valid (and observes the signaled state) even after the session is released. | ||
| HRESULT GetTerminationEvent([out, system_handle(sh_event)] HANDLE* Event); | ||
|
|
||
| // Returns the cached termination reason and details. These are only available after the | ||
| // termination event has been signaled; before that the call fails. | ||
| HRESULT GetTerminationReason([out] WSLCVirtualMachineTerminationReason* Reason, [out] LPWSTR* Details); |
| // Returns an event that is signaled when the VM exits (graceful or forced). | ||
| HRESULT GetTerminationEvent([out, system_handle(sh_event)] HANDLE* Event); | ||
|
|
||
| // Returns the cached termination reason and details. These are only available after the | ||
| // termination event has been signaled; before that the call fails. | ||
| HRESULT GetTerminationReason([out] WSLCVirtualMachineTerminationReason* Reason, [out] LPWSTR* Details); |
| auto session = static_cast<Session*>(context); | ||
|
|
||
| WslcSessionTerminationReason reason = WSLC_SESSION_TERMINATION_REASON_UNKNOWN; | ||
| LOG_IF_FAILED(WslcGetSessionTerminationReason(session->m_session.get(), &reason)); | ||
|
|
||
| session->m_terminatedEvent(static_cast<SessionTerminationReason>(reason)); | ||
| } |
There was a problem hiding this comment.
This is not true because Session's destructor will destroy m_terminationWait before releasing the session
Summary of the Pull Request
Reworks how WSLC session termination is surfaced to clients. The previous design used a push-style
ITerminationCallbackthat could only be subscribed at session-creation time (viaWSLCSessionSettings), so it was unavailable when opening an existing session. This replaces it with a pull model that mirrors the process-exit-event pattern: a one-off "terminated" event plus a cached termination reason that any holder of anIWSLCSessioncan query — whether the session was created or opened.PR Checklist
Detailed Description of the Pull Request / Additional comments
Two distinct lifecycle events on the session:
m_sessionTerminatingEvent(existing) — signaled at the start of teardown to cancel in-flight operations.m_sessionTerminatedEvent(new) — signaled once teardown is complete ("the session is done"), via either an explicitTerminate()or an unexpected VM exit. This is the event handed to clients.Runtime (service / per-user session):
IWSLCSessiongainsGetTerminationEvent(returns a duplicated, caller-owned handle to the terminated event that stays valid after the session is released) andGetTerminationReason(returns the cached reason + details; fails withERROR_INVALID_STATEuntil the session has actually terminated, so callers can distinguish "still running" from "terminated for an unknown reason").HcsVirtualMachine(service) computes the termination reason from the HCS exit status inOnExitand caches it;IWSLCVirtualMachinegainsGetTerminationReason.WSLCSession::Terminate()captures the reason once: it defaults toShutdownfor an explicit terminate (VM still alive = we shut it down), and pulls the real crash/shutdown reason from the VM when the VM exited on its own. Capture happens under the exclusive lock before the VM reference is released, and the terminated event is signaled last so any observer sees a fully torn-down session.m_terminatedbool removed —m_sessionTerminatedEvent.is_signaled()is now the single source of truth forGetState/GetTerminationReason.SDK (
wslcsdk):WslcSetSessionSettingsTerminationCallbackand theITerminationCallback/TerminationCallbackplumbing (files, CMake,.def, WiX proxy/stub registration).WslcGetSessionTerminationEventandWslcGetSessionTerminationReason.WslcSessionOptionsInternalshrank (removed callback + context fields); opaqueWSLC_SESSION_OPTIONS_SIZEupdated accordingly.IDL: removed the
ITerminationCallbackinterface and theTerminationCallbackfield fromWSLCSessionSettings; added the new methods toIWSLCSessionandIWSLCVirtualMachine.Validation Steps Performed
WSLCTests.cpp::TerminationEventwaits onIWSLCSession::GetTerminationEvent, verifiesGetTerminationReasonreturnsERROR_INVALID_STATEbefore termination andShutdownafter an explicitTerminate().WslcSdkTests.cpp::TerminationEventViaTerminateandTerminationEventViaReleasewait on the SDK-returned event (including verifying the handle stays valid after the session is released) and check the reason viaWslcGetSessionTerminationReason.