Skip to content

fix: clarify EndEvent.error semantics for normal non-zero exits in envd process handler#3157

Open
AdaAibaby wants to merge 5 commits into
e2b-dev:mainfrom
AdaAibaby:fix/envdevent-error-semantics-nonzero-exit
Open

fix: clarify EndEvent.error semantics for normal non-zero exits in envd process handler#3157
AdaAibaby wants to merge 5 commits into
e2b-dev:mainfrom
AdaAibaby:fix/envdevent-error-semantics-nonzero-exit

Conversation

@AdaAibaby

@AdaAibaby AdaAibaby commented Jul 1, 2026

Copy link
Copy Markdown
Contributor

Summary

Fixes #2657

This PR distinguishes normal process exits (including non-zero exits and signal kills) from runtime/infrastructure errors in EndEvent.error.

Problem

exec.Cmd.Wait() in Go returns *exec.ExitError for processes that start successfully but exit with a non-zero status or are killed by a signal. The previous implementation passed this error directly into EndEvent.error, making it impossible for clients to distinguish between:

  1. A process that ran and returned a non-zero exit code (normal behavior)
  2. A process killed by a signal via SendSignal RPC (normal behavior)
  3. An actual runtime/infrastructure failure (e.g., failure to start or wait for the process)

Changes

packages/envd/internal/services/process/handler/handler.go

  • Added *exec.ExitError check: when cmd.Wait() returns an *exec.ExitError and the process has a valid ProcessState, the error is not propagated to EndEvent.error
  • Normal non-zero exits and signal kills are now represented solely by exit_code, exited, and status fields
  • True runtime errors (e.g., wait failures) still populate EndEvent.error
  • Added nil guard for ProcessState to prevent potential nil pointer dereference

packages/envd/spec/process/process.proto

  • Added documentation comments to all EndEvent fields clarifying their intended semantics
  • Explicitly documents that error is for runtime/infrastructure-level failures, not normal process exits

Generated files

  • Updated packages/envd/internal/services/spec/process/process.pb.go and packages/shared/pkg/grpc/envd/process/process.pb.go with the new field comments

Behavior Change

Scenario Before After
exit 0 exit_code=0, error=nil exit_code=0, error=nil (unchanged)
exit 1 exit_code=1, error="exit status 1" exit_code=1, error=nil
SIGKILL exit_code=-1, error="signal: killed" exit_code=-1, error=nil
SIGTERM exit_code=-1, error="signal: terminated" exit_code=-1, error=nil
Runtime failure error="..." error="..." (unchanged)

Compatibility Note

This is a behavior change for clients that currently rely on EndEvent.error being set for non-zero exits or signal kills. Clients should use exit_code, exited, and status to inspect process results instead.

…t.error

Previously, envd set EndEvent.error for all non-zero process exits because
exec.Cmd.Wait() returns *exec.ExitError for normal non-zero exits. This made
it impossible for clients to distinguish between a process that ran and
returned a non-zero exit code vs an actual runtime/infrastructure failure.

Now, EndEvent.error is only set for true runtime errors (e.g., failure to
wait for the process). Normal non-zero exits are represented solely by
exit_code, exited, and status fields.

Also added documentation comments to process.proto clarifying the intended
semantics of each EndEvent field.

Fixes e2b-dev#2657

@gemini-code-assist gemini-code-assist Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

While the nil check on p.cmd.ProcessState prevents a panic within the conditional block, p.cmd.ProcessState can still be nil if the process failed to start or wait failed with a non-ExitError. In such cases, subsequent lines will attempt to call methods on the nil ProcessState pointer, resulting in a nil pointer dereference panic. The fields of EndEvent must be safely populated only when p.cmd.ProcessState is non-nil.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread packages/envd/internal/services/process/handler/handler.go Outdated
If the process failed to start or Wait returned a non-ExitError,
ProcessState can be nil. Guard the ExitCode/Exited/Status field
access to prevent a nil pointer dereference.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 50f37204d6

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread packages/envd/spec/process/process.proto
Sync the generated Go protobuf files with the updated proto field
comments documenting EndEvent semantics.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5904304bdc

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread packages/envd/internal/services/process/handler/handler.go Outdated
adababys and others added 2 commits July 1, 2026 14:49
When a process is killed by a signal (e.g., via SendSignal RPC),
cmd.Wait() returns *exec.ExitError with Exited()==false. This is
still a normal process result, not a runtime/infrastructure error.

Remove the Exited() check so that all *exec.ExitError cases with
a valid ProcessState are treated as process results. Signal kills
are described by status ('signal: killed') and exited (false).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Clarify EndEvent.error semantics for normal non-zero exits in envd's process handler

2 participants