Skip to content

bugfix(network): Reset network command id to keep commands in chronological order#2736

Open
Caball009 wants to merge 3 commits into
TheSuperHackers:mainfrom
Caball009:fix_network_commandID_overflow
Open

bugfix(network): Reset network command id to keep commands in chronological order#2736
Caball009 wants to merge 3 commits into
TheSuperHackers:mainfrom
Caball009:fix_network_commandID_overflow

Conversation

@Caball009
Copy link
Copy Markdown

@Caball009 Caball009 commented May 19, 2026

Synchronized network commands are given a unique id so that they can be sorted chronologically. The id needs to be unique per execution frame. It's a uint16 value so it'll overflow quite fast, especially because it's never reset and the starting value is 64000 by default.

When the network runahead decreases it's expected that multiple frames are executed right after each other. I noticed an issue with the overflow when I set the CRC interval to once per frame. When the overflow happened for a client combined with a decrease of the runahead, the order of the CRC messages was no longer guaranteed to be in chronological order for that execution frame. That would cause a mismatch if the overflow didn't happen for all clients at the same time.

With a starting value of 64000 and a CRC message every frame the first overflow happens after ~25 seconds. You can see it mismatch here because of the overflow (this was with a special test build);
https://www.youtube.com/live/V_l-q9Y-DmA?t=621s
https://www.youtube.com/live/V_l-q9Y-DmA?t=9499s

I've added some test code in the first commit. If that's used without this fix, it becomes very apparent that the messages are not in the right order when the command id overflows. On top of that, it's very likely that the disconnect bug (disconnect screen with everyone at ~59XXX msec) occurs frequently.

I intend to do a follow up PR to make the static commandID variable a part of the NetCommandMsg class.

@Caball009 Caball009 added Bug Something is not working right, typically is user facing Major Severity: Minor < Major < Critical < Blocker Network Anything related to network, servers Gen Relates to Generals ZH Relates to Zero Hour labels May 19, 2026
@Caball009 Caball009 force-pushed the fix_network_commandID_overflow branch 2 times, most recently from ee6bcd8 to 1f31061 Compare May 31, 2026 18:32
@Caball009 Caball009 marked this pull request as ready for review May 31, 2026 18:46
@Caball009 Caball009 requested a review from Mauller May 31, 2026 18:46
@greptile-apps
Copy link
Copy Markdown

greptile-apps Bot commented May 31, 2026

Greptile Summary

This PR fixes a determinism bug where the global commandID counter (a uint16) could overflow and wrap around to low values mid-session, causing network commands for the same execution frame to be sorted out of chronological order. The fix resets commandID to zero at the start of each Network::update() cycle, subject to a gate of at most once every MAX_FRAMES_AHEAD (128) frames when the client is actively executing ahead.

  • NetworkUtil.cpp: Promotes commandID from a function-local static to a file-scope static initialized to 0, adds ResetCommandID(), and switches GenerateNextCommandID() from pre-increment to post-increment.
  • Network.cpp: Calls ResetCommandID() in init() for a clean start, adds m_lastFrameResetCommandID to throttle resets to once per MAX_FRAMES_AHEAD frames, and reuses the already-fetched frame local throughout update().
  • NetCommandList.cpp: Adds an informational comment explaining that the sort does not handle overflows and why resetting is preferred over changing the sort for retail compatibility.

Confidence Score: 5/5

Safe to merge; the reset is tightly scoped and both initialization sites keep the counter consistent.

The change is narrow: a periodic ResetCommandID() call in Network::update(), guarded by a 128-frame throttle that only fires when the client is actively executing ahead. The reset is also called unconditionally in init(), so a fresh game always starts at zero. The m_lastFrameResetCommandID member is initialised in both the constructor and init(), preventing any stale-value risk across re-initialisation.

No files require special attention.

Important Files Changed

Filename Overview
Core/GameEngine/Source/GameNetwork/NetworkUtil.cpp Moves commandID to file scope, adds ResetCommandID(), and changes GenerateNextCommandID() to post-increment semantics; straightforward and correct.
Core/GameEngine/Source/GameNetwork/Network.cpp Adds m_lastFrameResetCommandID member and periodic reset logic in update(); the throttle condition and initialization in both constructor and init() are consistent and correct.
Core/GameEngine/Include/GameNetwork/networkutil.h Adds ResetCommandID() declaration; already uses #pragma once; no issues.
Core/GameEngine/Source/GameNetwork/NetCommandList.cpp Adds an informational comment explaining the overflow limitation of the sort; no logic changes.

Sequence Diagram

sequenceDiagram
    participant GL as GameLogic
    participant Net as Network::update()
    participant CID as commandID (static)
    participant CM as ConnectionManager

    Note over Net,CID: Every MAX_FRAMES_AHEAD frames (when executing ahead)
    Net->>CID: "ResetCommandID() → commandID = 0"
    GL->>CM: Game commands queued on TheCommandList
    Net->>CM: GetCommandsFromCommandList()
    CM->>CID: GenerateNextCommandID() → 0, 1, 2 …
    CM-->>Net: Commands tagged with fresh IDs
    Net->>CM: liteupdate() / FRAMEINFO / RUNAHEAD msgs
    CM->>CID: GenerateNextCommandID() → continues monotonically
    Net->>Net: AllCommandsReady(frame)?
    Net->>GL: RelayCommandsToCommandList(frame)
    Note over GL: Commands sorted by (type, playerID, commandID) — all IDs monotone within one frame
Loading

Reviews (4): Last reviewed commit: "Added code to reset command id." | Re-trigger Greptile

@Caball009 Caball009 force-pushed the fix_network_commandID_overflow branch from 1f31061 to ebf1fa2 Compare May 31, 2026 18:59
Comment thread Core/GameEngine/Source/GameNetwork/NetworkUtil.cpp Outdated
@Caball009 Caball009 force-pushed the fix_network_commandID_overflow branch from ebf1fa2 to a7ff356 Compare June 2, 2026 11:45
@helmutbuhler
Copy link
Copy Markdown

Resetting the command id just delays the potential mismatch, right? Wouldn't it be better to sort the command ids properly, even if that rarely causes mismatches with retail clients?

@Caball009
Copy link
Copy Markdown
Author

Resetting the command id just delays the potential mismatch, right?

It's reset every N frames, so it's not just delayed.

@Caball009 Caball009 force-pushed the fix_network_commandID_overflow branch from a7ff356 to 3ae2622 Compare June 3, 2026 01:29
Copy link
Copy Markdown

@xezon xezon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The id needs to be unique per execution frame.

Can we just reset it to 0 or 1 at the beginning of every new frame? Or does that cause issues? What is the reason for it to reset every 128 frames now?

static UnsignedShort commandID = 64000;
++commandID;
return commandID;
static UnsignedShort commandID = 0;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can use s_ prefix for static var.

static UnsignedShort commandID = 64000;
++commandID;
return commandID;
static UnsignedShort commandID = 0;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it ok to start it at 0 when retail clients still start at 64000 ? I suspect this means every client can make up his own Id numbers as he pleases ?

Any clue why the original author picked 64000 as a start for it ?

#endif

const UnsignedInt frame = TheGameLogic->getFrame();
if (m_localStatus == NETLOCALSTATUS_INGAME) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style: empty line here


const UnsignedInt frame = TheGameLogic->getFrame();
if (m_localStatus == NETLOCALSTATUS_INGAME) {
if (frame + m_runAhead > m_lastExecutionFrame) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not understand this condition. Can you explain it?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Bug Something is not working right, typically is user facing Gen Relates to Generals Major Severity: Minor < Major < Critical < Blocker Network Anything related to network, servers ZH Relates to Zero Hour

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants