Skip to content

fix: normalize restored tool results#2320

Open
he-yufeng wants to merge 1 commit into
strands-agents:mainfrom
he-yufeng:fix/session-manager-orphan-tool-results
Open

fix: normalize restored tool results#2320
he-yufeng wants to merge 1 commit into
strands-agents:mainfrom
he-yufeng:fix/session-manager-orphan-tool-results

Conversation

@he-yufeng
Copy link
Copy Markdown
Contributor

Fixes #2296.

The repository session recovery path already fills in missing toolResult blocks, but it kept every existing toolResult in the following user message. If persisted history already contains a stale or duplicate toolResult, the repaired message can end up with more toolResult blocks than the previous assistant toolUse blocks, which Bedrock rejects.

This keeps the recovery narrow: for the user message immediately after an assistant toolUse message, preserve only the first toolResult matching each previous toolUse id, drop unrelated or duplicate toolResults, and then add any missing synthetic error results.

Validation:

  • .venv\Scripts\python.exe -m pytest tests\strands\session\test_repository_session_manager.py::test_fix_broken_tool_use_extends_partial_tool_results tests\strands\session\test_repository_session_manager.py::test_fix_broken_tool_use_drops_extra_tool_results tests\strands\session\test_repository_session_manager.py::test_fix_broken_tool_use_handles_multiple_orphaned_tools -q
  • .venv\Scripts\python.exe -m pytest tests\strands\session\test_repository_session_manager.py -q
  • .venv\Scripts\python.exe -m ruff format --check src\strands\session\repository_session_manager.py tests\strands\session\test_repository_session_manager.py
  • .venv\Scripts\python.exe -m ruff check src\strands\session\repository_session_manager.py tests\strands\session\test_repository_session_manager.py
  • .venv\Scripts\python.exe -m py_compile src\strands\session\repository_session_manager.py tests\strands\session\test_repository_session_manager.py
  • git diff --check

@he-yufeng he-yufeng force-pushed the fix/session-manager-orphan-tool-results branch from 0fada87 to 4dc960a Compare May 27, 2026 09:45
@github-actions github-actions Bot removed the size/s label May 27, 2026
@he-yufeng
Copy link
Copy Markdown
Contributor Author

Rebased this onto current main and revalidated the focused session-manager path. Current head is 4dc960a. Validation from strands-py: uv run pytest tests/strands/session/test_repository_session_manager.py -q --basetemp ...tmp\pytest-2320 -p no:cacheprovider: 42 passed; uv run ruff check src/strands/session/repository_session_manager.py tests/strands/session/test_repository_session_manager.py; uv run ruff format --check src/strands/session/repository_session_manager.py tests/strands/session/test_repository_session_manager.py; python -m py_compile strands-py\src\strands\session\repository_session_manager.py strands-py\tests\strands\session\test_repository_session_manager.py; git diff --check origin/main...HEAD.

@yonib05 yonib05 added area-persistence Session management or checkpointing area-context Session or context related labels May 27, 2026
@gautamsirdeshmukh
Copy link
Copy Markdown
Contributor

/strands review

if missing_tool_use_ids or removed_tool_results:
logger.warning(
"Session message history has an orphaned toolUse with no toolResult. "
"Adding toolResult content blocks to create valid conversation."
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issue: The warning message is misleading when only duplicate/stale results are removed but no tool results are actually missing (i.e., removed_tool_results=True and missing_tool_use_ids is empty). The message says "orphaned toolUse with no toolResult" and "Adding toolResult content blocks" but no blocks are added in that scenario.

Suggestion: Consider differentiating the log message based on the actual repair being performed:

if missing_tool_use_ids:
    logger.warning(
        "Session message history has an orphaned toolUse with no toolResult. "
        "Adding toolResult content blocks to create valid conversation."
    )
if removed_tool_results:
    logger.warning(
        "Session message history has duplicate or unrelated toolResult blocks. "
        "Removing extra toolResult content blocks to create valid conversation."
    )

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed with the bot callout, I think specific log statements would be helpful here

assert missing_result["toolResult"]["content"][0]["text"] == "Tool was interrupted."


def test_fix_broken_tool_use_drops_extra_tool_results(session_manager):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issue: Missing docstring. The neighboring tests (test_fix_broken_tool_use_extends_partial_tool_results, test_fix_broken_tool_use_handles_multiple_orphaned_tools) all have docstrings describing what they test. This one should too for consistency.

Suggestion: Add a brief docstring:

def test_fix_broken_tool_use_drops_extra_tool_results(session_manager):
    """Test that stale and duplicate toolResult blocks are removed."""

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, docstring would be great

tool_results = [content["toolResult"] for content in fixed_messages[1]["content"]]
assert [tool_result["toolUseId"] for tool_result in tool_results] == ["complete-123", "missing-456"]
assert tool_results[0]["content"] == [{"text": "ok"}]
assert tool_results[1]["status"] == "error"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issue: Consider adding a test for the edge case where removed_tool_results=True but missing_tool_use_ids is empty (i.e., all toolUse IDs have a matching result, but there are also stale/duplicate results to drop). This scenario exercises the if missing_tool_use_ids or removed_tool_results path differently — generate_missing_tool_result_content returns [] and the content is just filtered_content.

Suggestion:

def test_fix_broken_tool_use_drops_stale_results_with_no_missing_ids(session_manager):
    """Test that stale toolResults are removed even when all toolUse IDs are satisfied."""
    messages = [
        {
            "role": "assistant",
            "content": [
                {"toolUse": {"toolUseId": "id-1", "name": "t", "input": {}}},
            ],
        },
        {
            "role": "user",
            "content": [
                {"toolResult": {"toolUseId": "id-1", "status": "success", "content": [{"text": "ok"}]}},
                {"toolResult": {"toolUseId": "stale-999", "status": "error", "content": [{"text": "old"}]}},
            ],
        },
    ]
    fixed = session_manager._fix_broken_tool_use(messages)
    results = [c["toolResult"] for c in fixed[1]["content"]]
    assert len(results) == 1
    assert results[0]["toolUseId"] == "id-1"

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with the bot again! I think we need this test for sure

@@ -282,25 +282,39 @@ def _fix_broken_tool_use(self, messages: list[Message]) -> list[Message]:
]

# Check if there are more messages after the current toolUse message
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issue: The comment says "Check if there are more messages after the current toolUse message" but that's the old comment that no longer accurately describes what the block below now does (filtering and deduplication).

Suggestion: Update to something like:

# Filter the next message's content: keep only the first toolResult matching each toolUse id,
# drop duplicates and unrelated toolResults.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is definitely not blocking at all but if you are taking care of the others then might as well :)

@github-actions
Copy link
Copy Markdown
Contributor

Assessment: Comment (leans approve)

The fix is well-scoped and correctly addresses the bug: it deduplicates and filters stale toolResult blocks so that the count never exceeds the preceding toolUse blocks. The logic is sound and the new test covers the key scenario.

Review Categories
  • Logging accuracy: The warning message is misleading in the "only duplicates removed, no missing IDs" path — it claims orphaned toolUse and adding blocks when it's actually removing extras.
  • Test coverage: Missing an edge case where all toolUse IDs are satisfied but stale results still need to be dropped (exercises removed_tool_results=True with empty missing_tool_use_ids).
  • Code comments: One inherited comment is stale and no longer describes the new filtering logic.

Clean, focused bugfix — nice work keeping the repair narrow and preserving existing behavior for the happy paths.

@yonib05 yonib05 added python Pull requests that update python code bug Something isn't working labels May 29, 2026
@gautamsirdeshmukh
Copy link
Copy Markdown
Contributor

Hi @he-yufeng , thanks for taking this up – if you can address the 4 callouts above, I am happy to shepherd this PR through!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area-context Session or context related area-persistence Session management or checkpointing bug Something isn't working python Pull requests that update python code size/s

Projects

None yet

Development

Successfully merging this pull request may close these issues.

bug: repository_session_manager adds incorrect number of toolResult blocks when fixing orphaned toolUse

3 participants