LCORE-1830: Implement Question Validity Safety Capability in Pydantic AI#1913
LCORE-1830: Implement Question Validity Safety Capability in Pydantic AI#1913Jazzcort wants to merge 1 commit into
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughA new ChangesQuestion Validity Capability
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
✨ Simplify code
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/pydantic_ai_lightspeed/capabilities/__init__.py`:
- Line 8: Replace the relative re-exports with absolute imports: in
src/pydantic_ai_lightspeed/capabilities/__init__.py change the relative import
of QuestionValidity to the package-absolute path (importing QuestionValidity
from pydantic_ai_lightspeed.capabilities.question_validity), and apply the same
pattern in src/pydantic_ai_lightspeed/capabilities/question_validity/__init__.py
for any internal re-exports; update the import statements that currently use the
“from .module import Name” form to “from
pydantic_ai_lightspeed.capabilities.module import Name” so they follow the
project’s absolute-import guideline.
In `@src/pydantic_ai_lightspeed/capabilities/question_validity/_capacity.py`:
- Line 13: Replace usage of the stdlib logger with the project logger factory:
remove the "import logging" and instead import get_logger from the project's log
helper and create a module-level logger via logger = get_logger(__name__)
(replace any logging.getLogger(...) occurrences in this module, e.g., where the
current logger is referenced at the top-level and around the symbol that
initializes the logger on line ~25). Ensure all subsequent logger calls use this
module-level logger variable.
- Line 71: Add Google-style docstrings and complete type annotations for all
functions in this module: specifically document
_extract_message_str_from_user_content with a short description, Args, Returns
and Raises sections and any other module functions referenced near the top and
middle of the file; explicitly annotate __post_init__ with -> None and add a
Google-style docstring describing its post-initialization behavior; ensure every
function/method has full parameter and return type hints and descriptive
docstrings (follow Google Python docstring conventions).
- Line 152: The guardrail comparison currently does an exact match (result.text
== SUBJECT_ALLOWED) which fails on casing/whitespace variants; change the check
to normalize the classifier output first (e.g., normalized =
result.text.strip().upper()) and compare normalized against SUBJECT_ALLOWED and
SUBJECT_REJECTED (or compare against normalized versions of those constants)
inside the same function/method where result.text is used; also add unit tests
in TestWrapRun exercising variants like "allowed", " ALLOWED\n", and mixed-case
to ensure normalization works.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 5d0b2f52-fac2-4f56-8e38-bc92319852c6
📒 Files selected for processing (6)
src/pydantic_ai_lightspeed/capabilities/__init__.pysrc/pydantic_ai_lightspeed/capabilities/question_validity/__init__.pysrc/pydantic_ai_lightspeed/capabilities/question_validity/_capacity.pytests/unit/pydantic_ai_lightspeed/capabilities/__init__.pytests/unit/pydantic_ai_lightspeed/capabilities/question_validity/__init__.pytests/unit/pydantic_ai_lightspeed/capabilities/question_validity/test_capacity.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: E2E Tests for Lightspeed Evaluation job
- GitHub Check: E2E: library mode / ci / group 2
- GitHub Check: E2E: library mode / ci / group 1
- GitHub Check: E2E: server mode / ci / group 3
- GitHub Check: E2E: server mode / ci / group 2
- GitHub Check: E2E: server mode / ci / group 1
- GitHub Check: E2E: library mode / ci / group 3
🧰 Additional context used
📓 Path-based instructions (3)
tests/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
tests/**/*.py: Use pytest for all unit and integration tests; do not use unittest
Usepytest.mark.asynciomarker for async tests
Files:
tests/unit/pydantic_ai_lightspeed/capabilities/__init__.pytests/unit/pydantic_ai_lightspeed/capabilities/question_validity/__init__.pytests/unit/pydantic_ai_lightspeed/capabilities/question_validity/test_capacity.py
src/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
src/**/*.py: Use absolute imports for internal modules:from authentication import get_auth_dependency
Llama Stack imports: Usefrom llama_stack_client import AsyncLlamaStackClient
Checkconstants.pyfor shared constants before defining new ones
All modules must start with descriptive docstrings explaining purpose
Uselogger = get_logger(__name__)fromlog.pyfor module logging
All functions must have complete type annotations for parameters and return types, use modern syntax (str | int), and include descriptive docstrings
Use snake_case with descriptive, action-oriented names for functions (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns; return new data structures instead of modifying function parameters
Useasync deffor I/O operations and external API calls
Use standard log levels with clear purposes:debug()for diagnostic info,info()for program execution,warning()for unexpected events,error()for serious problems
All classes must have descriptive docstrings explaining purpose and use PascalCase with standard suffixes:Configuration,Error/Exception,Resolver,Interface
Abstract classes must use ABC with@abstractmethoddecorators
Follow Google Python docstring conventions with required sections: Parameters, Returns, Raises, and Attributes for classes
Files:
src/pydantic_ai_lightspeed/capabilities/__init__.pysrc/pydantic_ai_lightspeed/capabilities/question_validity/__init__.pysrc/pydantic_ai_lightspeed/capabilities/question_validity/_capacity.py
src/**/__init__.py
📄 CodeRabbit inference engine (AGENTS.md)
Package
__init__.pyfiles must contain brief package descriptions
Files:
src/pydantic_ai_lightspeed/capabilities/__init__.pysrc/pydantic_ai_lightspeed/capabilities/question_validity/__init__.py
🔇 Additional comments (3)
tests/unit/pydantic_ai_lightspeed/capabilities/question_validity/test_capacity.py (1)
1-520: LGTM!tests/unit/pydantic_ai_lightspeed/capabilities/__init__.py (1)
1-2: LGTM!tests/unit/pydantic_ai_lightspeed/capabilities/question_validity/__init__.py (1)
1-2: LGTM!
| # Include token usage from the question validity request | ||
| ctx.usage.incr(result.usage) | ||
|
|
||
| if result.text == SUBJECT_ALLOWED: |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify guardrail tests currently cover normalization behavior.
rg -n -C2 'SUBJECT_ALLOWED|result\.text|wrap_run' tests/unit/pydantic_ai_lightspeed/capabilities/question_validity/test_capacity.pyRepository: lightspeed-core/lightspeed-stack
Length of output: 4763
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Show the exact comparison at the referenced line range
python - <<'PY'
from pathlib import Path
path = Path("src/pydantic_ai_lightspeed/capabilities/question_validity/_capacity.py")
lines = path.read_text().splitlines()
start = 130
end = 175
for i in range(start, min(end, len(lines))):
print(f"{i+1:4d}: {lines[i]}")
PY
# Check tests for normalization/whitespace/casing expectations
rg -n "SUBJECT_ALLOWED|SUBJECT_REJECTED|strip\(|upper\(|lower\(|whitespace|casefold|normalize" \
tests/unit/pydantic_ai_lightspeed/capabilities/question_validity/test_capacity.pyRepository: lightspeed-core/lightspeed-stack
Length of output: 2123
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# 1) Inspect the wrap_run tests area
python - <<'PY'
from pathlib import Path
path = Path("tests/unit/pydantic_ai_lightspeed/capabilities/question_validity/test_capacity.py")
lines = path.read_text().splitlines()
# Print a focused window around the wrap_run tests
start = 240
end = 560
for i in range(start, min(end, len(lines))):
print(f"{i+1:4d}: {lines[i]}")
PY
# 2) Search for any normalization on result.text in this capability
rg -n "result\.text|strip\(|upper\(|lower\(|casefold\(|normalize" \
src/pydantic_ai_lightspeed/capabilities/question_validity/_capacity.py src/pydantic_ai_lightspeed/capabilities/question_validity -S
# 3) Search the tests for any variants (e.g., whitespace/case) around SUBJECT_ALLOWED/REJECTED
rg -n "TextPart\(content=\s*['\"]\s*(SUBJECT_ALLOWED|SUBJECT_REJECTED)" tests/unit/pydantic_ai_lightspeed/capabilities/question_validity/test_capacity.py
rg -n "SUBJECT_ALLOWED\s*(\)|\]|,)|SUBJECT_REJECTED\s*(\)|\]|,)" tests/unit/pydantic_ai_lightspeed/capabilities/question_validity/test_capacity.pyRepository: lightspeed-core/lightspeed-stack
Length of output: 12090
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Locate model_request implementation (within this capability if present)
rg -n "def model_request|async def model_request|model_request\s*\(" -S src/pydantic_ai_lightspeed/capabilities/question_validity
# Show the top of _capacity.py to understand how result.text is produced/used
python - <<'PY'
from pathlib import Path
path = Path("src/pydantic_ai_lightspeed/capabilities/question_validity/_capacity.py")
lines = path.read_text().splitlines()
for i in range(1, min(220, len(lines))+1):
if i<=40 or (140<=i<=180):
print(f"{i:4d}: {lines[i-1]}")
PY
# Search for other SUBJECT_ALLOWED comparisons
rg -n "result\.text\s*==\s*SUBJECT_ALLOWED|SUBJECT_ALLOWED\s*in\s*result\.text|SUBJECT_ALLOWED" \
src/pydantic_ai_lightspeed/capabilities/question_validity -S
# Search tests for any normalization or casing/whitespace cases
rg -n "SUBJECT_ALLOWED.*(strip|upper|lower|casefold)|SUBJECT_REJECTED.*(strip|upper|lower|casefold)" \
tests/unit/pydantic_ai_lightspeed/capabilities/question_validity/test_capacity.pyRepository: lightspeed-core/lightspeed-stack
Length of output: 3233
Normalize guardrail classifier output before ALLOWED/REJECTED comparison.
src/pydantic_ai_lightspeed/capabilities/question_validity/_capacity.py line 152 does an exact result.text == SUBJECT_ALLOWED match, so casing/whitespace variants (e.g., "allowed", " ALLOWED\n") would be treated as rejected. Current TestWrapRun unit tests only exercise exact SUBJECT_ALLOWED/SUBJECT_REJECTED strings (and a non-matching string), with no coverage for whitespace/case normalization.
Suggested fix
- if result.text == SUBJECT_ALLOWED:
+ classification = result.text.strip().upper()
+ if classification == SUBJECT_ALLOWED:
return await handler() # proceed with the real run
else:
# short-circuit: return the rejection message with shield usage tracked
state = GraphAgentState(usage=ctx.usage)
return AgentRunResult(output=self.invalid_question_response, _state=state)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/pydantic_ai_lightspeed/capabilities/question_validity/_capacity.py` at
line 152, The guardrail comparison currently does an exact match (result.text ==
SUBJECT_ALLOWED) which fails on casing/whitespace variants; change the check to
normalize the classifier output first (e.g., normalized =
result.text.strip().upper()) and compare normalized against SUBJECT_ALLOWED and
SUBJECT_REJECTED (or compare against normalized versions of those constants)
inside the same function/method where result.text is used; also add unit tests
in TestWrapRun exercising variants like "allowed", " ALLOWED\n", and mixed-case
to ensure normalization works.
Jazzcort
left a comment
There was a problem hiding this comment.
The comment is just some bugs that I found when I tried to wire up this question validity shield with pydantic agent. 😁
| return "\n".join(str_arr) | ||
|
|
||
|
|
||
| def _remove_conversation_from_settings(model: Model) -> Model: |
There was a problem hiding this comment.
During my testing, if we share Model either through passing the same Model instance as what is passed to Agent in build_agent or grabbing it from ctx.model, the request made by question validity shield contaminate the stored conversation history in Llama Stack. Therefore, the solutions that I can think of are either we copy the Model instance with "conversation" field removed from "extra_body" like this or we pass a separate Model instance without "conversation" field in "extra_body" when we create this capability in build_agent.
There was a problem hiding this comment.
Model used by question validity shield should be specified in lightspeed stack config not inherited from query request. Therefore, there should always be a separate model instance for the shield (see my other comments). Do you think that this will help with the interference?
There was a problem hiding this comment.
or we pass a separate Model instance without "conversation" field in "extra_body" when we create this capability in build_agent.
If we're passing another model instance for this then we don't need this actually. I first came up with this because I don't know what will be passing when this gets created and I guess I was just trying to do some defensive programming there. 😆 Anyway, I will remove this and its unit test since we don't need it 😁
There was a problem hiding this comment.
If you look at the config I linked below you see that there is a model_id. So just use that and create separate model instance ;)
3700804 to
9599ca5
Compare
There was a problem hiding this comment.
♻️ Duplicate comments (2)
src/pydantic_ai_lightspeed/capabilities/question_validity/_capacity.py (2)
158-158:⚠️ Potential issue | 🟠 Major | ⚡ Quick winNormalize the classifier output before the ALLOWED check.
Line 158 does an exact sentinel match, so harmless casing or whitespace differences from the model will incorrectly short-circuit valid questions as rejected.
[suggested fix]
Minimal fix
- if result.text == SUBJECT_ALLOWED: + classification = result.text.strip().upper() + if classification == SUBJECT_ALLOWED: return await handler() # proceed with the real run else: # short-circuit: return the rejection message with shield usage tracked state = GraphAgentState(usage=ctx.usage) return AgentRunResult(output=self.invalid_question_response, _state=state)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/pydantic_ai_lightspeed/capabilities/question_validity/_capacity.py` at line 158, Normalize the classifier output before comparing to the sentinel: ensure result.text is non-null, strip surrounding whitespace and case-normalize it (e.g., lower()) and compare that normalized string to a similarly normalized SUBJECT_ALLOWED (or normalize SUBJECT_ALLOWED once). Update the check around result.text == SUBJECT_ALLOWED in the capacity logic (the result.text comparison in _capacity.py) to use the normalized value so harmless casing/whitespace differences won’t short-circuit valid questions.
72-89: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick winAdd Google-style docstrings to the helpers and capability methods, and update the class description.
_extract_message_str_from_user_content,_remove_conversation_from_settings,__post_init__,_build_prompt, andwrap_rundo not meet the repository’s required docstring format, and theQuestionValidityclass docstring still describes a generic boolean-returning guard instead of this capability’s actual short-circuit behavior.As per coding guidelines,
src/**/*.py: “All functions must have complete type annotations for parameters and return types, use modern syntax (str | int), and include descriptive docstrings” and “Follow Google Python docstring conventions with required sections: Parameters, Returns, Raises, and Attributes for classes”.Also applies to: 105-118, 129-147
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/pydantic_ai_lightspeed/capabilities/question_validity/_capacity.py` around lines 72 - 89, Update the docstrings and type annotations to follow repository standards: add Google-style docstrings (with Parameters, Returns, Raises as appropriate) to the helper functions _extract_message_str_from_user_content(user_content: Sequence[UserContent]) -> str and _remove_conversation_from_settings(model: Model) -> Model, and to the methods __post_init__(self) -> None, _build_prompt(self, ...) -> str, and wrap_run(self, ...) -> Any (use PEP 604 union syntax where needed, e.g., str | None), ensuring all parameter and return types are fully annotated; also revise the QuestionValidity class docstring to accurately describe its short-circuit behavior (not a generic boolean guard) and include an Attributes section for key fields. Ensure docstrings succinctly document parameters, return values, raised exceptions, and any side effects without changing logic.Source: Coding guidelines
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Duplicate comments:
In `@src/pydantic_ai_lightspeed/capabilities/question_validity/_capacity.py`:
- Line 158: Normalize the classifier output before comparing to the sentinel:
ensure result.text is non-null, strip surrounding whitespace and case-normalize
it (e.g., lower()) and compare that normalized string to a similarly normalized
SUBJECT_ALLOWED (or normalize SUBJECT_ALLOWED once). Update the check around
result.text == SUBJECT_ALLOWED in the capacity logic (the result.text comparison
in _capacity.py) to use the normalized value so harmless casing/whitespace
differences won’t short-circuit valid questions.
- Around line 72-89: Update the docstrings and type annotations to follow
repository standards: add Google-style docstrings (with Parameters, Returns,
Raises as appropriate) to the helper functions
_extract_message_str_from_user_content(user_content: Sequence[UserContent]) ->
str and _remove_conversation_from_settings(model: Model) -> Model, and to the
methods __post_init__(self) -> None, _build_prompt(self, ...) -> str, and
wrap_run(self, ...) -> Any (use PEP 604 union syntax where needed, e.g., str |
None), ensuring all parameter and return types are fully annotated; also revise
the QuestionValidity class docstring to accurately describe its short-circuit
behavior (not a generic boolean guard) and include an Attributes section for key
fields. Ensure docstrings succinctly document parameters, return values, raised
exceptions, and any side effects without changing logic.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 0c6cff2a-26dd-4d38-bff2-bdac045dfd76
📒 Files selected for processing (6)
src/pydantic_ai_lightspeed/capabilities/__init__.pysrc/pydantic_ai_lightspeed/capabilities/question_validity/__init__.pysrc/pydantic_ai_lightspeed/capabilities/question_validity/_capacity.pytests/unit/pydantic_ai_lightspeed/capabilities/__init__.pytests/unit/pydantic_ai_lightspeed/capabilities/question_validity/__init__.pytests/unit/pydantic_ai_lightspeed/capabilities/question_validity/test_capacity.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: E2E: server mode / ci / group 1
- GitHub Check: E2E: server mode / ci / group 3
- GitHub Check: E2E: library mode / ci / group 3
- GitHub Check: E2E: server mode / ci / group 2
- GitHub Check: E2E: library mode / ci / group 1
- GitHub Check: E2E: library mode / ci / group 2
- GitHub Check: E2E Tests for Lightspeed Evaluation job
🧰 Additional context used
📓 Path-based instructions (3)
tests/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
tests/**/*.py: Use pytest for all unit and integration tests; do not use unittest
Usepytest.mark.asynciomarker for async tests
Files:
tests/unit/pydantic_ai_lightspeed/capabilities/__init__.pytests/unit/pydantic_ai_lightspeed/capabilities/question_validity/__init__.pytests/unit/pydantic_ai_lightspeed/capabilities/question_validity/test_capacity.py
src/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
src/**/*.py: Use absolute imports for internal modules:from authentication import get_auth_dependency
Llama Stack imports: Usefrom llama_stack_client import AsyncLlamaStackClient
Checkconstants.pyfor shared constants before defining new ones
All modules must start with descriptive docstrings explaining purpose
Uselogger = get_logger(__name__)fromlog.pyfor module logging
All functions must have complete type annotations for parameters and return types, use modern syntax (str | int), and include descriptive docstrings
Use snake_case with descriptive, action-oriented names for functions (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns; return new data structures instead of modifying function parameters
Useasync deffor I/O operations and external API calls
Use standard log levels with clear purposes:debug()for diagnostic info,info()for program execution,warning()for unexpected events,error()for serious problems
All classes must have descriptive docstrings explaining purpose and use PascalCase with standard suffixes:Configuration,Error/Exception,Resolver,Interface
Abstract classes must use ABC with@abstractmethoddecorators
Follow Google Python docstring conventions with required sections: Parameters, Returns, Raises, and Attributes for classes
Files:
src/pydantic_ai_lightspeed/capabilities/question_validity/__init__.pysrc/pydantic_ai_lightspeed/capabilities/__init__.pysrc/pydantic_ai_lightspeed/capabilities/question_validity/_capacity.py
src/**/__init__.py
📄 CodeRabbit inference engine (AGENTS.md)
Package
__init__.pyfiles must contain brief package descriptions
Files:
src/pydantic_ai_lightspeed/capabilities/question_validity/__init__.pysrc/pydantic_ai_lightspeed/capabilities/__init__.py
🔇 Additional comments (6)
tests/unit/pydantic_ai_lightspeed/capabilities/question_validity/test_capacity.py (1)
1-520: LGTM!tests/unit/pydantic_ai_lightspeed/capabilities/__init__.py (1)
1-1: LGTM!tests/unit/pydantic_ai_lightspeed/capabilities/question_validity/__init__.py (1)
1-1: LGTM!src/pydantic_ai_lightspeed/capabilities/__init__.py (1)
1-10: LGTM!src/pydantic_ai_lightspeed/capabilities/question_validity/__init__.py (1)
1-7: LGTM!src/pydantic_ai_lightspeed/capabilities/question_validity/_capacity.py (1)
1-69: LGTM!
jrobertboos
left a comment
There was a problem hiding this comment.
Instead of naming the files capacity they should be capability.
|
@jrobertboos My bad! Thanks for catching that! 😁 |
9599ca5 to
7e60d1c
Compare
|
Changed! @jrobertboos Thanks for reviewing! |
There was a problem hiding this comment.
Actionable comments posted: 3
♻️ Duplicate comments (7)
src/pydantic_ai_lightspeed/capabilities/question_validity/_capability.py (7)
129-130: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick winAdd docstring to
__post_init__method.Project guidelines require all functions to have descriptive docstrings.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/pydantic_ai_lightspeed/capabilities/question_validity/_capability.py` around lines 129 - 130, The __post_init__ method lacks a docstring; add a concise docstring above def __post_init__(self) describing its purpose (e.g., that it sanitizes/normalizes the model settings by removing conversation-related keys), mention that it mutates self.model and calls _remove_conversation_from_settings(self.model), and note expected behavior/return type (None) and any side effects so it complies with project docstring guidelines.Source: Coding guidelines
72-82: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick winAdd Google-style docstring.
The function lacks the required Args, Returns, and Raises sections specified in project guidelines.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/pydantic_ai_lightspeed/capabilities/question_validity/_capability.py` around lines 72 - 82, The function _extract_message_str_from_user_content is missing a Google-style docstring; update its docstring to include a short summary plus Args (describe user_content: Sequence[UserContent]), Returns (str: combined text joined by newlines), and Raises (document that no exceptions are raised or note specific exceptions if any can propagate), following the project's Google-style format; reference UserContent and TextContent in the description so readers know which variants are handled and keep the content concise and accurate.Source: Coding guidelines
145-163: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick winAdd Google-style docstring to
wrap_runmethod.Project guidelines require all functions to have descriptive docstrings with Args, Returns, and Raises sections.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/pydantic_ai_lightspeed/capabilities/question_validity/_capability.py` around lines 145 - 163, Add a Google-style docstring to the async method wrap_run that describes its purpose and behaviour (it builds a prompt, calls model_request, increments ctx.usage with the result usage, and either calls the provided handler to continue or short-circuits with an invalid response), and include Args (ctx: RunContext, handler: WrapRunHandler), Returns (AgentRunResult) and Raises (any exceptions propagated from model_request or handler) sections; reference the side-effect of ctx.usage.incr(result.usage) and mention the returned GraphAgentState when short-circuiting.Source: Coding guidelines
132-143: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick winAdd Google-style docstring to
_build_promptmethod.Project guidelines require all functions to have descriptive docstrings with Args, Returns, and Raises sections.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/pydantic_ai_lightspeed/capabilities/question_validity/_capability.py` around lines 132 - 143, Add a Google-style docstring to the _build_prompt method describing its purpose, parameters, return value, and potential errors: explain the message parameter (type: str | Sequence[UserContent] | None) and that it may be converted via _extract_message_str_from_user_content, state that it returns a str (the prompt created by substituting message, allowed=SUBJECT_ALLOWED, rejected=SUBJECT_REJECTED into self.model_prompt using Template), and document any Raises (e.g., Template-related errors or TypeError if message contains unsupported types). Place this docstring immediately above the def _build_prompt declaration and reference Template, self.model_prompt, and _extract_message_str_from_user_content in the description.Source: Coding guidelines
158-158:⚠️ Potential issue | 🟠 Major | ⚡ Quick winNormalize classifier output before comparison.
The exact match
result.text == SUBJECT_ALLOWEDwill fail if the LLM returns whitespace variants (e.g.," ALLOWED\n") or different casing (e.g.,"allowed"), causing valid inputs to be incorrectly rejected.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/pydantic_ai_lightspeed/capabilities/question_validity/_capability.py` at line 158, The equality check using `if result.text == SUBJECT_ALLOWED` should normalize the classifier output to avoid failing on whitespace or case differences; update the condition in `_capability.py` (where `result.text` is compared to `SUBJECT_ALLOWED`) to compare a normalized form such as `result.text.strip().casefold()` against the normalized `SUBJECT_ALLOWED` (e.g., `SUBJECT_ALLOWED.casefold()`) so trailing/leading whitespace and casing differences no longer cause false negatives.
85-101: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick winAdd Google-style docstring.
The function lacks the required Args, Returns, and Raises sections specified in project guidelines.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/pydantic_ai_lightspeed/capabilities/question_validity/_capability.py` around lines 85 - 101, The function _remove_conversation_from_settings lacks a Google-style docstring with Args, Returns, and Raises sections; add a docstring above the function following Google style that documents: Args (model: Model) — describe that a Model is accepted and that its settings may be mutated/copied; Returns — a Model instance (either the original or a shallow-copied Model with "conversation" removed from settings["extra_body"]); Raises — any exceptions that could be propagated (e.g., TypeError if model.settings is not a mapping) or state that no exceptions are raised under normal operation. Reference the function name _remove_conversation_from_settings and the Model type in the docstring so readers can locate and understand the behavior.Source: Coding guidelines
106-118: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick winAdd Attributes section to class docstring.
Project guidelines require class docstrings to include an Attributes section describing the class fields.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/pydantic_ai_lightspeed/capabilities/question_validity/_capability.py` around lines 106 - 118, Update the class docstring for QuestionValidity to include an Attributes section that lists each class field (name and type), a one-line description of its purpose, and default value if any; locate the docstring for the QuestionValidity class in src/pydantic_ai_lightspeed/capabilities/question_validity/_capability.py and add entries for all attributes defined on the class (e.g., model, any guard or config fields, etc.), following the project's docstring style and formatting.Source: Coding guidelines
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/pydantic_ai_lightspeed/capabilities/question_validity/_capability.py`:
- Line 99: The code is directly mutating the private attribute _model._settings;
instead, avoid touching private internals—use the Model's public API to produce
a modified instance (e.g., a copy/with/replace_settings method on the Model) and
set the new settings there, or if such an API is unavailable, create a shallow
clone of _model with the updated settings via its public constructor or copy
method and use that; if you must mutate the private field as a last resort, add
a clear comment documenting why and the risk and wrap the mutation in a single
well-scoped helper function (referencing _model and _settings to locate the
change).
- Line 32: Update the prompt string in _capability.py that currently reads "Your
job is to determine where or a user's question is related to kubernetes and/or
openshift technologies and to provide a one-word response." to fix the typo by
replacing "where or" with "whether" so it reads "Your job is to determine
whether a user's question is related to kubernetes and/or openshift technologies
and to provide a one-word response."; locate the prompt constant or variable
containing that text in the Question Validity capability and perform the
substitution.
In
`@tests/unit/pydantic_ai_lightspeed/capabilities/question_validity/test_capacity.py`:
- Around line 67-70: Update the test docstring for
test_sequence_with_non_text_content to accurately describe its behavior: state
that it verifies non-TextContent items (e.g., ImageUrl("fake.png")) are filtered
out and text items are preserved when calling
_extract_message_str_from_user_content, rather than claiming it tests a
"single-element sequence"; reference the test function name
test_sequence_with_non_text_content and the helper
_extract_message_str_from_user_content in the updated docstring for clarity.
---
Duplicate comments:
In `@src/pydantic_ai_lightspeed/capabilities/question_validity/_capability.py`:
- Around line 129-130: The __post_init__ method lacks a docstring; add a concise
docstring above def __post_init__(self) describing its purpose (e.g., that it
sanitizes/normalizes the model settings by removing conversation-related keys),
mention that it mutates self.model and calls
_remove_conversation_from_settings(self.model), and note expected
behavior/return type (None) and any side effects so it complies with project
docstring guidelines.
- Around line 72-82: The function _extract_message_str_from_user_content is
missing a Google-style docstring; update its docstring to include a short
summary plus Args (describe user_content: Sequence[UserContent]), Returns (str:
combined text joined by newlines), and Raises (document that no exceptions are
raised or note specific exceptions if any can propagate), following the
project's Google-style format; reference UserContent and TextContent in the
description so readers know which variants are handled and keep the content
concise and accurate.
- Around line 145-163: Add a Google-style docstring to the async method wrap_run
that describes its purpose and behaviour (it builds a prompt, calls
model_request, increments ctx.usage with the result usage, and either calls the
provided handler to continue or short-circuits with an invalid response), and
include Args (ctx: RunContext, handler: WrapRunHandler), Returns
(AgentRunResult) and Raises (any exceptions propagated from model_request or
handler) sections; reference the side-effect of ctx.usage.incr(result.usage) and
mention the returned GraphAgentState when short-circuiting.
- Around line 132-143: Add a Google-style docstring to the _build_prompt method
describing its purpose, parameters, return value, and potential errors: explain
the message parameter (type: str | Sequence[UserContent] | None) and that it may
be converted via _extract_message_str_from_user_content, state that it returns a
str (the prompt created by substituting message, allowed=SUBJECT_ALLOWED,
rejected=SUBJECT_REJECTED into self.model_prompt using Template), and document
any Raises (e.g., Template-related errors or TypeError if message contains
unsupported types). Place this docstring immediately above the def _build_prompt
declaration and reference Template, self.model_prompt, and
_extract_message_str_from_user_content in the description.
- Line 158: The equality check using `if result.text == SUBJECT_ALLOWED` should
normalize the classifier output to avoid failing on whitespace or case
differences; update the condition in `_capability.py` (where `result.text` is
compared to `SUBJECT_ALLOWED`) to compare a normalized form such as
`result.text.strip().casefold()` against the normalized `SUBJECT_ALLOWED` (e.g.,
`SUBJECT_ALLOWED.casefold()`) so trailing/leading whitespace and casing
differences no longer cause false negatives.
- Around line 85-101: The function _remove_conversation_from_settings lacks a
Google-style docstring with Args, Returns, and Raises sections; add a docstring
above the function following Google style that documents: Args (model: Model) —
describe that a Model is accepted and that its settings may be mutated/copied;
Returns — a Model instance (either the original or a shallow-copied Model with
"conversation" removed from settings["extra_body"]); Raises — any exceptions
that could be propagated (e.g., TypeError if model.settings is not a mapping) or
state that no exceptions are raised under normal operation. Reference the
function name _remove_conversation_from_settings and the Model type in the
docstring so readers can locate and understand the behavior.
- Around line 106-118: Update the class docstring for QuestionValidity to
include an Attributes section that lists each class field (name and type), a
one-line description of its purpose, and default value if any; locate the
docstring for the QuestionValidity class in
src/pydantic_ai_lightspeed/capabilities/question_validity/_capability.py and add
entries for all attributes defined on the class (e.g., model, any guard or
config fields, etc.), following the project's docstring style and formatting.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 9931376f-ec7f-4429-9dd3-3bab0bcdca00
📒 Files selected for processing (6)
src/pydantic_ai_lightspeed/capabilities/__init__.pysrc/pydantic_ai_lightspeed/capabilities/question_validity/__init__.pysrc/pydantic_ai_lightspeed/capabilities/question_validity/_capability.pytests/unit/pydantic_ai_lightspeed/capabilities/__init__.pytests/unit/pydantic_ai_lightspeed/capabilities/question_validity/__init__.pytests/unit/pydantic_ai_lightspeed/capabilities/question_validity/test_capacity.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: E2E Tests for Lightspeed Evaluation job
- GitHub Check: E2E: server mode / ci / group 3
- GitHub Check: E2E: server mode / ci / group 1
- GitHub Check: E2E: library mode / ci / group 1
- GitHub Check: E2E: library mode / ci / group 3
- GitHub Check: E2E: library mode / ci / group 2
- GitHub Check: E2E: server mode / ci / group 2
🧰 Additional context used
📓 Path-based instructions (3)
tests/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
tests/**/*.py: Use pytest for all unit and integration tests; do not use unittest
Usepytest.mark.asynciomarker for async tests
Files:
tests/unit/pydantic_ai_lightspeed/capabilities/question_validity/__init__.pytests/unit/pydantic_ai_lightspeed/capabilities/__init__.pytests/unit/pydantic_ai_lightspeed/capabilities/question_validity/test_capacity.py
src/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
src/**/*.py: Use absolute imports for internal modules:from authentication import get_auth_dependency
Llama Stack imports: Usefrom llama_stack_client import AsyncLlamaStackClient
Checkconstants.pyfor shared constants before defining new ones
All modules must start with descriptive docstrings explaining purpose
Uselogger = get_logger(__name__)fromlog.pyfor module logging
All functions must have complete type annotations for parameters and return types, use modern syntax (str | int), and include descriptive docstrings
Use snake_case with descriptive, action-oriented names for functions (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns; return new data structures instead of modifying function parameters
Useasync deffor I/O operations and external API calls
Use standard log levels with clear purposes:debug()for diagnostic info,info()for program execution,warning()for unexpected events,error()for serious problems
All classes must have descriptive docstrings explaining purpose and use PascalCase with standard suffixes:Configuration,Error/Exception,Resolver,Interface
Abstract classes must use ABC with@abstractmethoddecorators
Follow Google Python docstring conventions with required sections: Parameters, Returns, Raises, and Attributes for classes
Files:
src/pydantic_ai_lightspeed/capabilities/__init__.pysrc/pydantic_ai_lightspeed/capabilities/question_validity/__init__.pysrc/pydantic_ai_lightspeed/capabilities/question_validity/_capability.py
src/**/__init__.py
📄 CodeRabbit inference engine (AGENTS.md)
Package
__init__.pyfiles must contain brief package descriptions
Files:
src/pydantic_ai_lightspeed/capabilities/__init__.pysrc/pydantic_ai_lightspeed/capabilities/question_validity/__init__.py
🔇 Additional comments (11)
src/pydantic_ai_lightspeed/capabilities/__init__.py (1)
1-10: LGTM!src/pydantic_ai_lightspeed/capabilities/question_validity/__init__.py (1)
1-7: LGTM!tests/unit/pydantic_ai_lightspeed/capabilities/__init__.py (1)
1-1: LGTM!tests/unit/pydantic_ai_lightspeed/capabilities/question_validity/__init__.py (1)
1-1: LGTM!tests/unit/pydantic_ai_lightspeed/capabilities/question_validity/test_capacity.py (7)
1-22: LGTM!
24-34: LGTM!
39-66: LGTM!
73-143: LGTM!
145-189: LGTM!
191-242: LGTM!
245-519: LGTM!
| def test_sequence_with_non_text_content(self) -> None: | ||
| """Test extraction from a single-element sequence.""" | ||
| result = _extract_message_str_from_user_content([ImageUrl("fake.png"), "keep"]) | ||
| assert result == "keep" |
There was a problem hiding this comment.
Fix the test docstring to match the test behavior.
The docstring says "Test extraction from a single-element sequence." but the test actually validates that non-TextContent items (like ImageUrl) are filtered out while text content is preserved. This appears to be a copy-paste error from the previous test.
📝 Suggested docstring fix
- def test_sequence_with_non_text_content(self) -> None:
- """Test extraction from a single-element sequence."""
+ def test_sequence_with_non_text_content(self) -> None:
+ """Test that non-TextContent items are skipped during extraction."""
result = _extract_message_str_from_user_content([ImageUrl("fake.png"), "keep"])
assert result == "keep"📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def test_sequence_with_non_text_content(self) -> None: | |
| """Test extraction from a single-element sequence.""" | |
| result = _extract_message_str_from_user_content([ImageUrl("fake.png"), "keep"]) | |
| assert result == "keep" | |
| def test_sequence_with_non_text_content(self) -> None: | |
| """Test that non-TextContent items are skipped during extraction.""" | |
| result = _extract_message_str_from_user_content([ImageUrl("fake.png"), "keep"]) | |
| assert result == "keep" |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In
`@tests/unit/pydantic_ai_lightspeed/capabilities/question_validity/test_capacity.py`
around lines 67 - 70, Update the test docstring for
test_sequence_with_non_text_content to accurately describe its behavior: state
that it verifies non-TextContent items (e.g., ImageUrl("fake.png")) are filtered
out and text items are preserved when calling
_extract_message_str_from_user_content, rather than claiming it tests a
"single-element sequence"; reference the test function name
test_sequence_with_non_text_content and the helper
_extract_message_str_from_user_content in the updated docstring for clarity.
7e60d1c to
2b367b1
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/pydantic_ai_lightspeed/capabilities/question_validity/_capability.py`:
- Around line 106-109: Update the class docstring starting at line 106 to
accurately describe the actual runtime behavior. Instead of describing a boolean
guard-function that returns True if safe, the docstring should reflect that this
class performs LLM-based classification of user input validity and returns an
AgentRunResult to short-circuit execution when the input fails validation. Make
sure the updated docstring explains the LLM classification mechanism and the
AgentRunResult return behavior rather than a simple boolean contract.
In
`@tests/unit/pydantic_ai_lightspeed/capabilities/question_validity/test_capability.py`:
- Line 1: The module docstring at the top of the test file contains an incorrect
module name. Change the reference from `_capacity` to `_capability` in the
docstring to correctly reflect the actual module being tested, which is the
question_validity module with a `_capability` suffix, not `_capacity`.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: e8e1c2b6-691c-44c7-a648-83f15ff52355
📒 Files selected for processing (6)
src/pydantic_ai_lightspeed/capabilities/__init__.pysrc/pydantic_ai_lightspeed/capabilities/question_validity/__init__.pysrc/pydantic_ai_lightspeed/capabilities/question_validity/_capability.pytests/unit/pydantic_ai_lightspeed/capabilities/__init__.pytests/unit/pydantic_ai_lightspeed/capabilities/question_validity/__init__.pytests/unit/pydantic_ai_lightspeed/capabilities/question_validity/test_capability.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: E2E: server mode / ci / group 1
- GitHub Check: E2E: library mode / ci / group 1
- GitHub Check: E2E: library mode / ci / group 3
- GitHub Check: E2E: library mode / ci / group 2
- GitHub Check: E2E: server mode / ci / group 3
- GitHub Check: E2E: server mode / ci / group 2
- GitHub Check: E2E Tests for Lightspeed Evaluation job
🧰 Additional context used
📓 Path-based instructions (3)
tests/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
tests/**/*.py: Use pytest for all unit and integration tests; do not use unittest
Usepytest.mark.asynciomarker for async tests
Files:
tests/unit/pydantic_ai_lightspeed/capabilities/__init__.pytests/unit/pydantic_ai_lightspeed/capabilities/question_validity/__init__.pytests/unit/pydantic_ai_lightspeed/capabilities/question_validity/test_capability.py
src/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
src/**/*.py: Use absolute imports for internal modules:from authentication import get_auth_dependency
Llama Stack imports: Usefrom llama_stack_client import AsyncLlamaStackClient
Checkconstants.pyfor shared constants before defining new ones
All modules must start with descriptive docstrings explaining purpose
Uselogger = get_logger(__name__)fromlog.pyfor module logging
All functions must have complete type annotations for parameters and return types, use modern syntax (str | int), and include descriptive docstrings
Use snake_case with descriptive, action-oriented names for functions (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns; return new data structures instead of modifying function parameters
Useasync deffor I/O operations and external API calls
Use standard log levels with clear purposes:debug()for diagnostic info,info()for program execution,warning()for unexpected events,error()for serious problems
All classes must have descriptive docstrings explaining purpose and use PascalCase with standard suffixes:Configuration,Error/Exception,Resolver,Interface
Abstract classes must use ABC with@abstractmethoddecorators
Follow Google Python docstring conventions with required sections: Parameters, Returns, Raises, and Attributes for classes
Files:
src/pydantic_ai_lightspeed/capabilities/__init__.pysrc/pydantic_ai_lightspeed/capabilities/question_validity/__init__.pysrc/pydantic_ai_lightspeed/capabilities/question_validity/_capability.py
src/**/__init__.py
📄 CodeRabbit inference engine (AGENTS.md)
Package
__init__.pyfiles must contain brief package descriptions
Files:
src/pydantic_ai_lightspeed/capabilities/__init__.pysrc/pydantic_ai_lightspeed/capabilities/question_validity/__init__.py
🔇 Additional comments (9)
src/pydantic_ai_lightspeed/capabilities/question_validity/_capability.py (3)
32-32: Fix wording typo in classifier prompt text.Line 32 still says “where or” instead of “whether”, which weakens prompt clarity.
As per coding guidelines, “All modules must start with descriptive docstrings explaining purpose” and prompt/documentation text should remain clear and maintainable.
Source: Coding guidelines
72-73: Add full Google-style function/method docstrings consistently.Line 72 onward still has helper/method docstrings that do not include the required structured sections (e.g., Parameters, Returns, Raises), and methods like
_build_prompt/wrap_runremain under-documented for contract clarity.As per coding guidelines, “All functions must have complete type annotations for parameters and return types … and include descriptive docstrings” and “Follow Google Python docstring conventions with required sections: Parameters, Returns, Raises, and Attributes for classes”.
Also applies to: 85-89, 129-147
Source: Coding guidelines
158-159: Normalize classifier output before ALLOWED comparison.Line 158 still relies on exact equality. Variants like lowercase or trailing whitespace can incorrectly reject otherwise allowed questions.
src/pydantic_ai_lightspeed/capabilities/__init__.py (1)
1-10: LGTM!src/pydantic_ai_lightspeed/capabilities/question_validity/__init__.py (1)
1-8: LGTM!tests/unit/pydantic_ai_lightspeed/capabilities/__init__.py (1)
1-1: LGTM!tests/unit/pydantic_ai_lightspeed/capabilities/question_validity/__init__.py (1)
1-1: LGTM!tests/unit/pydantic_ai_lightspeed/capabilities/question_validity/test_capability.py (2)
56-58: This docstring mismatch was already flagged in earlier review feedback and still appears unresolved.
132-507: LGTM!
| """Block or modify user input based on a guardrail check. | ||
|
|
||
| The guard function receives the user prompt and returns True if safe. | ||
|
|
There was a problem hiding this comment.
Update class docstring to match actual behavior.
Line 106 describes a boolean guard-function contract, but this class performs LLM classification and short-circuiting with AgentRunResult. The docstring should reflect the real runtime behavior.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/pydantic_ai_lightspeed/capabilities/question_validity/_capability.py`
around lines 106 - 109, Update the class docstring starting at line 106 to
accurately describe the actual runtime behavior. Instead of describing a boolean
guard-function that returns True if safe, the docstring should reflect that this
class performs LLM-based classification of user input validity and returns an
AgentRunResult to short-circuit execution when the input fails validation. Make
sure the updated docstring explains the LLM classification mechanism and the
AgentRunResult return behavior rather than a simple boolean contract.
| @@ -0,0 +1,506 @@ | |||
| """Unit tests for pydantic_ai_lightspeed.capabilities.question_validity._capacity module.""" | |||
There was a problem hiding this comment.
Correct module name typo in the test docstring.
Line 1 says _capacity, but this test file targets _capability. Please align the docstring to the actual module name.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In
`@tests/unit/pydantic_ai_lightspeed/capabilities/question_validity/test_capability.py`
at line 1, The module docstring at the top of the test file contains an
incorrect module name. Change the reference from `_capacity` to `_capability` in
the docstring to correctly reflect the actual module being tested, which is the
question_validity module with a `_capability` suffix, not `_capacity`.
2b367b1 to
1ba3678
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (7)
tests/unit/pydantic_ai_lightspeed/capabilities/question_validity/test_capability.py (2)
56-57:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winUpdate the test docstring to match what the test actually verifies.
test_sequence_with_non_text_contentvalidates filtering non-text content while preserving text, not “single-element sequence”.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/unit/pydantic_ai_lightspeed/capabilities/question_validity/test_capability.py` around lines 56 - 57, Update the docstring of the test_sequence_with_non_text_content method to accurately describe what the test verifies. The current docstring states "Test extraction from a single-element sequence" but the test actually validates that non-text content is filtered out while text content is preserved. Replace the docstring with an accurate description of this filtering behavior.
1-1:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winCorrect the module docstring target name.
Line 1 references
_capacity, but this test file targets_capability.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/unit/pydantic_ai_lightspeed/capabilities/question_validity/test_capability.py` at line 1, The module docstring on line 1 of test_capability.py incorrectly references the module name as `_capacity` when it should reference `_capability`. Update the docstring to replace `_capacity` with `_capability` to accurately reflect the module being tested.src/pydantic_ai_lightspeed/capabilities/question_validity/_capability.py (5)
32-32:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winFix the prompt typo in the classifier instruction.
Line 32 says “where or a user's question” and should be “whether a user's question”. This wording error weakens prompt clarity for the guardrail classifier.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/pydantic_ai_lightspeed/capabilities/question_validity/_capability.py` at line 32, Fix the grammatical error in the classifier instruction prompt where it currently states "where or a user's question" by changing it to "whether a user's question". This typo appears in the instruction string that defines the classifier's task for determining if a user's question is related to Kubernetes and OpenShift technologies, and correcting it will improve the clarity and professionalism of the prompt.
106-109: 🧹 Nitpick | 🔵 Trivial | ⚡ Quick winAlign the class docstring with actual runtime behavior.
The current docstring describes a boolean guard-function contract, but the implementation performs LLM classification and short-circuits with
AgentRunResulton rejection.As per coding guidelines,
src/**/*.py: “All classes must have descriptive docstrings explaining purpose”.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/pydantic_ai_lightspeed/capabilities/question_validity/_capability.py` around lines 106 - 109, The docstring for the class describing the guardrail behavior is inaccurate and does not reflect the actual implementation. Update the class docstring to accurately describe that the implementation performs LLM classification (not just a simple boolean guard function) and that it short-circuits execution by returning an AgentRunResult on rejection, rather than simply returning True or False based on a basic guard condition.Source: Coding guidelines
158-158:⚠️ Potential issue | 🟠 Major | ⚡ Quick winNormalize classifier output before ALLOWED/REJECTED check.
Line 158 uses exact equality, so casing/whitespace variants (e.g.,
allowed,ALLOWED\n) are treated as rejected. Normalize before comparison and add tests for these variants.Suggested fix
- if result.text == SUBJECT_ALLOWED: + classification = result.text.strip().upper() + if classification == SUBJECT_ALLOWED: return await handler() # proceed with the real run else: # short-circuit: return the rejection message with shield usage tracked state = GraphAgentState(usage=ctx.usage) return AgentRunResult(output=self.invalid_question_response, _state=state)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/pydantic_ai_lightspeed/capabilities/question_validity/_capability.py` at line 158, The equality check at line 158 using `result.text == SUBJECT_ALLOWED` performs an exact match that fails for normalized variations of the classifier output. Normalize the result.text by stripping leading/trailing whitespace and converting to a consistent case (either upper or lower) before comparing it to SUBJECT_ALLOWED. Additionally, add test cases to verify that the comparison correctly handles output variants including lowercase letters, extra whitespace, and newline characters to ensure robustness against classifier output formatting differences.
72-73: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick winAdd complete Google-style docstrings for all functions/methods in this module.
_extract_message_str_from_user_content,_remove_conversation_from_settings,__post_init__,_build_prompt, andwrap_runare still missing complete Google-style sections (Parameters/Returns/Raises) required by repo standards.As per coding guidelines,
src/**/*.py: “All functions must have complete type annotations for parameters and return types … and include descriptive docstrings” and “Follow Google Python docstring conventions with required sections: Parameters, Returns, Raises, and Attributes for classes”.Also applies to: 85-89, 129-129, 132-133, 145-147
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/pydantic_ai_lightspeed/capabilities/question_validity/_capability.py` around lines 72 - 73, Add complete Google-style docstrings following repo standards to all functions missing required sections. For `_extract_message_str_from_user_content` (lines 72-73), add Parameters and Returns sections describing the user_content parameter and the returned string. For `_remove_conversation_from_settings` (lines 85-89), add Parameters, Returns, and Raises sections. For `__post_init__` (line 129), add a descriptive summary with Returns section. For `_build_prompt` (lines 132-133), add Parameters, Returns, and Raises sections documenting all arguments and possible exceptions. For `wrap_run` (lines 145-147), add Parameters, Returns, and Raises sections detailing the run_function parameter, return value, and any exceptions raised. Each docstring must follow Google Python conventions with all required sections present.Source: Coding guidelines
18-19:⚠️ Potential issue | 🟠 Major | 🏗️ Heavy liftReduce reliance on private pydantic-ai internals in runtime logic.
This implementation depends on underscored internals (
pydantic_ai._agent_graph,Model._settings, andAgentRunResult(..., _state=...)). That is brittle across pydantic-ai upgrades and should be replaced with public API paths (or isolated behind a compatibility adapter with version-pinned tests).Also applies to: 99-99, 162-163
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/pydantic_ai_lightspeed/capabilities/question_validity/_capability.py` around lines 18 - 19, This implementation depends on private pydantic-ai internals across three locations that must be updated. At lines 18-19, replace the import of GraphAgentState from the private pydantic_ai._agent_graph module with the equivalent public API from pydantic_ai if available, or create a compatibility adapter. At lines 99-99 and 162-163, replace all usage of private internals including Model._settings and AgentRunResult with the _state parameter with their public API equivalents from pydantic_ai. If public APIs do not exist for required functionality, isolate these private dependencies behind a version-pinned compatibility adapter to reduce brittleness across pydantic-ai upgrades.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/pydantic_ai_lightspeed/capabilities/question_validity/_capability.py`:
- Around line 150-153: Wrap the model_request call in a try-except block to
handle external service failures gracefully. Define an explicit fail-open or
fail-closed policy (decide whether to return a default response, raise an
exception, or log and continue), then implement that policy in the exception
handler with appropriate error logging. Add test cases that simulate
model_request failures to ensure the exception handling works as intended and
the agent continues or fails appropriately based on your policy choice.
---
Duplicate comments:
In `@src/pydantic_ai_lightspeed/capabilities/question_validity/_capability.py`:
- Line 32: Fix the grammatical error in the classifier instruction prompt where
it currently states "where or a user's question" by changing it to "whether a
user's question". This typo appears in the instruction string that defines the
classifier's task for determining if a user's question is related to Kubernetes
and OpenShift technologies, and correcting it will improve the clarity and
professionalism of the prompt.
- Around line 106-109: The docstring for the class describing the guardrail
behavior is inaccurate and does not reflect the actual implementation. Update
the class docstring to accurately describe that the implementation performs LLM
classification (not just a simple boolean guard function) and that it
short-circuits execution by returning an AgentRunResult on rejection, rather
than simply returning True or False based on a basic guard condition.
- Line 158: The equality check at line 158 using `result.text ==
SUBJECT_ALLOWED` performs an exact match that fails for normalized variations of
the classifier output. Normalize the result.text by stripping leading/trailing
whitespace and converting to a consistent case (either upper or lower) before
comparing it to SUBJECT_ALLOWED. Additionally, add test cases to verify that the
comparison correctly handles output variants including lowercase letters, extra
whitespace, and newline characters to ensure robustness against classifier
output formatting differences.
- Around line 72-73: Add complete Google-style docstrings following repo
standards to all functions missing required sections. For
`_extract_message_str_from_user_content` (lines 72-73), add Parameters and
Returns sections describing the user_content parameter and the returned string.
For `_remove_conversation_from_settings` (lines 85-89), add Parameters, Returns,
and Raises sections. For `__post_init__` (line 129), add a descriptive summary
with Returns section. For `_build_prompt` (lines 132-133), add Parameters,
Returns, and Raises sections documenting all arguments and possible exceptions.
For `wrap_run` (lines 145-147), add Parameters, Returns, and Raises sections
detailing the run_function parameter, return value, and any exceptions raised.
Each docstring must follow Google Python conventions with all required sections
present.
- Around line 18-19: This implementation depends on private pydantic-ai
internals across three locations that must be updated. At lines 18-19, replace
the import of GraphAgentState from the private pydantic_ai._agent_graph module
with the equivalent public API from pydantic_ai if available, or create a
compatibility adapter. At lines 99-99 and 162-163, replace all usage of private
internals including Model._settings and AgentRunResult with the _state parameter
with their public API equivalents from pydantic_ai. If public APIs do not exist
for required functionality, isolate these private dependencies behind a
version-pinned compatibility adapter to reduce brittleness across pydantic-ai
upgrades.
In
`@tests/unit/pydantic_ai_lightspeed/capabilities/question_validity/test_capability.py`:
- Around line 56-57: Update the docstring of the
test_sequence_with_non_text_content method to accurately describe what the test
verifies. The current docstring states "Test extraction from a single-element
sequence" but the test actually validates that non-text content is filtered out
while text content is preserved. Replace the docstring with an accurate
description of this filtering behavior.
- Line 1: The module docstring on line 1 of test_capability.py incorrectly
references the module name as `_capacity` when it should reference
`_capability`. Update the docstring to replace `_capacity` with `_capability` to
accurately reflect the module being tested.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: e93d1679-3307-4e20-a3c9-033b045df9d4
📒 Files selected for processing (6)
src/pydantic_ai_lightspeed/capabilities/__init__.pysrc/pydantic_ai_lightspeed/capabilities/question_validity/__init__.pysrc/pydantic_ai_lightspeed/capabilities/question_validity/_capability.pytests/unit/pydantic_ai_lightspeed/capabilities/__init__.pytests/unit/pydantic_ai_lightspeed/capabilities/question_validity/__init__.pytests/unit/pydantic_ai_lightspeed/capabilities/question_validity/test_capability.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: E2E: library mode / ci / group 2
- GitHub Check: E2E: server mode / ci / group 3
- GitHub Check: E2E: server mode / ci / group 1
- GitHub Check: E2E: server mode / ci / group 2
- GitHub Check: E2E: library mode / ci / group 3
- GitHub Check: E2E: library mode / ci / group 1
- GitHub Check: E2E Tests for Lightspeed Evaluation job
🧰 Additional context used
📓 Path-based instructions (3)
tests/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
tests/**/*.py: Use pytest for all unit and integration tests; do not use unittest
Usepytest.mark.asynciomarker for async tests
Files:
tests/unit/pydantic_ai_lightspeed/capabilities/__init__.pytests/unit/pydantic_ai_lightspeed/capabilities/question_validity/__init__.pytests/unit/pydantic_ai_lightspeed/capabilities/question_validity/test_capability.py
src/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
src/**/*.py: Use absolute imports for internal modules:from authentication import get_auth_dependency
Llama Stack imports: Usefrom llama_stack_client import AsyncLlamaStackClient
Checkconstants.pyfor shared constants before defining new ones
All modules must start with descriptive docstrings explaining purpose
Uselogger = get_logger(__name__)fromlog.pyfor module logging
All functions must have complete type annotations for parameters and return types, use modern syntax (str | int), and include descriptive docstrings
Use snake_case with descriptive, action-oriented names for functions (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns; return new data structures instead of modifying function parameters
Useasync deffor I/O operations and external API calls
Use standard log levels with clear purposes:debug()for diagnostic info,info()for program execution,warning()for unexpected events,error()for serious problems
All classes must have descriptive docstrings explaining purpose and use PascalCase with standard suffixes:Configuration,Error/Exception,Resolver,Interface
Abstract classes must use ABC with@abstractmethoddecorators
Follow Google Python docstring conventions with required sections: Parameters, Returns, Raises, and Attributes for classes
Files:
src/pydantic_ai_lightspeed/capabilities/question_validity/__init__.pysrc/pydantic_ai_lightspeed/capabilities/__init__.pysrc/pydantic_ai_lightspeed/capabilities/question_validity/_capability.py
src/**/__init__.py
📄 CodeRabbit inference engine (AGENTS.md)
Package
__init__.pyfiles must contain brief package descriptions
Files:
src/pydantic_ai_lightspeed/capabilities/question_validity/__init__.pysrc/pydantic_ai_lightspeed/capabilities/__init__.py
🔇 Additional comments (4)
src/pydantic_ai_lightspeed/capabilities/__init__.py (1)
1-10: LGTM!src/pydantic_ai_lightspeed/capabilities/question_validity/__init__.py (1)
1-7: LGTM!tests/unit/pydantic_ai_lightspeed/capabilities/__init__.py (1)
1-1: LGTM!tests/unit/pydantic_ai_lightspeed/capabilities/question_validity/__init__.py (1)
1-1: LGTM!
| ``` | ||
| """ | ||
|
|
||
| model: Model |
There was a problem hiding this comment.
Add a config.py module mirroring configuration from providers
Replace current attributes with config
|
|
||
|
|
||
| @dataclass | ||
| class QuestionValidity(AbstractCapability): |
There was a problem hiding this comment.
Use None as type argument for AbstractCapability which corresponds to Agent[None, str] definition. Add type arguments corresponding to current lightspeed agent implementation where possible.
There was a problem hiding this comment.
So I actually checked this from their codebase.
# TODO (v2): Change the default for all typevars like this from `None` to `object`
AgentDepsT = TypeVar('AgentDepsT', default=None, contravariant=True)
"""Type variable for agent dependencies."""It seems like None is the default value if we don't pass it. But, we can still pass it to be more explicit. I'm fine with either ways. Let me know what do you think! 😁
There was a problem hiding this comment.
Yeah, that's true. It's up to you, just keep it consistent with redaction capability so it's not confusing.
| message=_message, allowed=SUBJECT_ALLOWED, rejected=SUBJECT_REJECTED | ||
| ) | ||
|
|
||
| async def wrap_run( |
There was a problem hiding this comment.
For discussion: overriding before_model_request method rather than wrap_run seems more natural for shields in my opinion.
For question validity, overriding after_model_request is not necessary.
There was a problem hiding this comment.
So just one thing that I would like to bring up for consideration. The output from before_model_request if we bail out with SkipModelRequest(message) will still be processed by after_model_request and the Node lifecycle. Therefore, it has more chance to intertwine with other capabilities.
Follow the agent run lifecycle
run -> node_run -> model request -> ...
I feel like for question validity, we should have it as run-level since there is actually no need to further process the question downward if it's not valid.
It's just my thought. I'll follow whatever you guys think is the most suitable solution here. 😁
There was a problem hiding this comment.
Ok this sounds reasonable.
| return "\n".join(str_arr) | ||
|
|
||
|
|
||
| def _remove_conversation_from_settings(model: Model) -> Model: |
There was a problem hiding this comment.
Model used by question validity shield should be specified in lightspeed stack config not inherited from query request. Therefore, there should always be a separate model instance for the shield (see my other comments). Do you think that this will help with the interference?
| return "\n".join(str_arr) | ||
|
|
||
|
|
||
| def _remove_conversation_from_settings(model: Model) -> Model: |
There was a problem hiding this comment.
or we pass a separate Model instance without "conversation" field in "extra_body" when we create this capability in build_agent.
If we're passing another model instance for this then we don't need this actually. I first came up with this because I don't know what will be passing when this gets created and I guess I was just trying to do some defensive programming there. 😆 Anyway, I will remove this and its unit test since we don't need it 😁
| ``` | ||
| """ | ||
|
|
||
| model: Model |
|
|
||
|
|
||
| @dataclass | ||
| class QuestionValidity(AbstractCapability): |
There was a problem hiding this comment.
So I actually checked this from their codebase.
# TODO (v2): Change the default for all typevars like this from `None` to `object`
AgentDepsT = TypeVar('AgentDepsT', default=None, contravariant=True)
"""Type variable for agent dependencies."""It seems like None is the default value if we don't pass it. But, we can still pass it to be more explicit. I'm fine with either ways. Let me know what do you think! 😁
| message=_message, allowed=SUBJECT_ALLOWED, rejected=SUBJECT_REJECTED | ||
| ) | ||
|
|
||
| async def wrap_run( |
There was a problem hiding this comment.
So just one thing that I would like to bring up for consideration. The output from before_model_request if we bail out with SkipModelRequest(message) will still be processed by after_model_request and the Node lifecycle. Therefore, it has more chance to intertwine with other capabilities.
Follow the agent run lifecycle
run -> node_run -> model request -> ...
I feel like for question validity, we should have it as run-level since there is actually no need to further process the question downward if it's not valid.
It's just my thought. I'll follow whatever you guys think is the most suitable solution here. 😁
|
@Jazzcort when you're done with your changes please run |
1ba3678 to
1235c63
Compare
|
@Jazzcort also please place configuration models into |
|
Got it! I move it there and also fix the Pydocstring check. The Pyright error is not coming from this PR so it should be fine. |
1235c63 to
a25393f
Compare
Implement an LLM-based guardrail that classifies user questions as on-topic (Kubernetes/OpenShift or customized topic) before the main agent processes them. Off-topic questions are short-circuited with a rejection message, bypassing the primary agent entirely. Includes unit tests.
a25393f to
76f5117
Compare
|
@asimurka I think it should be good now. Let me know if there are any changes needed! 😁 |
Description
Implement an LLM-based guardrail that classifies user questions as on-topic (Kubernetes/OpenShift or customized topic) before the main agent processes them. Off-topic questions are short-circuited with a rejection message, bypassing the primary agent entirely. Includes unit tests.
More thorough tests (e2e) can be implemented when we wire up the pydantic AI agent to our endpoints and see if this question validity shield really shields the agent from off-topic questions.
Type of change
Tools used to create PR
The unit tests were drafted by Claude Code which was then reviewed and refined by me. 😁
Related Tickets & Documents
LCORE-1830
Checklist before requesting a review
Testing
Since we have not wired pydantic_ai to our endpoints, you can only do an ad-hoc test with it.
Summary by CodeRabbit
New Features
QuestionValiditycapability to guard agent runs by classifying incoming prompts as allowed or rejected.Documentation
Tests