Summary
LocalEvalService._evaluate_single_inference_result() crashes with TypeError: object of type 'NoneType' has no len() when inference fails and InferenceResult.inferences is left as None.
The failing path is visible in google-adk==1.34.0:
    if eval_case.conversation_scenario is None and len(
        inference_result.inferences
    ) != len(eval_case.conversation):
      ...
InferenceResult.inferences is optional and _perform_inference_single_eval_item() can produce a failure result with status=InferenceStatus.FAILURE, error_message=..., and inferences=None (for example when auth refresh, network, or model calls fail). The later len(inference_result.inferences) aborts the entire eval run instead of returning a structured per-case failure.
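The failure mode can be reproduced without ADK installed. The sketch below uses a hypothetical stand-in dataclass (`StubInferenceResult` is not an ADK class) whose `inferences` field defaults to `None`, and applies the same unguarded `len(...)` comparison as the excerpt above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StubInferenceResult:
    """Hypothetical stand-in for a failed inference result; not the ADK class."""

    inferences: Optional[list] = None
    error_message: Optional[str] = None


def lengths_differ(result: StubInferenceResult, conversation: list) -> bool:
    # Same shape as the comparison in the excerpt: len() is called
    # on `inferences` without checking for None first.
    return len(result.inferences) != len(conversation)


def reproduce() -> str:
    """Return the error text raised for a failure-shaped result."""
    failed = StubInferenceResult(error_message="auth refresh failed")
    try:
        lengths_differ(failed, ["turn-1"])
    except TypeError as exc:
        return str(exc)
    return "no error"


if __name__ == "__main__":
    print(reproduce())
```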
Reproduction
Run any eval case where inference fails before invocations are generated. One local repro from an ADK project using Vertex/ADC with expired ADC credentials:
cd idc-runtime
GOOGLE_GENAI_USE_VERTEXAI=True \
GOOGLE_CLOUD_PROJECT=idc-runtime-prod \
GOOGLE_CLOUD_LOCATION=global \
uv run agents-cli eval run --evalset tests/eval/evalsets/build_role.evalset.json
The underlying inference error is:
google.auth.exceptions.RefreshError: Reauthentication is needed. Please run `gcloud auth application-default login` to reauthenticate.
Then ADK crashes in the eval framework:
  File ".../google/adk/evaluation/local_eval_service.py", line 273, in _evaluate_single_inference_result
    if eval_case.conversation_scenario is None and len(
                                                   ^^^^
TypeError: object of type 'NoneType' has no len()
The same issue can surface during multi-case eval runs; several tasks can fail inference and then all hit len(None) while asyncio.as_completed(...) is collecting results.
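Plain asyncio is enough to show why one bad case takes down the whole run. In this sketch, `evaluate_case` and `collect_unguarded` are hypothetical helpers, not ADK functions. Awaiting the awaitables yielded by `asyncio.as_completed` without a `try`/`except` re-raises the failing task's exception and abandons the collection loop, so results from the other cases are lost.

```python
import asyncio
from typing import Optional


async def evaluate_case(case_id: str, inferences: Optional[list]) -> tuple:
    # Hypothetical per-case evaluation that calls len() without a None check.
    await asyncio.sleep(0)
    return case_id, len(inferences)


async def collect_unguarded(cases: dict) -> list:
    tasks = [
        asyncio.create_task(evaluate_case(case_id, inferences))
        for case_id, inferences in cases.items()
    ]
    results = []
    for next_done in asyncio.as_completed(tasks):
        # Awaiting re-raises whatever the underlying task raised,
        # so a single TypeError ends the loop early.
        results.append(await next_done)
    return results


def run_demo() -> str:
    """Run three cases, one of which has inferences=None."""
    cases = {"ok-1": ["a"], "failed": None, "ok-2": ["b", "c"]}
    try:
        asyncio.run(collect_unguarded(cases))
    except TypeError as exc:
        return f"run aborted: {exc}"
    return "run completed"


if __name__ == "__main__":
    print(run_demo())
```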
Expected behavior
If inference failed and inference_result.inferences is None, ADK should not call len(None). It should surface a structured failed EvalCaseResult using the existing inference_result.error_message, then allow the rest of the eval run to continue.
Local workaround
We are carrying a narrow in-process guard that checks inference_result.inferences is None before delegating to the original method and returns an EvalCaseResult(final_eval_status=EvalStatus.FAILED, ...) for that case. We also serialize CI eval cases while this failure path is unresolved.
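The guard follows a simple wrap-and-delegate pattern. The sketch below shows only that pattern, under stated assumptions: `StubInferenceResult`, `StubCaseResult`, and `guard_missing_inferences` are hypothetical names, the stub types stand in for ADK's `InferenceResult` and `EvalCaseResult`, and the wrapped method is assumed to be a coroutine.

```python
import asyncio
import functools
from dataclasses import dataclass
from typing import Awaitable, Callable, Optional


@dataclass
class StubInferenceResult:
    """Hypothetical stand-in for ADK's InferenceResult."""

    eval_case_id: str
    inferences: Optional[list] = None
    error_message: Optional[str] = None


@dataclass
class StubCaseResult:
    """Hypothetical stand-in for ADK's EvalCaseResult."""

    eval_case_id: str
    status: str
    detail: Optional[str] = None


Evaluator = Callable[[StubInferenceResult], Awaitable[StubCaseResult]]


def guard_missing_inferences(original: Evaluator) -> Evaluator:
    """Return a structured failure instead of delegating when inferences is None."""

    @functools.wraps(original)
    async def wrapper(result: StubInferenceResult) -> StubCaseResult:
        if result.inferences is None:
            # Reuse the error message already recorded by the inference step.
            return StubCaseResult(result.eval_case_id, "FAILED", result.error_message)
        return await original(result)

    return wrapper


@guard_missing_inferences
async def evaluate(result: StubInferenceResult) -> StubCaseResult:
    # Placeholder for the original method; it may safely call len() here.
    return StubCaseResult(
        result.eval_case_id, "EVALUATED", f"{len(result.inferences)} inference(s)"
    )


if __name__ == "__main__":
    failed = StubInferenceResult("case-1", error_message="Reauthentication is needed.")
    print(asyncio.run(evaluate(failed)))
```

In the actual workaround the wrapper is applied around `LocalEvalService._evaluate_single_inference_result` and returns `EvalCaseResult(final_eval_status=EvalStatus.FAILED, ...)`; the remaining constructor arguments are left elided here, as in the description above.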
Environment
- google-adk==1.34.0
- Python 3.12
- macOS local repro; also observed from GitHub Actions runners when live Vertex evals reach an inference failure