Your current environment
Since v0.18.0 (#36483), OpenAIServingChat.__init__ requires a new keyword-only argument, openai_serving_render: OpenAIServingRender. This is a silent breaking change for anyone constructing OpenAIServingChat directly. We hit this issue in a KubeRay + Ray Serve + FastAPI deployment.
Code that worked on v0.17.0:
from vllm.engine.async_llm_engine import AsyncLLMEngine
from vllm.engine.arg_utils import AsyncEngineArgs
from vllm.entrypoints.openai.chat_completion.serving import OpenAIServingChat
from vllm.entrypoints.openai.models.serving import OpenAIServingModels
from vllm.entrypoints.openai.models.protocol import BaseModelPath
engine = AsyncLLMEngine.from_engine_args(AsyncEngineArgs(model="Qwen/Qwen3-4B-Instruct-2507"))
serving_models = OpenAIServingModels(
    engine_client=engine,
    base_model_paths=[BaseModelPath(name="my-model", model_path="Qwen/Qwen3-4B-Instruct-2507")],
    lora_modules=[],
)
OpenAIServingChat(
    engine,
    models=serving_models,
    response_role="assistant",
    request_logger=None,
    chat_template=None,
    chat_template_content_format="auto",
)
Error on v0.18.0+:
TypeError: OpenAIServingChat.__init__() missing 1 required keyword-only argument: 'openai_serving_render'
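A call site that has to run on both versions could inspect the constructor signature before building its keyword arguments. The sketch below uses only the standard-library inspect module. ChatV17 and ChatV18 are hypothetical stand-ins for the two constructor shapes, not vLLM classes, and the helper name required_kwonly is ours.

```python
import inspect


def required_kwonly(cls):
    """Return names of keyword-only __init__ parameters that have no default."""
    params = inspect.signature(cls.__init__).parameters.values()
    return {
        p.name
        for p in params
        if p.kind is inspect.Parameter.KEYWORD_ONLY
        and p.default is inspect.Parameter.empty
    }


# Hypothetical stand-ins for the two constructor shapes (not vLLM code).
class ChatV17:
    def __init__(self, engine, *, models, response_role):
        pass


class ChatV18:
    def __init__(self, engine, *, models, response_role, openai_serving_render):
        pass


# True only when the class insists on the new argument.
needs_render_v17 = "openai_serving_render" in required_kwonly(ChatV17)
needs_render_v18 = "openai_serving_render" in required_kwonly(ChatV18)
```

Applied to the real class, the same check would tell the caller whether it must construct and pass an OpenAIServingRender instance. How that instance is built is not covered here.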
Expected behavior
openai_serving_render should default to None with an internal auto-construction fallback so existing call sites keep working.
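The requested fallback could look roughly like the following. This is a minimal sketch of the pattern only: Render and Chat are hypothetical stand-ins, and the arguments the real OpenAIServingRender needs are an assumption left out here.

```python
from typing import Optional


class Render:
    """Hypothetical stand-in for the render helper (not the vLLM class)."""

    def __init__(self, models):
        self.models = models


class Chat:
    """Hypothetical stand-in for a serving class with a backward-compatible init."""

    def __init__(self, engine, *, models,
                 openai_serving_render: Optional[Render] = None):
        self.engine = engine
        self.models = models
        # Fallback: build the helper internally when the caller did not pass one.
        if openai_serving_render is None:
            openai_serving_render = Render(models)
        self.openai_serving_render = openai_serving_render


# Old-style call without the new argument keeps working.
old_style = Chat("engine", models=["my-model"])

# New-style call can still inject its own instance.
custom = Render(["my-model"])
new_style = Chat("engine", models=["my-model"], openai_serving_render=custom)
```

With this shape, existing call sites need no changes, while callers that want to share or customize the render object can still pass it explicitly.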
🐛 Describe the bug
N/A