Description
I’m using the ART LangGraph integration and the init_chat_model function with a local Ollama server as the model backend for my agent. The relevant code in my rollout function looks like this:
    chat_model = init_chat_model(model.name, temperature=1.0)
    react_agent = create_react_agent(chat_model, tools)
I also set up the art.Model to point to my local Ollama server for inference (base URL + any key if needed). However, looking at the implementation of init_chat_model:
    def init_chat_model(...):
        config = CURRENT_CONFIG.get()
        return LoggingLLM(
            ChatOpenAI(
                base_url=config["base_url"],
                api_key=config["api_key"],
                model=config["model"],
                temperature=1.0,
            ),
            config["logger"],
        )
Several issues arise:
- Arguments are ignored
  - The model, model_provider, configurable_fields, config_prefix, and **kwargs parameters are not used at all.
  - This is surprising because I call init_chat_model(model.name, temperature=1.0), but both model.name and temperature are ignored.
  - Everything is driven purely by the CURRENT_CONFIG contextvar.
- Hard binding to ChatOpenAI
  - init_chat_model always creates a ChatOpenAI instance, even if I’m not using OpenAI’s API.
  - When using a local Ollama server, it’s still bound to ChatOpenAI rather than accepting a generic LangChain chat model (e.g. ChatOllama or ChatNVIDIA).
- Unexpected OpenAI calls / tight OpenAI coupling
  - Even after pointing the inference base URL to my local Ollama, I still see attempts to call OpenAI endpoints somewhere in the pipeline (especially for judging / RULER).
  - When there’s a runtime error or long inference, the hardcoded 10-minute timeout in LoggingLLM.ainvoke gets hit:

        result = await asyncio.wait_for(
            self.llm.ainvoke(input, config=config), timeout=10 * 60
        )

  - This causes the agent to fail with a timeout, even though:
    - I’m using a local Ollama server.
    - I would like to configure a different timeout for long rollouts.
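A minimal, standard-library-only sketch of the "arguments are ignored" behavior described above. FakeChatModel and init_chat_model_like are hypothetical stand-ins written for illustration; they are not ART or LangChain code, and the model name and URL are made-up values. The sketch only mirrors the shape of the snippet quoted in this issue.

```python
from contextvars import ContextVar
from dataclasses import dataclass


@dataclass
class FakeChatModel:
    """Hypothetical stand-in for the chat model class (not a real library type)."""
    base_url: str
    model: str
    temperature: float


# Stand-in for the CURRENT_CONFIG contextvar mentioned in the issue.
CURRENT_CONFIG = ContextVar("CURRENT_CONFIG")


def init_chat_model_like(model=None, *, model_provider=None, **kwargs):
    """Mimics the reported behavior: arguments are accepted but never read."""
    config = CURRENT_CONFIG.get()
    return FakeChatModel(
        base_url=config["base_url"],
        model=config["model"],
        temperature=1.0,  # hardcoded, as in the quoted implementation
    )


# Assumed example values; any base URL and model name would behave the same way.
CURRENT_CONFIG.set({"base_url": "http://localhost:11434/v1", "model": "from-context"})

chat_model = init_chat_model_like("my-ollama-model", temperature=0.2)
print(chat_model.model, chat_model.temperature)
```

Under these assumptions, the model name and temperature passed by the caller never reach the returned object; only the contextvar contents do.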
What I expect
- init_chat_model should either:
  - Use its arguments (model, model_provider, etc.) and/or accept an explicit ChatModel instance; or
  - Have a clearer contract that it depends entirely on CURRENT_CONFIG and is OpenAI-only.
- It should be possible to plug in the following without silently falling back to OpenAI-specific assumptions:
  - Local Ollama (ChatOllama or a LiteLLM-based wrapper)
  - Other OpenAI-compatible endpoints
- The 10-minute timeout should either be:
  - Configurable; or
  - Documented with guidance on how to override it.
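One possible shape for the first expectation, sketched with the standard library only. The names init_chat_model_sketch, LoggingWrapper, EchoModel, and default_factory are all hypothetical; this is not how ART implements it, only an illustration of "use the caller's model if one is supplied, otherwise fall back to the config-driven default".

```python
import asyncio


class EchoModel:
    """Hypothetical stand-in for a local chat model (imagine ChatOllama here)."""

    def __init__(self, name):
        self.name = name

    async def ainvoke(self, input, config=None):
        return f"{self.name} says: {input}"


class LoggingWrapper:
    """Hypothetical analogue of LoggingLLM: wraps any object with ainvoke()."""

    def __init__(self, llm, log):
        self.llm = llm
        self.log = log  # a plain list plays the role of the logger in this sketch

    async def ainvoke(self, input, config=None):
        result = await self.llm.ainvoke(input, config=config)
        self.log.append((input, result))
        return result


def init_chat_model_sketch(chat_model=None, *, default_factory, log):
    """Use the caller's model when given; otherwise build the default one."""
    llm = chat_model if chat_model is not None else default_factory()
    return LoggingWrapper(llm, log)


log = []
default = lambda: EchoModel("default-openai-style")

explicit = init_chat_model_sketch(EchoModel("local-ollama"), default_factory=default, log=log)
fallback = init_chat_model_sketch(default_factory=default, log=log)

explicit_reply = asyncio.run(explicit.ainvoke("hello"))
fallback_reply = asyncio.run(fallback.ainvoke("hello"))
print(explicit_reply)
print(fallback_reply)
```

The wrapper only relies on an ainvoke method, so in this sketch the logging layer does not need to know which provider sits underneath.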
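What actually happens
Even after configuring a local Ollama backend and setting inference URLs, ART still uses ChatOpenAI internally.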
Request
- Please make init_chat_model provider-agnostic, or introduce a way to:
  - Pass in a custom ChatModel (e.g. ChatOllama, ChatNVIDIA).
  - Configure the timeout instead of hardcoding 10 * 60.
- Alternatively, document the intended usage pattern if this function is meant strictly for OpenAI-style backends.
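For the timeout part of the request, a rough sketch of what "configurable" could look like. TimeoutWrapper, SlowModel, and the timeout_s parameter are hypothetical names, not ART's API; the wrapper just reuses the asyncio.wait_for pattern quoted earlier and keeps 10 * 60 as the default.

```python
import asyncio


class SlowModel:
    """Hypothetical model that takes 0.2 seconds to answer."""

    async def ainvoke(self, input, config=None):
        await asyncio.sleep(0.2)
        return "done"


class TimeoutWrapper:
    """Hypothetical wrapper with a per-instance timeout instead of a constant."""

    def __init__(self, llm, timeout_s=10 * 60):
        self.llm = llm
        self.timeout_s = timeout_s  # None disables the limit (wait_for accepts None)

    async def ainvoke(self, input, config=None):
        return await asyncio.wait_for(
            self.llm.ainvoke(input, config=config), timeout=self.timeout_s
        )


async def demo():
    results = {}
    try:
        # A timeout shorter than the model's latency reproduces the failure.
        await TimeoutWrapper(SlowModel(), timeout_s=0.05).ainvoke("hi")
        results["short"] = "ok"
    except asyncio.TimeoutError:
        results["short"] = "timeout"
    # A longer, caller-chosen timeout lets the same call finish.
    results["long"] = await TimeoutWrapper(SlowModel(), timeout_s=5).ainvoke("hi")
    return results


results = asyncio.run(demo())
print(results)
```

Keeping the current value as the default would leave existing behavior unchanged while letting long rollouts opt into a larger limit.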