init_chat_model always uses ChatOpenAI, ignores args, and still calls OpenAI / hits 10-minute timeout with Ollama #474

@ansh-info

Description

I’m using the ART LangGraph integration and the init_chat_model function with a local Ollama server as the model backend for my agent. The relevant code in my rollout function looks like this:

chat_model = init_chat_model(model.name, temperature=1.0)
react_agent = create_react_agent(chat_model, tools)

I also configured art.Model to point at my local Ollama server for inference (base URL, plus an API key if needed). However, looking at the implementation of init_chat_model:

def init_chat_model(...):
    config = CURRENT_CONFIG.get()
    return LoggingLLM(
        ChatOpenAI(
            base_url=config["base_url"],
            api_key=config["api_key"],
            model=config["model"],
            temperature=1.0,
        ),
        config["logger"],
    )

Several issues arise:

  1. Arguments are ignored

    • The model, model_provider, configurable_fields, config_prefix, and **kwargs parameters are not used at all.
    • This is surprising: I call init_chat_model(model.name, temperature=1.0), but both model.name and temperature are ignored (the temperature is hardcoded to 1.0 regardless of what I pass).
    • Everything is driven purely by the CURRENT_CONFIG contextvar.
  2. Hard binding to ChatOpenAI

    • init_chat_model always creates a ChatOpenAI instance, even if I’m not using OpenAI’s API.
    • When using a local Ollama server, it’s still bound to ChatOpenAI rather than accepting a generic LangChain chat model (e.g. ChatOllama or ChatNVIDIA).
  3. Unexpected OpenAI calls / tight OpenAI coupling

    • Even after pointing the inference base URL to my local Ollama, I still see attempts to call OpenAI endpoints somewhere in the pipeline (especially for judging / RULER).

    • When there’s a runtime error or long inference, the hardcoded 10-minute timeout in LoggingLLM.ainvoke gets hit:

      result = await asyncio.wait_for(
          self.llm.ainvoke(input, config=config), timeout=10 * 60
      )
    • This causes the agent to fail with a timeout, even though:

      • I’m using a local Ollama server.
      • I would like to configure a different timeout for long rollouts.
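
For illustration, here is a minimal sketch of what a configurable timeout could look like. TimeoutLLM and SlowEcho are hypothetical names I made up for this example and are not part of ART; the only assumption about the wrapped model is that it exposes an async ainvoke(input, config=...) method, as in the snippet above.

```python
import asyncio
from typing import Any, Optional


class TimeoutLLM:
    """Hypothetical wrapper: forwards ainvoke with a caller-chosen timeout."""

    def __init__(self, llm: Any, timeout: Optional[float] = 10 * 60) -> None:
        self.llm = llm
        # Seconds to wait before giving up; None means wait indefinitely.
        self.timeout = timeout

    async def ainvoke(self, input: Any, config: Any = None) -> Any:
        # asyncio.wait_for cancels the inner call and raises a timeout error
        # once self.timeout seconds have elapsed.
        return await asyncio.wait_for(
            self.llm.ainvoke(input, config=config), timeout=self.timeout
        )


class SlowEcho:
    """Stand-in model for the demo: sleeps, then returns its input."""

    def __init__(self, delay: float) -> None:
        self.delay = delay

    async def ainvoke(self, input: Any, config: Any = None) -> Any:
        await asyncio.sleep(self.delay)
        return input
```

With something like this, a long rollout could use TimeoutLLM(chat_model, timeout=3600) or timeout=None, while the default would keep the current 10-minute behavior.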

What I expect

  • init_chat_model should either:

    • Use its arguments (model, model_provider, etc.) and/or accept an explicit ChatModel instance; or
    • Have a clearer contract that it depends entirely on CURRENT_CONFIG and is OpenAI-only.
  • It should be possible to plug in the following without silently falling back to OpenAI-specific assumptions:

    • Local Ollama (ChatOllama or a LiteLLM-based wrapper)
    • Other OpenAI-compatible endpoints
  • The 10-minute timeout should either be:

    • Configurable; or
    • Documented with guidance on how to override it.
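
To make the first expectation concrete, here is a rough sketch of the contract I have in mind. build_chat_model and its chat_model, factory, and context_config parameters are hypothetical and not ART's API; in practice factory would be a chat model class such as ChatOpenAI or ChatOllama, and context_config would be whatever CURRENT_CONFIG.get() returns.

```python
from typing import Any, Callable, Mapping, Optional


def build_chat_model(
    model: Optional[str] = None,
    *,
    chat_model: Optional[Any] = None,
    factory: Optional[Callable[..., Any]] = None,
    context_config: Optional[Mapping[str, Any]] = None,
    **kwargs: Any,
) -> Any:
    """Hypothetical helper: honor an explicit instance, else build from args."""
    if chat_model is not None:
        # An explicit instance (e.g. a ChatOllama object) is returned as-is.
        return chat_model
    if factory is None:
        raise ValueError("pass either chat_model or factory")
    # Start from the context defaults (minus the logger, which is not a
    # model parameter), then let the caller's arguments override them.
    params = {k: v for k, v in (context_config or {}).items() if k != "logger"}
    if model is not None:
        params["model"] = model
    params.update(kwargs)
    return factory(**params)
```

The point of the design is precedence: an explicit instance wins, then call arguments, then the contextvar defaults, so init_chat_model(model.name, temperature=1.0) would no longer be silently ignored.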

What actually happens

  • Even after configuring a local Ollama backend and setting inference URLs, ART:

    • Uses ChatOpenAI internally.
    • Still attempts OpenAI-specific calls (for example during judging / RULER, as noted above).
    • Hits the hard-coded 10-minute timeout during agent inference / error cases.

Request

  • Please make init_chat_model provider-agnostic, or introduce a way to:

    • Pass in a custom ChatModel (e.g. ChatOllama, ChatNVIDIA).
    • Configure the timeout instead of hardcoding 10 * 60.
  • Alternatively, document the intended usage pattern if this function is meant strictly for OpenAI-style backends.

Labels: bug
