Skip to content

In Jan app LLM model performance. #8165

@shrisha108

Description

@shrisha108

Hello,
I'm running llama.cpp on a linux headless server in my local network. First of all I love Jan app, very nice and neat, bunch of very useful settings, MCP servers, creating assistants for different use cases, etc, so using it as daily driver. And yesterday I updated llama.cpp to the latest and out of curiosity tried the build in llama.cpp chat. And to my surprise it is actually giving me almost twice t\s of what I usually getting in a Jan. Same question was asked to same LLM model in same warm-up situation. I use no assistant in Jan to match the llama.cpp native chat. So llama.cpp chat gives 52 t\s and Jan 28 t\s. Tried few times. Could it be because in Jan I'm using it as OpenAI compatible provider? Would be great to match the native chat :)
Anyway please let me know if I just have to adjust some settings I don't know about or it is expected from Jan app according to App architecture.
Thank you.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    Status

    No status

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions