Skip to content

llm_anthropic: support Anthropic prompt caching #785

@joshuadarron

Description

@joshuadarron

Problem

The llm_anthropic node instantiates ChatAnthropic with no cache configuration:

# nodes/llm_anthropic/anthropic.py:110
self._llm = ChatAnthropic(
    model=model, api_key=apikey, temperature=0, max_tokens=self._modelOutputTokens
)

.rocketride/schema/llm_anthropic.json exposes model / modelTotalTokens / apikey only. There is no way for a pipeline author to opt into Anthropic prompt caching.

In a 28-minute coding-agent run, every claude-opus-4-6 call re-shipped 2–5 KB of stable system prompt plus an accumulating message history. With ~180 LLM calls across the run and large stable prefixes, cache_control: {"type": "ephemeral"} would cut input-token latency on cached blocks by ~85% and cost by ~90% per Anthropic's published numbers. Estimated savings on that run alone: ~470 s (~30%).

Proposed fix

  1. Add an optional caching boolean (or finer-grained struct: { system: bool, history: bool }) to services-catalog.json profiles for llm_anthropic.
  2. When enabled, attach cache_control: {"type": "ephemeral"} to system-prompt content blocks (and optionally to the most recent stable message tail) before passing to ChatAnthropic. Modern langchain-anthropic accepts this either via message content blocks or model_kwargs.
  3. Surface usage.cache_creation_input_tokens / cache_read_input_tokens from responses into the flow trace (see companion issue on token-usage emission) so users can verify cache hits.

Acceptance

  • A pipeline with caching: true on an llm_anthropic profile shows non-zero cache_read_input_tokens on the second and later calls in the same session.
  • Default behavior (no caching field) is unchanged — backwards compatible.
  • Schema + docs updated.

Suggested labels

enhancement, performance, cost, nodes/llm_anthropic

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions