feat(zai): expose configurable max output tokens for GLM models (#161) by proyectoauraorg · Pull Request #274 · Zoo-Code-Org/Zoo-Code

proyectoauraorg · 2026-05-24T00:51:36Z

Summary

Z.ai GLM models (glm-5.1, glm-5-turbo) default to a 20% output clamp. The runtime already honors an explicit modelMaxTokens (createStreamWithThinking: max_tokens = options.modelMaxTokens || getModelMaxOutputTokens(...)), but the settings UI never surfaced a control — those models only exposed reasoning effort. This closes that gap.

Change

New optional capability flag supportsMaxTokens on modelInfoSchema, set on the configurable GLM models in both internationalZAiModels and mainlandZAiModels.
Reuses the existing ThinkingBudget settings component to render a max-output-tokens slider, gated on supportsMaxTokens && !supportsReasoningBudget. It defaults to the runtime's existing clamp when unset (behavior unchanged until the user edits), and persists an explicit value as modelMaxTokens.

Tests

Backend (zai.spec.ts): the flag is present on the configurable GLM models (absent on glm-4.7), and an explicit modelMaxTokens override is sent as max_tokens instead of the clamp.
Webview (ThinkingBudget.spec.tsx): slider renders alongside reasoning effort, defaults to the clamp when unset, reflects an override, does NOT persist on initial render (init vs user-edit), persists on user change, and does not render when the flag is absent.

Closes #161

Summary by CodeRabbit

New Features
- Added a model capability flag for configurable max-output-tokens; supported models show a dedicated max-output-tokens slider that defaults to a runtime-clamped value and honors explicit modelMaxTokens overrides (capped by model limits).
Chores
- Export logic refined to preserve or strip modelMaxTokens and thinking-token fields based on model capabilities.
Tests
- Added tests for capability advertising, slider rendering/behavior and persistence, max-token override/capping, and export rules.

coderabbitai · 2026-05-24T00:51:52Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: c73eb9fb-0150-4da5-af4b-be7379980e87

📥 Commits

Reviewing files that changed from the base of the PR and between 710b01b and 8556b61.

📒 Files selected for processing (9)

packages/types/src/model.ts
packages/types/src/providers/zai.ts
src/api/providers/__tests__/zai.spec.ts
src/core/config/ProviderSettingsManager.ts
src/core/config/__tests__/ProviderSettingsManager.spec.ts
src/shared/__tests__/api.spec.ts
src/shared/api.ts
webview-ui/src/components/settings/ThinkingBudget.tsx
webview-ui/src/components/settings/__tests__/ThinkingBudget.spec.tsx

🚧 Files skipped from review as they are similar to previous changes (5)

src/api/providers/tests/zai.spec.ts
src/core/config/ProviderSettingsManager.ts
packages/types/src/model.ts
packages/types/src/providers/zai.ts
webview-ui/src/components/settings/tests/ThinkingBudget.spec.tsx

📝 Walkthrough

Walkthrough

Adds an optional supportsMaxTokens model capability, marks Z.AI GLM variants as supporting it, surfaces a conditional max-output-tokens slider with runtime defaults and persistence, and adds backend and UI tests validating capability advertising and explicit override behavior.

Changes

Max Output Tokens Capability for Z.A.I GLM

Layer / File(s)	Summary
Type contract and provider metadata `packages/types/src/model.ts`, `packages/types/src/providers/zai.ts`	`modelInfoSchema` adds optional `supportsMaxTokens`; GLM-5.1 and GLM-5-turbo (international and mainland) are marked `supportsMaxTokens: true`.
Backend token override behavior `src/api/providers/__tests__/zai.spec.ts`	Tests verify `supportsMaxTokens` is advertised for configurable GLMs and that explicit `modelMaxTokens` passed into `ZAiHandler` becomes `max_tokens` in the provider `create` call.
Provider settings export filtering `src/core/config/ProviderSettingsManager.ts`, `src/core/config/__tests__/ProviderSettingsManager.spec.ts`	Splits export gating so `modelMaxThinkingTokens` is removed only when reasoning budgets are unsupported, while `modelMaxTokens` is removed only when the model supports neither reasoning budgets nor configurable max tokens; tests added for both cases.
Runtime max-output computation `src/shared/api.ts`, `src/shared/__tests__/api.spec.ts`	`getModelMaxOutputTokens` now honors `settings.modelMaxTokens` for models with `supportsMaxTokens`, capping overrides to the model ceiling; tests added for override and ceiling behavior.
ThinkingBudget UI integration `webview-ui/src/components/settings/ThinkingBudget.tsx`	Imports `getModelMaxOutputTokens`, conditionally renders a standalone `max-output-tokens` slider for `supportsMaxTokens` models, introduces `renderMaxTokensSlider`, and reworks reasoning-effort rendering to include the control.
UI test coverage for max-tokens behavior `webview-ui/src/components/settings/__tests__/ThinkingBudget.spec.tsx`	Adds tests for slider rendering with `supportsMaxTokens`, default/clamped initialization, explicit override behavior, non-persistence on initial render, persistence on user interaction, and conditional hiding when flag is absent.

Sequence Diagram

sequenceDiagram
  participant ThinkingBudget as ThinkingBudget (Webview UI)
  participant SettingsStore as SettingsStore (setApiConfigurationField)
  participant ProviderSettings as ProviderSettingsManager
  participant ZAiHandler as ZAiHandler (Server)
  participant ZAiAPI as Z.ai API (create)

  ThinkingBudget->>SettingsStore: persist modelMaxTokens change
  SettingsStore->>ProviderSettings: saved apiConfiguration/profile
  ProviderSettings->>ZAiHandler: runtime loads apiConfiguration
  ZAiHandler->>ZAiAPI: create(request with max_tokens = modelMaxTokens)
  ZAiAPI-->>ZAiHandler: response
  ZAiHandler-->>ThinkingBudget: result/state

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested reviewers

edelauna
taltas
hannesrudolph
JamesRobert20
navedmerchant

Poem

A rabbit twists a tiny knob, 🐇
Numbers tumble, tokens bob,
Clamp gives way to a gentle slide,
User sets how far output strides,
Hop, persist, and watch it glide.

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately describes the main change: adding a configurable max output tokens feature for Z.ai GLM models.
Description check	✅ Passed	The PR description comprehensively covers the implementation details, testing approach, and rationale for the changes.
Linked Issues check	✅ Passed	All coding requirements from issue `#161` are met: supportsMaxTokens flag added, UI slider rendered for applicable models, modelMaxTokens persisted and honored, default clamp preserved when unset.
Out of Scope Changes check	✅ Passed	All changes directly support the stated objective of exposing configurable max output tokens for Z.ai GLM models; no unrelated modifications detected.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

packages/types/src/model.ts

ESLint skipped: missing config or dependency (missing-dependency). The ESLint configuration references a package that is not available in the sandbox.

packages/types/src/providers/zai.ts

ESLint skipped: the ESLint configuration for this file references a package that is not available in the sandbox.

src/api/providers/__tests__/zai.spec.ts

ESLint skipped: missing config or dependency (missing-dependency). The ESLint configuration references a package that is not available in the sandbox.

6 others

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@webview-ui/src/components/settings/__tests__/ThinkingBudget.spec.tsx`:
- Around line 320-326: The test fixture glmApiConfiguration is widening
apiProvider to string causing a TS2322 error; fix by typing the fixture with the
ProviderSettings literal type (e.g., change the declaration of
glmApiConfiguration to use "satisfies ProviderSettings" — optionally combined
with "as const" to preserve literal types) so the apiProvider remains the
literal "zai" when passed into ThinkingBudget with defaultProps.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 995ad486-eba2-459e-9c28-6802c53ac7db

📥 Commits

Reviewing files that changed from the base of the PR and between b5c5e21 and 9bd5fbd.

📒 Files selected for processing (5)

packages/types/src/model.ts
packages/types/src/providers/zai.ts
src/api/providers/__tests__/zai.spec.ts
webview-ui/src/components/settings/ThinkingBudget.tsx
webview-ui/src/components/settings/__tests__/ThinkingBudget.spec.tsx

codecov · 2026-05-24T01:24:55Z

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

edelauna · 2026-05-24T04:23:34Z

Profile export strips modelMaxTokens for GLM models (src/core/config/ProviderSettingsManager.ts:572)

The export cleanup at line 572 deletes modelMaxTokens when supportsReasoningBudget is falsy. GLM models only advertise supportsMaxTokens + supportsReasoningEffort — neither reasoning-budget flag — so any custom max-tokens value the user sets via the new slider will be silently stripped on profile export. Re-importing the profile then resets the slider to the default clamp.

Suggested fix:

const keepMaxTokens = supportsReasoningBudget || modelInfo.supportsMaxTokens
if (!supportsReasoningBudget) {
    delete configs[name].modelMaxThinkingTokens
}
if (!keepMaxTokens) {
    delete configs[name].modelMaxTokens
}

(This file isn't in the PR diff so can't be posted as an inline comment.)

edelauna · 2026-05-24T04:23:21Z

+		) : null
+
 	// Models with supportsReasoningBinary (binary reasoning) show a simple on/off toggle
 	if (isReasoningSupported) {


If a model ever has both supportsReasoningBinary: true and supportsMaxTokens: true, the standalone slider won't render — this block returns before maxOutputTokensControl is reached. Worth including {maxOutputTokensControl} here alongside the checkbox, even if no current model hits this?

Great catch — you're right that supportsReasoningBinary models (like deepseek-v3.2) would have their modelMaxThinkingTokens incorrectly stripped under the current logic.\n\nI've updated the guard to include supportsReasoningBinary in the reasoning-capable check:\n\nts\nconst supportsReasoning =\n modelInfo.supportsReasoningBudget ||\n modelInfo.requiredReasoningBudget ||\n modelInfo.supportsReasoningBinary\n\nif (supportsReasoning) {\n // keep modelMaxThinkingTokens — budget, required, or binary\n} else {\n delete configs[name].modelMaxThinkingTokens\n}\n\n\nThis ensures binary-reasoning models retain the field while non-reasoning models still get it cleaned up. Thanks for flagging this!

proyectoauraorg · 2026-05-24T09:33:13Z

While reviewing the round-trip behavior for this feature, I found that ProviderSettingsManager.export() strips modelMaxTokens for any model without a reasoning budget — which would silently wipe the GLM max-output value this PR introduces on export/import. I've split the two deletions so modelMaxTokens is preserved whenever the model sets supportsMaxTokens (as GLM does), while modelMaxThinkingTokens still requires a reasoning budget. Added tests covering both branches (GLM preserves, a plain model strips both).

proyectoauraorg · 2026-05-26T08:37:55Z

@edelauna Thanks for the detailed code review identifying the modelMaxTokens bug!

I've created a fix PR that addresses the scoping issue: PR #332

The fix extracts maxOutputTokensControl as a named JSX variable declared before both the binary reasoning and budget reasoning conditional branches, making the max output tokens slider available for all reasoning models (including supportsReasoningBinary models like Zhipu GLM).

Could you please review PR #332 when you get a chance? 🙏

proyectoauraorg · 2026-05-28T22:20:06Z

Fixed the Windows-only platform-unit-test failure (ProviderSettingsManager > Export > should preserve modelMaxTokens … (e.g. GLM)). It passed on ubuntu/local but failed on Windows.

Root cause: Windows CI runs vitest with poolOptions.forks.singleFork (all spec files share one process). ProviderSettingsManager.spec used the real buildApiHandler, so a sibling spec's module-level vi.mock("../../../api") (e.g. importExport.spec) could bleed in under reduced isolation — making modelInfo.supportsMaxTokens read as undefined, so export() dropped modelMaxTokens and the assertion failed. I reproduced the exact contamination locally by forcing --pool=forks --poolOptions.forks.singleFork --no-isolate.

Fix: mock buildApiHandler directly in ProviderSettingsManager.spec, deriving model capabilities from the real @roo-code/types definitions via vi.importActual. The Export tests now exercise the token-field filtering deterministically and are immune to cross-file api mocks, while still validating real capabilities (glm-5.1 supportsMaxTokens, claude-3-5-haiku neither). Locally: tsc + eslint clean, core/config 231/231, and the suite passes under the exact Windows singleFork config.

coderabbitai · 2026-05-28T22:24:03Z

Actionable comments posted: 0

edelauna · 2026-05-29T02:24:35Z

+						max={modelInfo.maxTokens}
+						step={1024}
+						value={[standaloneMaxOutputTokens]}
+						onValueChange={([value]) => setApiConfigurationField("modelMaxTokens", value)}


When GLM persists modelMaxTokens here, should getModelMaxOutputTokens also honor supportsMaxTokens? Z.ai sends the override as max_tokens, but task context checks call the shared helper and still clamp non-reasoning-budget models.

edelauna · 2026-05-29T02:24:35Z

+		}
+	}
+	return {
+		buildApiHandler: (config: any) => ({


Could this test use the real buildApiHandler and mock only external clients instead? Since export() depends on buildApiHandler(config).getModel().info, this fake bypasses the contract the test is trying to cover.

edelauna · 2026-05-29T02:24:35Z


+	// Standalone max output tokens slider for models that advertise `supportsMaxTokens`
+	// (e.g. Z.ai GLM) but do not surface the reasoning-budget control.
+	const maxOutputTokensControl =


Is it worth extracting the max-tokens slider markup into a small local render helper? This repeats the same label, Slider, value display, and modelMaxTokens wiring used in the reasoning-budget branch below.

edelauna · 2026-05-29T02:24:35Z


 import { ProviderSettingsManager, ProviderProfiles, SyncCloudProfilesResult } from "../ProviderSettingsManager"

+// `export()` builds an API handler per profile to read model capabilities. Mock


Could this comment be trimmed to the durable reason for the mock? The PR/debug-history details make the setup harder to scan than a short note about deterministic model capabilities.

…Code-Org#161) Add a standalone "max output tokens" slider for models that advertise `supportsMaxTokens` (e.g. Z.ai GLM) but do not surface a reasoning budget, and send the chosen value to the provider as `max_tokens`. - ThinkingBudget: extract the max-tokens slider into a small render helper shared by the standalone control and the reasoning-budget branch, and surface it for binary-reasoning models that also support a configurable max. - getModelMaxOutputTokens: honor the user's `modelMaxTokens` override for `supportsMaxTokens` models (capped at the model ceiling) instead of the 20% context-window clamp, so the runtime budget matches what is sent to the provider. - ProviderSettingsManager.export(): preserve `modelMaxTokens` for configurable-max models while dropping it for models that support neither reasoning budgets nor a configurable max.

proyectoauraorg · 2026-06-01T00:19:40Z

Thanks! Addressed the review:

Extracted a renderMaxTokensSlider helper in ThinkingBudget shared by the standalone control and the reasoning-budget branch, and surfaced {maxOutputTokensControl} in the binary-reasoning branch so a model with both supportsReasoningBinary and supportsMaxTokens still gets the slider.
getModelMaxOutputTokens now honors supportsMaxTokens: it returns the user's modelMaxTokens override (capped at the model ceiling) instead of the 20% context-window clamp, so the runtime budget matches what Z.ai receives as max_tokens. Added unit tests.
Trimmed the ProviderSettingsManager mock comment to the durable reason.
On the real buildApiHandler (vs the mock): I kept the explicit mock — it already drives the filtering off the real @roo-code/types model definitions, keeps this suite isolated from sibling specs that mock ../../../api (the singleFork cross-file leak), and the real handler → getModel().info contract for GLM is now covered directly in zai.spec.ts. Switching the Export tests to the real handler would also build a handler for the retired groq profile case. Happy to switch if you'd prefer — let me know.

Squashed to one commit. tsc, eslint, and the affected suites (api / ProviderSettingsManager / zai / ThinkingBudget — 142 tests) pass locally.

proyectoauraorg requested review from JamesRobert20, edelauna, hannesrudolph, navedmerchant and taltas as code owners May 24, 2026 00:51

coderabbitai Bot reviewed May 24, 2026

View reviewed changes

Comment thread webview-ui/src/components/settings/__tests__/ThinkingBudget.spec.tsx Outdated

edelauna reviewed May 24, 2026

View reviewed changes

proyectoauraorg force-pushed the feat/161-zai-glm-max-output branch from a99ea3c to 6cdbb6b Compare May 26, 2026 05:59

proyectoauraorg mentioned this pull request May 26, 2026

fix(ThinkingBudget): extract maxOutputTokensControl to resolve undefined reference in binary reasoning path #332

Open

edelauna reviewed May 29, 2026

View reviewed changes

drvaquera force-pushed the feat/161-zai-glm-max-output branch from 710b01b to 8556b61 Compare June 1, 2026 00:19


		import { ProviderSettingsManager, ProviderProfiles, SyncCloudProfilesResult } from "../ProviderSettingsManager"

		// `export()` builds an API handler per profile to read model capabilities. Mock

Conversation

proyectoauraorg commented May 24, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Change

Tests

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Sequence Diagram

Estimated code review effort

Suggested reviewers

Poem

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

codecov Bot commented May 24, 2026

Codecov Report

Uh oh!

edelauna commented May 24, 2026

Uh oh!

edelauna May 24, 2026

Choose a reason for hiding this comment

Uh oh!

proyectoauraorg May 26, 2026

Choose a reason for hiding this comment

Uh oh!

proyectoauraorg commented May 24, 2026

Uh oh!

proyectoauraorg commented May 26, 2026

Uh oh!

proyectoauraorg commented May 28, 2026

Uh oh!

coderabbitai Bot commented May 28, 2026

Uh oh!

edelauna May 29, 2026

Choose a reason for hiding this comment

Uh oh!

edelauna May 29, 2026

Choose a reason for hiding this comment

Uh oh!

edelauna May 29, 2026

Choose a reason for hiding this comment

Uh oh!

edelauna May 29, 2026

Choose a reason for hiding this comment

Uh oh!

proyectoauraorg commented Jun 1, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

proyectoauraorg commented May 24, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 24, 2026 •

edited

Loading