Allow autoevals to support both zod 3 and zod 4 by cpinn · Pull Request #155 · braintrustdata/autoevals

Caitlin Pinn (cpinn) · 2025-12-23T23:37:08Z

Changes

Allow autoevals to install either zod 3 or zod 4.

The typescript sdk was updated to allow zod to be a peer dependency in order to work with either zod 3 or zod 4.

This PR makes a similar update to the autoevals package and runs a matrix with zod 3 and zod 4 over the existing tests.

Our internal uses of autoevals do not allow for a direct upgrade to zod v4 at this time, but the peer dependency should unblock users who want to use autoevals with zod 4.

github-actions · 2025-12-23T23:40:44Z

Braintrust eval report

Autoevals (caitlin/update-zod4-1780606355)

Score	Average	Improvements	Regressions
NumericDiff	78.1% (-1pp)	8 🟢	10 🔴
Time_to_first_token	9.97tok (+1.39tok)	110 🟢	109 🔴
Llm_calls	1.55 (+0)	-	-
Tool_calls	0 (+0)	-	-
Errors	0 (+0)	-	-
Llm_errors	0 (+0)	-	-
Tool_errors	0 (+0)	-	-
Prompt_tokens	528.42tok (-2.58tok)	1 🟢	-
Prompt_cached_tokens	0tok (+0tok)	-	-
Prompt_cache_creation_tokens	0tok (+0tok)	-	-
Prompt_cache_creation_5m_tokens	0tok (+0tok)	-	-
Prompt_cache_creation_1h_tokens	0tok (+0tok)	-	-
Completion_tokens	472.62tok (-3.23tok)	114 🟢	89 🔴
Completion_reasoning_tokens	360.15tok (-4.98tok)	92 🟢	73 🔴
Completion_accepted_prediction_tokens	0tok (+0tok)	-	-
Completion_rejected_prediction_tokens	0tok (+0tok)	-	-
Completion_audio_tokens	0tok (+0tok)	-	-
Total_tokens	1001.04tok (-5.81tok)	115 🟢	88 🔴
Estimated_cost	0$ (0$)	102 🟢	78 🔴
Duration	9.97s (+1.39s)	110 🟢	109 🔴
Llm_duration	10.68s (+1.41s)	110 🟢	109 🔴

Use native toJSONSchema() method from Zod v4 instead of relying on zod-to-json-schema library which is not compatible with Zod v4. Fixes "Invalid schema for function" errors where schemas had 'type: "None"' instead of 'type: "object"'.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

- Replace direct zodToJsonSchema call with schemaToJson helper for classify_statements function to properly use Zod v4's native toJSONSchema() method
- Format JSON dataset files and pnpm-lock.yaml with prettier

This completes the Zod v4 compatibility fixes for OpenAI function calling schemas.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
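
For readers following the commit messages above, here is a minimal sketch of what a `schemaToJson`-style helper could look like. It is not the code from this PR: the `Converters` interface, `isZod4Schema`, and the injected functions are hypothetical names chosen for illustration. The sketch assumes that Zod 4 schemas expose a `_zod` property (Zod 3 schemas do not), that the `v4` converter stands in for Zod 4's `z.toJSONSchema`, and that the `v3` converter stands in for `zodToJsonSchema` from zod-to-json-schema.

```typescript
// Hypothetical sketch; names and shapes are assumptions, not the autoevals source.
type JsonSchema = Record<string, unknown>;

// Converters are injected so the sketch does not depend on a specific zod install.
interface Converters {
  // Stand-in for Zod 4's native z.toJSONSchema(schema).
  v4: (schema: object) => JsonSchema;
  // Stand-in for zodToJsonSchema(schema) from zod-to-json-schema (Zod 3).
  v3: (schema: object) => JsonSchema;
}

// Zod 4 schemas carry a `_zod` property; Zod 3 schemas do not.
function isZod4Schema(schema: object): boolean {
  return "_zod" in schema;
}

function schemaToJson(schema: object, convert: Converters): JsonSchema {
  const json = isZod4Schema(schema) ? convert.v4(schema) : convert.v3(schema);
  // Function-calling parameters should be an object schema, so fail early
  // instead of sending something like type "None" to the API.
  if (json.type !== "object") {
    throw new Error(
      `Expected an object schema, got type ${JSON.stringify(json.type)}`,
    );
  }
  return json;
}
```

In real use the two converters would be wired to whichever zod major version is installed. Checking `type` before returning is one way to surface the "Invalid schema for function" class of error locally rather than at request time.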

Caitlin Pinn (cpinn) · 2025-12-29T16:38:52Z

    runs-on: ubuntu-latest
    strategy:
      matrix:
-        python-version: ["3.8", "3.9", "3.10", "3.11", "3.12"]


There is an issue with the SDK's tuple behavior in Python 3.8, and 3.8 has been end of life since 2024.

Upgrade to zod 4.2.1 in preparation of zod 4+ migration. Export from zod/v3 until everything is ready in the braintrust backend

Added Zod as a peer dependency accepting both v3 and v4 (^3.0.0 || ^4.0.0). This ensures consumers have a compatible Zod version installed while allowing flexibility for projects using either Zod 3 or 4. Zod remains in dependencies for build/test purposes, but declaring it as a peer dependency prevents version conflicts when autoevals is used in projects with their own Zod version.
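
To make the peer range concrete, the sketch below checks an installed zod version against `^3.0.0 || ^4.0.0`. It is a simplified illustration only: the helper names are hypothetical, prerelease tags and `0.x` caret semantics are not handled, and real package managers apply full semver rules.

```typescript
// Hypothetical sketch of how the peer range from the commit message is evaluated.
const peerDependencies = { zod: "^3.0.0 || ^4.0.0" };

// Parse "major.minor.patch" into numbers; missing parts default to 0.
function parseVersion(v: string): [number, number, number] {
  const parts = v.split(".").map((p) => Number.parseInt(p, 10));
  return [parts[0] ?? 0, parts[1] ?? 0, parts[2] ?? 0];
}

// Caret range for major >= 1: same major, and version >= base.
function satisfiesCaret(version: string, base: string): boolean {
  const [vMaj, vMin, vPat] = parseVersion(version);
  const [bMaj, bMin, bPat] = parseVersion(base);
  if (bMaj === 0) throw new Error("0.x caret ranges are not handled in this sketch");
  if (vMaj !== bMaj) return false;
  if (vMin !== bMin) return vMin > bMin;
  return vPat >= bPat;
}

// A range like "^3.0.0 || ^4.0.0" is satisfied if any alternative matches.
function satisfiesPeerRange(version: string, range: string): boolean {
  return range
    .split("||")
    .map((part) => part.trim())
    .some((part) => part.startsWith("^") && satisfiesCaret(version, part.slice(1)));
}

console.log(satisfiesPeerRange("3.25.76", peerDependencies.zod)); // true
console.log(satisfiesPeerRange("4.2.1", peerDependencies.zod)); // true
console.log(satisfiesPeerRange("5.0.0", peerDependencies.zod)); // false
```

The point of the range is that a consuming project's own zod install, whether 3.x or 4.x, satisfies the requirement, so the package manager does not need to install a second copy.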

Caitlin Pinn (cpinn) · 2025-12-29T21:27:56Z

-    "openai": "^6.3.0",
-    "zod": "^3.25.76",
-    "zod-to-json-schema": "^3.24.6"
+    "openai": "^6.7.0",


Version 6.7.0 is necessary to properly support zod 4 with fallbacks to zod 3.

Caitlin Pinn (cpinn) · 2026-01-13T02:28:41Z

Sadly, this change is still failing some internal integration tests and I haven't been able to figure out why yet.

There is still a lot more to be done on the overall zod upgrade.

github-actions · 2026-06-08T19:43:56Z

Braintrust eval report

Autoevals (HEAD-1780948726)

Score	Average	Improvements	Regressions
NumericDiff	79.7% (+2pp)	8 🟢	5 🔴
Time_to_first_token	10.94tok (+0.97tok)	44 🟢	175 🔴
Llm_calls	1.55 (+0)	-	-
Tool_calls	0 (+0)	-	-
Errors	0 (+0)	-	-
Llm_errors	0 (+0)	-	-
Tool_errors	0 (+0)	-	-
Prompt_tokens	528.42tok (+0tok)	-	-
Prompt_cached_tokens	0tok (+0tok)	-	-
Prompt_cache_creation_tokens	0tok (+0tok)	-	-
Prompt_cache_creation_5m_tokens	0tok (+0tok)	-	-
Prompt_cache_creation_1h_tokens	0tok (+0tok)	-	-
Completion_tokens	467.6tok (-5.02tok)	104 🟢	102 🔴
Completion_reasoning_tokens	356.65tok (-3.49tok)	90 🟢	86 🔴
Completion_accepted_prediction_tokens	0tok (+0tok)	-	-
Completion_rejected_prediction_tokens	0tok (+0tok)	-	-
Completion_audio_tokens	0tok (+0tok)	-	-
Total_tokens	996.02tok (-5.02tok)	104 🟢	102 🔴
Estimated_cost	0$ (0$)	90 🟢	93 🔴
Duration	10.94s (+0.97s)	44 🟢	175 🔴
Llm_duration	11.67s (+0.98s)	44 🟢	175 🔴

github-actions · 2026-06-08T21:09:00Z

Braintrust eval report

Autoevals (main-1780952944)

Score	Average	Improvements	Regressions
NumericDiff	79.7% (0pp)	7 🟢	3 🔴
Time_to_first_token	10.55tok (-0.39tok)	111 🟢	108 🔴
Llm_calls	1.55 (+0)	-	-
Tool_calls	0 (+0)	-	-
Errors	0 (+0)	-	-
Llm_errors	0 (+0)	-	-
Tool_errors	0 (+0)	-	-
Prompt_tokens	528.42tok (+0tok)	-	-
Prompt_cached_tokens	0tok (+0tok)	-	-
Prompt_cache_creation_tokens	0tok (+0tok)	-	-
Prompt_cache_creation_5m_tokens	0tok (+0tok)	-	-
Prompt_cache_creation_1h_tokens	0tok (+0tok)	-	-
Completion_tokens	480.89tok (+13.29tok)	90 🟢	107 🔴
Completion_reasoning_tokens	368tok (+11.35tok)	77 🟢	88 🔴
Completion_accepted_prediction_tokens	0tok (+0tok)	-	-
Completion_rejected_prediction_tokens	0tok (+0tok)	-	-
Completion_audio_tokens	0tok (+0tok)	-	-
Total_tokens	1009.31tok (+13.29tok)	90 🟢	107 🔴
Estimated_cost	0$ (+0$)	78 🟢	97 🔴
Duration	10.55s (-0.39s)	111 🟢	108 🔴
Llm_duration	11.25s (-0.42s)	116 🟢	103 🔴

update to zod 4

1184016

Caitlin Pinn (cpinn) force-pushed the caitlin/update-zod4 branch from 60fce6c to 1184016 Compare December 23, 2025 23:40

Caitlin Pinn (cpinn) and others added 6 commits December 23, 2025 16:24

no transforms in ragas, default to zod 4 json parsing

508d006

update autoevals

619eadb

add copy of zod json schema

eb63be0

drop python 3.8 support

9b4b407

Caitlin Pinn (cpinn) marked this pull request as ready for review December 29, 2025 16:37

Caitlin Pinn (cpinn) marked this pull request as draft December 29, 2025 16:38

Caitlin Pinn (cpinn) commented Dec 29, 2025

Caitlin Pinn (cpinn) added 2 commits December 29, 2025 10:28

switch back to using native toJSONSchema in zod 4.2

eb2ae2d

use openai 6.7 which fixes zod support for native toJSONSchema

947418a

Caitlin Pinn (cpinn) force-pushed the caitlin/update-zod4 branch from 4d528b9 to 947418a Compare December 29, 2025 18:59

Caitlin Pinn (cpinn) changed the title ~~update to zod 4~~ update to zod 4, stay on zod /v3 for now Dec 29, 2025

Caitlin Pinn (cpinn) added 3 commits December 29, 2025 11:55

Use Zod v3 compatibility mode for production API compatibility

6fb2c09

Upgrade to zod 4.2.1 in preparation of zod 4+ migration. Export from zod/v3 until everything is ready in the braintrust backend

use zod 3 syntax on the template

2be0919

Caitlin Pinn (cpinn) force-pushed the caitlin/update-zod4 branch from e1137b0 to 2be0919 Compare December 29, 2025 20:17

Caitlin Pinn (cpinn) changed the title ~~update to zod 4, stay on zod /v3 for now~~ update to zod 4, stay on zod /v3 for compatibility Dec 29, 2025

Caitlin Pinn (cpinn) added 2 commits December 29, 2025 12:22

make zod a peer dependency in autoevals

93f5e28

bump package version

7d36c10

Caitlin Pinn (cpinn) force-pushed the caitlin/update-zod4 branch from cef27af to 7d36c10 Compare December 29, 2025 21:15

Caitlin Pinn (cpinn) marked this pull request as ready for review December 29, 2025 21:16

Caitlin Pinn (cpinn) commented Dec 29, 2025

Caitlin Pinn (cpinn) added 4 commits December 29, 2025 15:31

Merge branch 'main' into caitlin/update-zod4

b4ff806

Revert "bump package version"

bd3c048

This reverts commit 7d36c10.

Revert "make zod a peer dependency in autoevals"

7813882

This reverts commit 93f5e28.

Revert "use zod 3 syntax on the template"

5ec6b50

This reverts commit 2be0919.

Caitlin Pinn (cpinn) added 6 commits December 30, 2025 14:01

Reapply "Add Zod as peer dependency"

a7ce47f

This reverts commit 84e2c48.

Reapply "use zod 3 syntax on the template"

3c73f4c

This reverts commit 5ec6b50.

Reapply "make zod a peer dependency in autoevals"

afe5fa7

This reverts commit 7813882.

Reapply "bump package version"

3b66a4e

This reverts commit bd3c048.

keep zod on v3 for dev

cf5e3a7

Merge branch 'main' into caitlin/update-zod4

666ea6a

Stephen Belanger (Qard) approved these changes Jan 13, 2026

Caitlin Pinn (cpinn) marked this pull request as draft January 13, 2026 02:27

Caitlin Pinn (cpinn) added 2 commits January 12, 2026 21:01

update package json zod version

1d2695c

revert the dataset file formatting changes

454a272

Caitlin Pinn (cpinn) changed the title ~~Upgrade autoevals to zod 4~~ Make zod a peer dependency in the autoevals sdk Jan 13, 2026

Caitlin Pinn (cpinn) added 2 commits January 14, 2026 11:33

keep this on v3 version

e9e7083

Merge branch 'main' into caitlin/update-zod4

b12040b

Ronald Koh (ronaldkohhh) mentioned this pull request Jun 4, 2026

autoevals: support Zod v4 via peer dependency #194

Draft

5 tasks

Merge branch 'main' into caitlin/update-zod4

8e93e04

Caitlin Pinn (cpinn) marked this pull request as ready for review June 4, 2026 19:45

make version the same

0488fe7

Caitlin Pinn (cpinn) changed the title ~~Make zod a peer dependency in the autoevals sdk~~ Allow autoeval to support both zod 3 and zod 4 Jun 4, 2026

Caitlin Pinn (cpinn) added 2 commits June 4, 2026 13:43

verify install

12d652e

properly override

0882888

Caitlin Pinn (cpinn) force-pushed the caitlin/update-zod4 branch from 2c3e5bd to 0882888 Compare June 4, 2026 20:52

Caitlin Pinn (cpinn) changed the title ~~Allow autoeval to support both zod 3 and zod 4~~ Allow autoevals to support both zod 3 and zod 4 Jun 4, 2026

Caitlin Pinn (cpinn) requested a review from Abhijeet Prasad (AbhiPrasad) June 5, 2026 20:30

Merge branch 'main' into caitlin/update-zod4

561da54

revert version bump

60f33b1

Abhijeet Prasad (AbhiPrasad) approved these changes Jun 8, 2026

Caitlin Pinn (cpinn) merged commit 9eba0fe into main Jun 8, 2026
15 checks passed

Caitlin Pinn (cpinn) merged 37 commits into main from caitlin/update-zod4

3 participants
