anti-sycophancy

Chinese version (中文版): README.zh-CN.md

A three-layer sycophancy defense system for AI coding assistants, based on arXiv 2602.23971, "Ask Don't Tell".

What It Does

Prevents your AI assistant from defaulting to agreeableness: the RLHF-trained tendency to validate your assumptions rather than surface real problems.

Example of what gets intercepted:

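User:  "这样做没问题吧？"  →  Hook transforms to:  "这样做有什么问题？"
       ("Doing it this way is fine, right?"  →  "What is wrong with doing it this way?")
User:  "帮我写个函数，应该没问题吧？"  →  Hook transforms to:  "帮我写个函数，请同时指出潜在问题。"
       ("Write me a function; it should be fine, right?"  →  "Write me a function, and also point out potential problems.")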

Without this skill, the model would typically respond with "没问题，看起来 OK" ("No problem, looks OK"), which is precisely the sycophancy problem.

Quick Start

# One-command install (Claude Code and OpenClaw)
npx clawhub@latest install 0xcjl/anti-sycophancy

# Or via Claude Code
/anti-sycophancy install

See Installation Guide for platform-specific details.

Three-Layer Architecture

| Layer | Component | Scope | Platform |
| --- | --- | --- | --- |
| Layer 1 | `UserPromptSubmit` hook | Auto-transforms confirmatory prompts before submission | Claude Code only |
| Layer 2 | `SKILL.md` | Activates critical response mode when triggered | Cross-platform |
| Layer 3 | `CLAUDE.md` / `SOUL.md` | Persistent anti-sycophancy rules in agent memory | Cross-platform |
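
To make Layer 1 more concrete: based on the Claude Code hooks documentation, a `UserPromptSubmit` hook receives a JSON payload on stdin that carries the user's text in a `prompt` field, and text the hook prints on a successful exit is added to the model's context. The sketch below is a simplification under those assumptions, not this project's hook. It only emits a reminder for confirmatory prompts instead of rewriting them, and `handle_event`, `CRITICAL_NOTE`, and `CONFIRMATORY_ENDINGS` are hypothetical names chosen for illustration.

```python
import json

# Hypothetical reminder text; the project's real wording may differ.
CRITICAL_NOTE = "Point out problems, risks, and counterarguments before agreeing."

# Assumed list of endings that mark a prompt as seeking confirmation.
CONFIRMATORY_ENDINGS = ("对吧？", "没问题吧？")


def handle_event(raw: str) -> str:
    """Return extra context for a UserPromptSubmit payload, or "" if none applies."""
    try:
        event = json.loads(raw)
    except json.JSONDecodeError:
        # Malformed or empty input: stay silent rather than break the session.
        return ""
    if not isinstance(event, dict):
        return ""
    prompt = str(event.get("prompt", "")).strip()
    if prompt.endswith(CONFIRMATORY_ENDINGS):
        return CRITICAL_NOTE
    return ""


# Demonstration with a sample payload instead of real stdin.
sample = json.dumps({"hook_event_name": "UserPromptSubmit", "prompt": "这样做没问题吧？"})
print(handle_event(sample))
```

In an actual hook script, the payload would come from `sys.stdin.read()` rather than a hard-coded sample, and the script would print the returned note only when it is non-empty.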

Usage

After installation, use the following commands:

| Command | Description |
| --- | --- |
| `/anti-sycophancy install` | Deploy all layers (cross-platform) |
| `/anti-sycophancy install-claude-code` | Deploy Layer 1 + Layer 3 (Claude Code only) |
| `/anti-sycophancy install-openclaw` | Deploy Layer 3 (OpenClaw only) |
| `/anti-sycophancy uninstall` | Complete removal (cross-platform) |
| `/anti-sycophancy status` | View installation status of all layers |
| `/anti-sycophancy verify` | Test Hook transformation (Claude Code only) |
| `/anti-sycophancy help` | Show help |

Key Transformations

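| Original Prompt | Hook Output |
| --- | --- |
| `"这样做对吧？"` ("This is the right way to do it, right?") | `"这样做有什么问题？"` ("What is wrong with doing it this way?") |
| `"帮我写个函数，应该没问题吧？"` ("Write me a function; it should be fine, right?") | `"帮我写个函数，请同时指出潜在问题。"` ("Write me a function, and also point out potential problems.") |
| `"这个架构是对的，对吧？"` ("This architecture is correct, right?") | `"这个架构真的正确吗？反对意见是什么？"` ("Is this architecture really correct? What are the counterarguments?") |
| `"我觉得 X 是对的"` ("I think X is right") | `"X 真的成立吗？有没有反例或例外情况？"` ("Does X really hold? Are there counterexamples or exceptions?") |
| `"帮我修复bug"` ("Help me fix the bug") | (unchanged: imperative request) |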
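
The mappings above can be approximated with a small ordered rule table. The following is a minimal sketch of that idea, not the project's actual hook code: the regular expressions, the `RULES` list, and the `rewrite_prompt` name are all assumptions chosen so that the listed examples come out as shown.

```python
import re

# Hypothetical rules, ordered from most to least specific.
# Each entry is (compiled pattern, replacement template).
RULES = [
    # "我觉得 X 是对的" -> ask whether X really holds.
    (re.compile(r"^我觉得\s*(.+?)\s*是对的[。.]?$"),
     r"\1 真的成立吗？有没有反例或例外情况？"),
    # "<subject>是对的，对吧？" -> ask for counterarguments.
    (re.compile(r"^(.+?)是对的，对吧[？?]$"),
     r"\1真的正确吗？反对意见是什么？"),
    # "<request>，应该没问题吧？" -> keep the request, ask for risks too.
    (re.compile(r"^(.+?)，应该没问题吧[？?]$"),
     r"\1，请同时指出潜在问题。"),
    # "<subject>对吧？" or "<subject>没问题吧？" -> ask what is wrong.
    (re.compile(r"^(.+?)(?:对吧|没问题吧)[？?]$"),
     r"\1有什么问题？"),
]


def rewrite_prompt(prompt: str) -> str:
    """Apply the first matching rule; return the prompt unchanged otherwise."""
    text = prompt.strip()
    for pattern, template in RULES:
        if pattern.match(text):
            return pattern.sub(template, text)
    return prompt


for example in ["这样做对吧？", "我觉得 X 是对的", "帮我修复bug"]:
    print(example, "->", rewrite_prompt(example))
```

Ordering matters in this sketch: the generic "对吧" rule comes last so that the more specific architecture and function-request patterns are tried first, and imperative prompts fall through every rule and pass unchanged.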

Design

See DESIGN.md for the full rationale behind the three-layer approach and the "Ask Don't Tell" principle.

Credits

Research: arXiv 2602.23971, "Ask Don't Tell: Reducing Sycophancy in Large Language Models" (Dubois, Ududec, Summerfield, Luettgau, 2026)
Playbook: openclaw-playbook
Author: 0xcjl
Optimized via: cjl-autoresearch-cc — 40-round iterative optimization

License

MIT
