Open AI-for-security validation benchmark: non-LLM scorer + a SOTA-validation loop. Labeled positive corpus withheld pending coordinated disclosure.
Updated Jun 4, 2026 (Python)
ReplayBench-IoT: reproducible IoT replay-defense benchmark with Monte Carlo sweeps, CI, static demo, and hardware-validation artifacts.
GitHub action for Maester
Product-security LLM benchmark harness for realistic AppSec, supply-chain, and LLM application security evaluations.
The core repository for the Maester module with helper cmdlets that will be called from the Pester tests.