Cybench

Cybench evaluates AI on real Capture-the-Flag (CTF) cybersecurity challenges, testing vulnerability analysis, exploitation, and security reasoning.

Models Tested: 17
Top Score: 55.0
Average Score: 19.6