Benchmark · Code · Competitive

CadEval

CadEval evaluates a model's ability to generate and reason about Computer-Aided Design (CAD) code, testing spatial reasoning and engineering knowledge.

Updated: 2025-06-17
Models tested: 15
Top score: 74.0 (o3)
Median: 42.0 (min 12.0)
Top-5 spread: σ 7.0 (wide open)

Best score over time · one chart, every benchmark

[Chart: CadEval · 12 models · frontier running max. Y-axis: score (0 to 100, higher is better). X-axis: release date (Aug 24, Oct 24, Jan 25, Mar 25, Jun 25). Source: benchgecko.ai/benchmark/cadeval]
Frontier on CadEval rose from 26.0 to 74.0 in 8 months · +48.0 points · latest leader o3 from OpenAI.
Pink dots = frontier records · 4 total

Same category · related evaluations