Terminal Bench

Terminal Bench tests the ability to accomplish real-world tasks using terminal commands, evaluating shell scripting and CLI tool proficiency.

Models Tested: 27
Top Score: 78.4
Average Score: 40.3