AudioMultiChallenge · Text Output
Full rankings
3 models tested · sorted by score
| # | Model | Score |
|---|---|---|
| 1 | Gemini 2.5 Pro | 46.9 |
| 2 | Gemini 2.5 Flash | 40.0 |
| 3 | Voxtral Small 24B 2507 | 26.3 |
Frequently asked
Pulled from the AudioMultiChallenge · Text Output dataset · updated daily
What does AudioMultiChallenge · Text Output measure?
AudioMultiChallenge · Text Output is a knowledge benchmark in the BenchGecko catalog. Three AI models have been tested on it, with scores ranging from 26.3 to 46.9 out of 100.
Which model leads on AudioMultiChallenge · Text Output?
Gemini 2.5 Pro from Google DeepMind leads AudioMultiChallenge · Text Output with a score of 46.9. The median score across 3 tested models is 40.0.
Is AudioMultiChallenge · Text Output saturated?
No. The top score is 46.9 out of 100 (47%), so there is still meaningful room for improvement on AudioMultiChallenge · Text Output.
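The figures quoted in the answers above (leader, score range, median, and top score as a share of the maximum) can be reproduced from the three listed scores. The following is a minimal sketch rather than BenchGecko's own code: the `summarize` helper and its `max_score` parameter are hypothetical names introduced here for illustration, and the only inputs are the scores shown on this page.

```python
from statistics import median

# Scores as listed on this page (model name -> score out of 100).
scores = {
    "Gemini 2.5 Pro": 46.9,
    "Gemini 2.5 Flash": 40.0,
    "Voxtral Small 24B 2507": 26.3,
}


def summarize(scores: dict[str, float], max_score: float = 100.0) -> dict:
    """Hypothetical helper: derive the summary figures quoted in the FAQ."""
    leader = max(scores, key=scores.get)  # model with the highest score
    values = list(scores.values())
    return {
        "leader": leader,
        "top": scores[leader],
        "low": min(values),
        "median": median(values),
        # Top score as a whole-number percentage of the maximum.
        "top_pct": round(100 * scores[leader] / max_score),
    }


summary = summarize(scores)
print(summary)
```

With only three models, the median is simply the middle score, so it will shift noticeably each time a new model is added.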
What makes AudioMultiChallenge · Text Output distinctive?
AudioMultiChallenge · Text Output is a knowledge benchmark that overlaps little with the rest of the catalog: it measures capabilities that the other benchmarks we track do not cover well.
How often is AudioMultiChallenge · Text Output data refreshed?
BenchGecko pulls updates daily. New model scores on AudioMultiChallenge · Text Output appear as soon as they are published by Epoch AI or the model provider.
Top on AudioMultiChallenge · Text Output
Gemini 2.5 Pro · 46.9
Gemini 2.5 Flash · 40.0
Voxtral Small 24B 2507 · 26.3