
GPT-5 Mini vs MiniMax M2.5

Side by side · benchmarks, pricing, and signals you can act on.

Winner summary

MiniMax M2.5 wins 6 of 11 shared benchmarks. Leads in reasoning · coding · math.

Category leads
reasoning · MiniMax M2.5
coding · MiniMax M2.5
language · GPT-5 Mini
math · MiniMax M2.5
knowledge · GPT-5 Mini
Hype vs Reality
GPT-5 Mini
#65 by perf·no signal
QUIET
MiniMax M2.5
#71 by perf·no signal
QUIET
Best value
MiniMax M2.5 · 1.7x better value than GPT-5 Mini
GPT-5 Mini
49.8 pts/$
$1.13/M
MiniMax M2.5
84.8 pts/$
$0.65/M
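The value figures above can be reproduced with a short sketch. It assumes, since the page does not say so, that the blended "$/M" price is the unweighted mean of the input and output prices from the pricing table; `blended_price` and `value_ratio` are hypothetical helper names used only for illustration.

```python
# Prices in USD per 1M tokens, copied from the page: (input, output).
PRICES = {
    "GPT-5 Mini": (0.25, 2.00),
    "MiniMax M2.5": (0.15, 1.15),
}

# Value scores in pts/$ exactly as displayed on the page.
VALUE = {"GPT-5 Mini": 49.8, "MiniMax M2.5": 84.8}


def blended_price(model: str) -> float:
    """ASSUMPTION: blended price = unweighted mean of input and output price."""
    input_price, output_price = PRICES[model]
    return (input_price + output_price) / 2


def value_ratio(better: str, worse: str) -> float:
    """Ratio of the displayed pts/$ figures for two models."""
    return VALUE[better] / VALUE[worse]


for model in PRICES:
    print(f"{model}: ${blended_price(model):.3f}/M blended")
print(f"value ratio: {value_ratio('MiniMax M2.5', 'GPT-5 Mini'):.2f}x")
```

Under that assumption the blended prices come out at 1.125 and 0.65, which is consistent with the displayed $1.13/M and $0.65/M, and the ratio of the two pts/$ figures is roughly 1.7. How the underlying point score is derived is not stated on the page, so the sketch does not attempt to reconstruct it.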
Vendor risk
One or more vendors flagged
OpenAI
$840.0B·Tier 1
Medium risk
MiniMax
$4.0B·Tier 1
Higher risk
Head to head
GPT-5 Mini · MiniMax M2.5
ARC-AGI
MiniMax M2.5 leads by +9.3
ARC-AGI · the original Abstraction and Reasoning Corpus, testing whether AI can solve novel visual pattern recognition tasks without memorization.
GPT-5 Mini
54.3
MiniMax M2.5
63.7
ARC-AGI-2
MiniMax M2.5 leads by +0.4
ARC-AGI-2 · the second iteration of the Abstraction and Reasoning Corpus, testing novel pattern recognition and abstract reasoning without prior training data.
GPT-5 Mini
4.4
MiniMax M2.5
4.9
LiveBench · Agentic Coding
MiniMax M2.5 leads by +16.7
GPT-5 Mini
35.0
MiniMax M2.5
51.7
LiveBench · Coding
GPT-5 Mini leads by +5.4
GPT-5 Mini
76.1
MiniMax M2.5
70.7
LiveBench · Data Analysis
Tied
GPT-5 Mini
49.6
MiniMax M2.5
49.6
LiveBench · IF (Instruction Following)
GPT-5 Mini leads by +7.0
GPT-5 Mini
64.2
MiniMax M2.5
57.2
LiveBench · Language
GPT-5 Mini leads by +14.1
GPT-5 Mini
69.2
MiniMax M2.5
55.1
LiveBench · Mathematics
MiniMax M2.5 leads by +3.0
GPT-5 Mini
74.4
MiniMax M2.5
77.4
LiveBench · Overall
GPT-5 Mini leads by +0.9
GPT-5 Mini
61.0
MiniMax M2.5
60.1
LiveBench · Reasoning
MiniMax M2.5 leads by +0.6
GPT-5 Mini
58.6
MiniMax M2.5
59.3
Terminal Bench
MiniMax M2.5 leads by +7.4
Terminal-Bench 2.0 · evaluates AI agents on real terminal-based coding tasks · writing scripts, debugging, running tests, and managing projects entirely through command-line interaction. Tests both code quality and terminal fluency. Claude Opus 4.7 scores 69.4%, demonstrating significant agentic terminal competence.
GPT-5 Mini
34.8
MiniMax M2.5
42.2
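The win count in the summary and the per-benchmark leads can be recomputed from the displayed scores. The sketch below is illustrative only: `tally` and `lead` are hypothetical helper names, and the inputs are the one-decimal scores shown on the page, not the unrounded values the page may use internally.

```python
# Scores as displayed on the page: (GPT-5 Mini, MiniMax M2.5).
SCORES = {
    "ARC-AGI": (54.3, 63.7),
    "ARC-AGI-2": (4.4, 4.9),
    "LiveBench · Agentic Coding": (35.0, 51.7),
    "LiveBench · Coding": (76.1, 70.7),
    "LiveBench · Data Analysis": (49.6, 49.6),
    "LiveBench · IF": (64.2, 57.2),  # IF: presumably instruction following
    "LiveBench · Language": (69.2, 55.1),
    "LiveBench · Mathematics": (74.4, 77.4),
    "LiveBench · Overall": (61.0, 60.1),
    "LiveBench · Reasoning": (58.6, 59.3),
    "Terminal Bench": (34.8, 42.2),
}


def tally(scores: dict) -> dict:
    """Count wins per model, plus ties, by comparing displayed scores."""
    wins = {"GPT-5 Mini": 0, "MiniMax M2.5": 0, "tie": 0}
    for gpt, minimax in scores.values():
        if gpt > minimax:
            wins["GPT-5 Mini"] += 1
        elif minimax > gpt:
            wins["MiniMax M2.5"] += 1
        else:
            wins["tie"] += 1
    return wins


def lead(benchmark: str) -> float:
    """Gap in points; positive means MiniMax M2.5 leads, negative GPT-5 Mini."""
    gpt, minimax = SCORES[benchmark]
    return round(minimax - gpt, 1)


print(tally(SCORES))
for name in SCORES:
    print(f"{name}: {lead(name):+.1f}")
```

The tally gives MiniMax M2.5 six wins, GPT-5 Mini four, and one tie, matching "6 of 11". A few recomputed gaps differ from the displayed leads by 0.1 (ARC-AGI comes out at 9.4 against the displayed +9.3, for example); a plausible explanation is that the page subtracts unrounded scores, though the source does not confirm this.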
Full benchmark table
| Benchmark | GPT-5 Mini | MiniMax M2.5 |
| --- | --- | --- |
| ARC-AGI | 54.3 | 63.7 |
| ARC-AGI-2 | 4.4 | 4.9 |
| LiveBench · Agentic Coding | 35.0 | 51.7 |
| LiveBench · Coding | 76.1 | 70.7 |
| LiveBench · Data Analysis | 49.6 | 49.6 |
| LiveBench · IF (Instruction Following) | 64.2 | 57.2 |
| LiveBench · Language | 69.2 | 55.1 |
| LiveBench · Mathematics | 74.4 | 77.4 |
| LiveBench · Overall | 61.0 | 60.1 |
| LiveBench · Reasoning | 58.6 | 59.3 |
| Terminal Bench | 34.8 | 42.2 |
Pricing · per 1M tokens · projected $/mo at 10M tokens
| Model | Input | Output | Context | Projected $/mo |
| --- | --- | --- | --- | --- |
| GPT-5 Mini | $0.25 | $2.00 | 400K tokens (~200 books) | $6.88 |
| MiniMax M2.5 | $0.15 | $1.15 | 197K tokens (~98 books) | $4.00 |
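The projected monthly figures can be approximated from the per-token prices. The page does not state how the 10M tokens are divided between input and output; the sketch below assumes a 75% input / 25% output split, which was inferred because it happens to reproduce both displayed totals. `projected_monthly_cost` and `INPUT_SHARE` are hypothetical names for illustration.

```python
# Prices in USD per 1M tokens, copied from the pricing table: (input, output).
PRICES = {
    "GPT-5 Mini": (0.25, 2.00),
    "MiniMax M2.5": (0.15, 1.15),
}

TOKENS_PER_MONTH_M = 10.0  # from the page: 10M tokens per month
INPUT_SHARE = 0.75         # ASSUMPTION: inferred split, not stated on the page


def projected_monthly_cost(
    model: str,
    tokens_m: float = TOKENS_PER_MONTH_M,
    input_share: float = INPUT_SHARE,
) -> float:
    """Monthly cost in USD for `tokens_m` million tokens at the given split."""
    input_price, output_price = PRICES[model]
    per_million = input_share * input_price + (1 - input_share) * output_price
    return tokens_m * per_million


for model in PRICES:
    print(f"{model}: ${projected_monthly_cost(model):.3f}/mo")
```

With this split the costs come out at about 6.875 and 4.00, in line with the displayed $6.88 and $4.00. A workload with a different input/output mix can be estimated by passing another `input_share`; output-heavy workloads widen the absolute gap, since the output prices differ more than the input prices.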