
MiniMax M2.5 vs GPT-5 Mini

Side by side · benchmarks, pricing, and signals you can act on.

Winner summary

MiniMax M2.5 wins 7 of 11 shared benchmarks. Leads in reasoning · coding · math.

Category leads
reasoning · MiniMax M2.5
coding · MiniMax M2.5
language · GPT-5 Mini
math · MiniMax M2.5
knowledge · GPT-5 Mini
Hype vs Reality
MiniMax M2.5
#71 by perf · no signal
QUIET
GPT-5 Mini
#65 by perf · no signal
QUIET
Best value
MiniMax M2.5 offers 1.7x better value than GPT-5 Mini
MiniMax M2.5
84.8 pts/$
$0.65/M
GPT-5 Mini
49.8 pts/$
$1.13/M
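The value figures above can be checked with a few lines of arithmetic. The sketch below is a minimal illustration, not the site's actual method: it assumes the "1.7x" figure is the ratio of the two pts/$ values, and that the $/M figure is the simple average of the input and output prices listed in the pricing section of this page. Both assumptions reproduce the displayed numbers, but the page does not state its formulas. The function names are hypothetical.

```python
def value_ratio(pts_per_dollar_a: float, pts_per_dollar_b: float) -> float:
    """Ratio of two value scores (points per dollar)."""
    return pts_per_dollar_a / pts_per_dollar_b


def blended_price(input_price: float, output_price: float) -> float:
    """ASSUMPTION: blended $/M is the plain average of input and output price."""
    return (input_price + output_price) / 2


# Figures copied from the page.
ratio = value_ratio(84.8, 49.8)        # MiniMax M2.5 vs GPT-5 Mini
minimax_blend = blended_price(0.15, 1.15)
gpt_blend = blended_price(0.25, 2.00)

print(f"value ratio: {ratio:.1f}x")
print(f"MiniMax M2.5 blended: ${minimax_blend:.3f}/M")
print(f"GPT-5 Mini blended:   ${gpt_blend:.3f}/M")
```

If the site weights input and output differently when blending, the averages would shift, so treat the match as suggestive rather than confirmed.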
Vendor risk
One or more vendors flagged
MiniMax
$4.0B · Tier 1
Higher risk
OpenAI
$840.0B · Tier 1
Medium risk
Head to head
MiniMax M2.5 · GPT-5 Mini
ARC-AGI
MiniMax M2.5 leads by +9.3
ARC-AGI · the original Abstraction and Reasoning Corpus, testing whether AI can solve novel visual pattern recognition tasks without memorization.
MiniMax M2.5
63.7
GPT-5 Mini
54.3
ARC-AGI-2
MiniMax M2.5 leads by +0.4
ARC-AGI-2 · the second iteration of the Abstraction and Reasoning Corpus, testing novel pattern recognition and abstract reasoning without prior training data.
MiniMax M2.5
4.9
GPT-5 Mini
4.4
LiveBench · Agentic Coding
MiniMax M2.5 leads by +16.7
MiniMax M2.5
51.7
GPT-5 Mini
35.0
LiveBench · Coding
GPT-5 Mini leads by +5.4
MiniMax M2.5
70.7
GPT-5 Mini
76.1
LiveBench · Data Analysis
MiniMax M2.5
49.6
GPT-5 Mini
49.6
LiveBench · Instruction Following (IF)
GPT-5 Mini leads by +7.0
MiniMax M2.5
57.2
GPT-5 Mini
64.2
LiveBench · Language
GPT-5 Mini leads by +14.1
MiniMax M2.5
55.1
GPT-5 Mini
69.2
LiveBench · Mathematics
MiniMax M2.5 leads by +3.0
MiniMax M2.5
77.4
GPT-5 Mini
74.4
LiveBench · Overall
GPT-5 Mini leads by +0.9
MiniMax M2.5
60.1
GPT-5 Mini
61.0
LiveBench · Reasoning
MiniMax M2.5 leads by +0.6
MiniMax M2.5
59.3
GPT-5 Mini
58.6
Terminal Bench
MiniMax M2.5 leads by +7.4
Terminal-Bench 2.0 · evaluates AI agents on real terminal-based coding tasks · writing scripts, debugging, running tests, and managing projects entirely through command-line interaction. Tests both code quality and terminal fluency. Claude Opus 4.7 scores 69.4%, demonstrating significant agentic terminal competence.
MiniMax M2.5
42.2
GPT-5 Mini
34.8
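The per-benchmark leads and the overall tally can be recomputed from the scores listed above. This is a minimal sketch using only the rounded figures shown on the page; the dictionary and function names are hypothetical. Because the displayed scores are rounded to one decimal, recomputed leads can differ from the page's "leads by" values by 0.1, and a tie in the rounded figures (Data Analysis) may not be a tie in the underlying data.

```python
# (MiniMax M2.5, GPT-5 Mini) scores as displayed on the page.
SCORES = {
    "ARC-AGI": (63.7, 54.3),
    "ARC-AGI-2": (4.9, 4.4),
    "LiveBench Agentic Coding": (51.7, 35.0),
    "LiveBench Coding": (70.7, 76.1),
    "LiveBench Data Analysis": (49.6, 49.6),
    "LiveBench IF": (57.2, 64.2),
    "LiveBench Language": (55.1, 69.2),
    "LiveBench Mathematics": (77.4, 74.4),
    "LiveBench Overall": (60.1, 61.0),
    "LiveBench Reasoning": (59.3, 58.6),
    "Terminal Bench": (42.2, 34.8),
}


def lead(minimax: float, gpt: float) -> float:
    """Signed lead for MiniMax M2.5 (negative means GPT-5 Mini leads)."""
    return round(minimax - gpt, 1)


def tally(scores: dict) -> tuple:
    """Count (MiniMax wins, GPT-5 Mini wins, ties) on the rounded scores."""
    minimax_wins = sum(1 for a, b in scores.values() if a > b)
    gpt_wins = sum(1 for a, b in scores.values() if a < b)
    ties = sum(1 for a, b in scores.values() if a == b)
    return minimax_wins, gpt_wins, ties


for name, (a, b) in SCORES.items():
    print(f"{name}: {lead(a, b):+.1f}")
print("MiniMax wins, GPT-5 Mini wins, ties:", tally(SCORES))
```

On the rounded figures this gives 6 wins for MiniMax M2.5, 4 for GPT-5 Mini, and 1 tie, so the summary's "7 of 11" presumably relies on unrounded scores or a different tie rule.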
Full benchmark table
| Benchmark | MiniMax M2.5 | GPT-5 Mini |
| --- | --- | --- |
| ARC-AGI | 63.7 | 54.3 |
| ARC-AGI-2 | 4.9 | 4.4 |
| LiveBench · Agentic Coding | 51.7 | 35.0 |
| LiveBench · Coding | 70.7 | 76.1 |
| LiveBench · Data Analysis | 49.6 | 49.6 |
| LiveBench · Instruction Following (IF) | 57.2 | 64.2 |
| LiveBench · Language | 55.1 | 69.2 |
| LiveBench · Mathematics | 77.4 | 74.4 |
| LiveBench · Overall | 60.1 | 61.0 |
| LiveBench · Reasoning | 59.3 | 58.6 |
| Terminal Bench | 42.2 | 34.8 |
Pricing · per 1M tokens · projected $/mo at 10M tokens
| Model | Input | Output | Context | Projected $/mo |
| --- | --- | --- | --- | --- |
| MiniMax M2.5 | $0.15 | $1.15 | 197K tokens (~98 books) | $4.00 |
| GPT-5 Mini | $0.25 | $2.00 | 400K tokens (~200 books) | $6.88 |
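The projected monthly figures can be reproduced from the per-token prices. The page does not say how the 10M tokens are split between input and output; the sketch below assumes a 75% input / 25% output split, chosen only because it matches both displayed projections. The function name and the `input_share` parameter are hypothetical.

```python
def projected_monthly_cost(
    input_price: float,
    output_price: float,
    total_tokens_m: float = 10.0,
    input_share: float = 0.75,  # ASSUMPTION: 3:1 input-to-output split
) -> float:
    """Monthly cost in dollars, with prices given per 1M tokens."""
    input_tokens_m = total_tokens_m * input_share
    output_tokens_m = total_tokens_m * (1.0 - input_share)
    return input_tokens_m * input_price + output_tokens_m * output_price


minimax = projected_monthly_cost(0.15, 1.15)
gpt = projected_monthly_cost(0.25, 2.00)

print(f"MiniMax M2.5: ${minimax:.3f}/mo")  # page shows $4.00
print(f"GPT-5 Mini:   ${gpt:.3f}/mo")      # page shows $6.88
```

Changing `input_share` shows how sensitive the comparison is to workload shape: output-heavy workloads widen the absolute gap, since the output prices differ by more than the input prices.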