Grok 3 Mini

by xAI · Released Jun 2025

Average score: 46.0
Rank: #120 (better than 48% of all models)

Context: 131K tokens (~66 books)
Input $/1M: $0.30
Output $/1M: $0.50
Type: text
License: Proprietary
Benchmarks: 11 tested

Data updated today

About

A lightweight model that thinks before responding. Fast, smart, and great for logic-based tasks that do not require deep domain knowledge. The raw thinking traces are accessible.

Tested on 11 benchmarks with 46.6% average. Top scores: MATH level 5 (90.9%), OTIS Mock AIME 2024-2025 (77.8%), Lech Mazur Writing (73.5%).

Looking for similar performance at lower cost?
Qwen3 235B A22B Instruct 2507 scores 45.7 (99% as good) at $0.07/1M input · 76% cheaper
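The comparison figures above can be reproduced with simple ratios. The sketch below is only a consistency check under the assumption that "as good" means the ratio of average scores and "cheaper" means the relative difference in input price; the page does not state its formulas, and the function names are illustrative.

```python
def relative_score(candidate: float, baseline: float) -> float:
    """Candidate score as a percentage of the baseline score."""
    return candidate / baseline * 100


def percent_cheaper(candidate_price: float, baseline_price: float) -> float:
    """How much cheaper the candidate is, as a percentage of the baseline price."""
    return (baseline_price - candidate_price) / baseline_price * 100


# Figures taken from the page: average score and input price per 1M tokens.
grok_score, grok_input_price = 46.0, 0.30
qwen_score, qwen_input_price = 45.7, 0.07

print(f"{relative_score(qwen_score, grok_score):.1f}% of the score")
print(f"{percent_cheaper(qwen_input_price, grok_input_price):.1f}% cheaper on input")
```

Under these assumptions the ratios come out slightly above the whole-number figures shown on the page, which suggests the page rounds down. Only input price is compared here, matching the page's own statement.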

Capabilities

coding: 45.9 (#79 globally)
reasoning: 8.5 (#146 globally)
math: 58.2 (#52 globally)
knowledge: 57.4 (#62 globally)

Benchmark Scores

Tested on 11 benchmarks · Ranked across 4 categories

[Figure: score distribution across all 233 models on a 0 to 100 axis, with a marker at this model's position.]

coding

Multi-language code editing from Aider. Tests editing ability across Python, JavaScript, TypeScript, Java, C++, Go, Rust, and more.
Score: 49.3

Unusual and adversarial machine learning challenges. Tests robustness of reasoning about edge cases in ML systems.
Score: 42.6

reasoning

Abstraction and Reasoning Corpus. Tests fluid intelligence through novel visual pattern recognition puzzles. Core measure of general intelligence.
Score: 16.5

ARC-AGI 2, harder sequel to ARC. More complex abstract reasoning patterns that test generalization ability beyond training data.
Score: 0.4

math

Competition-level math from AMC, AIME, and olympiad problems. Level 5 is the hardest tier, requiring creative problem-solving.
Score: 90.9

OTIS Mock AIME 2024-2025
Mock AIME (American Invitational Mathematics Exam) problems from OTIS. Tests mathematical competition performance.
Score: 77.8

FrontierMath-2025-02-28-Private
Original research-level math problems created by professional mathematicians. Problems are unpublished and cannot be memorized.
Score: 5.9
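The category scores under Capabilities appear consistent with a plain average of the benchmark scores listed for coding, reasoning, and math. The page does not say how category scores are computed, so the sketch below is only a check of that assumption against the listed numbers; the dictionary and function names are illustrative.

```python
from statistics import mean

# Per-benchmark scores listed on the page, grouped by category.
benchmark_scores = {
    "coding": [49.3, 42.6],
    "reasoning": [16.5, 0.4],
    "math": [90.9, 77.8, 5.9],
}

# Category scores listed under Capabilities.
listed_category_scores = {"coding": 45.9, "reasoning": 8.5, "math": 58.2}


def category_mean(scores: list[float]) -> float:
    """Unweighted mean of the benchmark scores in one category (an assumption)."""
    return mean(scores)


for category, scores in benchmark_scores.items():
    computed = category_mean(scores)
    listed = listed_category_scores[category]
    print(f"{category}: computed {computed:.2f}, listed {listed}")
```

Each computed mean lands within 0.1 of the listed category score. The knowledge category is left out because none of its benchmark scores appear in this section.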

Score legend: Excellent (85+) · Good (70-85) · Average (50-70) · Below (<50)

Model Family · xAI Grok 3

$3.00/M input · 131K context · 13 benchmarks

Grok 3 Beta (Apr 2025)
$3.00/M input · 131K context · 6 benchmarks

Grok 3 Mini (Jun 2025)
$0.30/M input (-2.70) · 131K context · 11 benchmarks

Grok 3 Mini Beta (Apr 2025)
$0.30/M input · 131K context · 7 benchmarks

Recently Happened

Grok 3 Mini marked as deprecated by xAI (Mar 12, 2026)

Similar Models

Qwen3 235B A22B Instruct 2507

BenchGecko API

grok-3-mini

Specifications

Type: text
Context: 131K tokens (~66 books)
Released: Jun 2025
License: Proprietary
Status: Active
Cost / Message: ~$0.001
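A per-message cost follows from the per-token prices once a message size is chosen. The page does not state the message size behind its ~$0.001 figure, so the token counts below (1,000 input and 1,000 output) are a hypothetical example, and `message_cost` is an illustrative helper rather than part of any API.

```python
# Prices listed on the page, in USD per 1M tokens.
INPUT_PRICE_PER_M = 0.30
OUTPUT_PRICE_PER_M = 0.50


def message_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated cost in USD of one request/response pair."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Hypothetical message size; not taken from the page.
print(f"${message_cost(1_000, 1_000):.4f}")
```

With that assumed size the estimate is a little under a tenth of a cent, the same order of magnitude as the listed figure. Thinking tokens, if billed as output, would raise the cost.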

Available On

xAI: $0.30

Categories

coding reasoning math knowledge

Frequently Asked Questions

Grok 3 Mini is a proprietary text AI model by xAI, released in June 2025. It has an average benchmark score of 46.0. Context window: 131K tokens.

Related Models

Claude Opus 4 (Anthropic) · Qwen3 235B A22B Instruct 2507 (Alibaba Qwen) · o1-preview (OpenAI) · Gemini 2.0 Flash Lite (Google DeepMind) · Gemini 2.0 Flash (Google DeepMind)

Benchmarks

MATH level 5 · OTIS Mock AIME 2024-2025 · Lech Mazur Writing · GPQA diamond · Fiction.LiveBench
