A 12B-parameter model with a 128k-token context length built by Mistral in collaboration with NVIDIA. The model is multilingual, supporting English, French, German, Spanish, Italian, Portuguese, Chinese, Japanese, and other languages.
Tested on 5 benchmarks, averaging 37.2%. Top scores: GSM8K (84.2%), PIQA (67.0%), Balrog (17.6%).
Grade school math word problems. 8,500 problems testing multi-step arithmetic reasoning. A foundational math benchmark.
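The multi-step arithmetic that GSM8K targets can be illustrated with a small sketch. The problem below is a hypothetical GSM8K-style example, not taken from the actual dataset:

```python
# Hypothetical GSM8K-style word problem (illustrative, not from the dataset):
# "A baker makes 3 trays of 12 muffins each, then sells 8 muffins.
#  How many muffins are left?"

def solve():
    trays = 3
    muffins_per_tray = 12
    sold = 8
    baked = trays * muffins_per_tray   # step 1: 3 * 12 = 36
    remaining = baked - sold           # step 2: 36 - 8 = 28
    return remaining

print(solve())  # → 28
```

A model is scored on whether its final answer matches the reference; chaining two or three such steps correctly is what the benchmark measures.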
Competition-level math from AMC, AIME, and olympiad problems. Level 5 is the hardest tier, requiring creative problem-solving.
Physical Interaction: Question Answering. Tests understanding of everyday physical interactions and commonsense physics.
Benchmarking Agentic LLM and VLM Reasoning On Games. Tests strategic and logical reasoning through game scenarios.
Graduate-level science questions written by PhD experts. The Diamond subset contains the highest-quality questions, those that domain experts answer correctly but skilled non-experts do not, testing deep understanding.
- Type: text
- Context: 131K tokens (~66 books)
- Released: Jul 2024
- License: Open Source
- Status: Active
- Cost / Message: ~$0.000