
Hermes 3 70B Instruct

by nousresearch · Released Aug 2024

Open Source
Average score: 73.3
Rank: #33
Better than 86% of all models
Context: 131K tokens (~66 books)
Input $/1M: $0.30
Output $/1M: $0.30
Type: text
License: Open Source
Benchmarks: 6 tested
Data updated today
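
The "Better than 86% of all models" figure is consistent with rank #33 if the ranking covers the 233 models mentioned in the score distribution on this page. That population size is an assumption, since the page does not say how the percentage is derived. A minimal sketch of the arithmetic, with a hypothetical helper name:

```python
def percent_outranked(rank: int, total_models: int) -> float:
    """Share of models (in percent) that sit below the given 1-based rank."""
    if not 1 <= rank <= total_models:
        raise ValueError("rank must be between 1 and total_models")
    return 100.0 * (total_models - rank) / total_models


# Rank #33 comes from this page; 233 is ASSUMED to be the ranked population.
share = percent_outranked(rank=33, total_models=233)
print(f"{share:.1f}%")  # prints 85.8%, which rounds to the 86% shown
```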
About

Hermes 3 is a generalist language model with many improvements over [Hermes 2](/models/nousresearch/nous-hermes-2-mistral-7b-dpo), including advanced agentic capabilities, much better roleplaying, reasoning, multi-turn conversation, long context coherence, and improvements across the...

Tested on 6 benchmarks, with a 38.5% average. Top scores: IFEval (76.6%), BBH (HuggingFace) (53.8%), MMLU-PRO (41.4%).
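
The 38.5% figure matches the unweighted mean of the six scores that appear on this page (IFEval and BBH from the line above, the other four from the benchmark list). Whether the site computes it exactly this way is an assumption. A short check:

```python
from statistics import mean

# Raw benchmark scores in percent, copied from this page.
scores = {
    "IFEval": 76.6,
    "BBH": 53.8,
    "MMLU-PRO": 41.4,
    "MuSR": 23.4,
    "MATH Level 5": 21.0,
    "GPQA": 14.9,
}

# Unweighted mean across the six benchmarks (assumed aggregation method).
average = mean(scores.values())
print(f"{average:.1f}%")  # prints 38.5%
```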

Looking for similar performance at lower cost?
gpt-oss-120b (free) scores 74.2 (101% as good) at $0.00/1M input · 100% cheaper
Capabilities
reasoning: 23.4 (#98 globally)
math: 21.0 (#157 globally)
knowledge: 28.1 (#180 globally)
language: 76.6 (#55 globally)
general: 53.8 (#7 globally)
Benchmark Scores
Tested on 6 benchmarks · Ranked across 5 categories
[Chart: score distribution across all 233 models, on a 0 to 100 axis, with this model's position marked]
MuSR

HuggingFace MuSR (Multistep Soft Reasoning). Tests multi-hop reasoning that requires chaining multiple facts together.

23.4
MATH Level 5

HuggingFace evaluation of MATH Level 5 problems. Competition math requiring advanced reasoning and proof construction.

21.0
MMLU-PRO

HuggingFace MMLU-Pro. Harder version of MMLU with 10 answer choices instead of 4 and more challenging questions.

41.4
GPQA

HuggingFace evaluation of GPQA (Graduate-Level Google-Proof Q&A). PhD-level science questions that cannot be easily searched.

14.9
Score bands: Excellent (85+), Good (70-85), Average (50-70), Below (<50)
BenchGecko API: hermes-3-llama-3-1-70b
Specifications
  • Type: text
  • Context: 131K tokens (~66 books)
  • Released: Aug 2024
  • License: Open Source
  • Status: Active
  • Cost / Message: ~$0.001
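
The per-message cost follows from the listed prices of $0.30 per million input tokens and $0.30 per million output tokens. The page does not state what message size it assumes, so the token counts below are hypothetical values chosen only to show how a figure near $0.001 can arise:

```python
PRICE_PER_MILLION_INPUT = 0.30   # USD per 1M input tokens, from this page
PRICE_PER_MILLION_OUTPUT = 0.30  # USD per 1M output tokens, from this page


def message_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request, given its input and output token counts."""
    total = (input_tokens * PRICE_PER_MILLION_INPUT
             + output_tokens * PRICE_PER_MILLION_OUTPUT)
    return total / 1_000_000


# HYPOTHETICAL message size: 2,500 prompt tokens and 800 completion tokens.
cost = message_cost(input_tokens=2_500, output_tokens=800)
print(f"${cost:.4f}")  # prints $0.0010
```

Because input and output are priced the same here, only the total token count matters: roughly 3,300 tokens per message costs about a tenth of a cent.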
Available On
nousresearch: $0.30
Hermes 3 70B Instruct is an open-source text AI model by nousresearch, released in August 2024. It has an average benchmark score of 73.3. Context window: 131K tokens.