
phi-3-medium 14B

by Microsoft · Released Jan 2024

Open Source
Average score: 69.0
Rank: #41 (better than 82% of all models)
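The "better than 82%" figure follows from the rank. A minimal sketch, assuming the site computes it as the fraction of the 233 ranked models sitting below rank #41:

```python
# Derive the "better than X%" figure from a model's leaderboard rank.
# Rank (#41) and model count (233) come from this page; the formula
# itself is an assumption about how the site computes the percentage.
def better_than_pct(rank: int, total: int) -> int:
    """Share of models ranked below this one, as a whole percentage."""
    return round((total - rank) / total * 100)

print(better_than_pct(41, 233))  # → 82
```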
Context: N/A
Input $/1M: TBD
Output $/1M: TBD
Type: text
License: Open Source
Benchmarks: 10 tested
About

Tested on 10 benchmarks with 58.6% average. Top scores: ARC AI2 (88.8%), OpenBookQA (83.2%), HellaSwag (76.5%).
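As a rough sanity check, the unweighted mean of the five scores listed on this page can be computed directly. Whether the site's 58.6% figure is a simple arithmetic mean over all 10 benchmarks is an assumption; only 5 of the 10 are shown here, and their mean comes out higher:

```python
# Unweighted mean of the benchmark scores visible on this page.
# Assumption: the site's average is a simple arithmetic mean; the
# remaining 5 of the 10 tested benchmarks are not shown here, so
# these scores alone do not reproduce the quoted 58.6% average.
scores = {
    "BBH": 75.2,
    "MATH level 5": 17.6,
    "ARC AI2": 88.8,
    "OpenBookQA": 83.2,
    "HellaSwag": 76.5,
}
avg = sum(scores.values()) / len(scores)
print(round(avg, 2))  # → 68.26
```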

Capabilities
reasoning: 75.2 (#16 globally)
math: 17.6 (#174 globally)
knowledge: 61.7 (#38 globally)
Benchmark Scores
Tested on 10 benchmarks · Ranked across 3 categories
[Score distribution chart: this model's average placed against all 233 models on a 0-100 axis]
BBH: 75.2
BIG-Bench Hard. 23 challenging tasks from BIG-Bench where prior language models fell below average human performance.

MATH level 5: 17.6
Competition-level math from AMC, AIME, and olympiad problems. Level 5 is the hardest tier, requiring creative problem-solving.

ARC AI2: 88.8
AI2 Reasoning Challenge. Grade-school science questions requiring multi-step reasoning. Easy and Challenge sets test different difficulty levels.

OpenBookQA: 83.2
Elementary science questions with access to a small book of core science facts. Tests reasoning beyond memorization.

HellaSwag: 76.5
Sentence completion requiring commonsense reasoning about physical and social situations. Tests real-world understanding.
Legend: Excellent (85+) · Good (70-85) · Average (50-70) · Below (<50)
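The rating tiers above can be sketched as a small classifier. The thresholds follow the legend; the exact boundary behavior (e.g. whether a score of exactly 85 counts as "Excellent") is an assumption:

```python
# Map a benchmark score to this page's rating tiers.
# Thresholds follow the page legend: Excellent (85+), Good (70-85),
# Average (50-70), Below (<50). Boundary handling is an assumption.
def tier(score: float) -> str:
    if score >= 85:
        return "Excellent"
    if score >= 70:
        return "Good"
    if score >= 50:
        return "Average"
    return "Below"

# This model's scores span all four tiers:
print([tier(s) for s in (88.8, 75.2, 61.7, 17.6)])
# → ['Excellent', 'Good', 'Average', 'Below']
```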
Links
Documentation
Community
BenchGecko API (model ID: phi-3-medium-14b)
Specifications
  • Type: text
  • Context: N/A
  • Released: Jan 2024
  • License: Open Source
  • Status: benchmark-only
Available On
Microsoft (pricing: TBD)
phi-3-medium 14B is an open-source text AI model by Microsoft, released in January 2024. It has an average benchmark score of 69.0.