
Phi 2

by Microsoft · Released Dec 2023

  • Average score: 31.0
  • Rank: #173
  • Better than 25% of all models
  • Context: N/A
  • Input $/1M: TBD
  • Output $/1M: TBD
  • Type: text-generation
  • License: Open Source
  • Benchmarks: 14 tested
About

Phi 2 is a text-generation model from Microsoft. It has 1,759K downloads on Hugging Face.

It was tested on 14 benchmarks, averaging 30.2%. Top scores: ARC AI2 (67.9%), OpenBookQA (64.8%), and BBH (45.9%).

Capabilities
  • reasoning: 29.9 (#81 globally)
  • math: 3.0 (#200 globally)
  • knowledge: 33.9 (#165 globally)
  • language: 27.4 (#128 globally)
  • general: 28.0 (#33 globally)
Benchmark Scores
Tested on 14 benchmarks · Ranked across 5 categories
[Figure: Score Distribution (all 231 models), on a 0 to 100 axis, with a marker at this model's position]
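
The "Better than 25% of all models" figure is consistent with the rank (#173) and the model count (231) reported on this page. The following is a minimal sketch of that arithmetic, assuming the percentage is simply the share of models ranked below this one. The site does not state its actual formula, and the function name `percent_outranked` is hypothetical.

```python
def percent_outranked(rank: int, total: int) -> float:
    """Percentage of models ranked strictly below `rank` out of `total`.

    Assumption: rank 1 is the best model and there are no ties.
    """
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return 100.0 * (total - rank) / total


# Values taken from this page: Rank #173 among 231 models.
phi2_percent = percent_outranked(rank=173, total=231)
print(f"Better than {phi2_percent:.1f}% of all models")  # prints "Better than 25.1% of all models"
```

Under this assumption, 58 of the 231 models rank below Phi 2, which rounds to the 25% shown above.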
BBH

BIG-Bench Hard. 23 challenging tasks from BIG-Bench where prior language models fell below average human performance.

45.9
MUSR

MuSR (Multistep Soft Reasoning), as evaluated by Hugging Face. Tests multi-hop reasoning that requires chaining multiple facts together.

13.8
MATH Level 5

Hugging Face evaluation on the Level 5 (hardest) problems of the MATH dataset. Competition math requiring advanced multi-step reasoning.

3.0
ARC AI2

AI2 Reasoning Challenge. Grade-school science questions requiring multi-step reasoning. Easy and Challenge sets test different difficulty levels.

67.9
OpenBookQA

Elementary science questions with access to a small book of core science facts. Tests reasoning beyond memorization.

64.8
TriviaQA

Trivia questions sourced from trivia enthusiasts and quiz websites. Tests breadth of general knowledge.

45.2
Score bands: Excellent (85+), Good (70-85), Average (50-70), Below (<50)
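
The score bands above can be applied to the six benchmark scores listed on this page. The following is a minimal sketch, assuming each band includes its lower bound (so exactly 70 counts as Good and exactly 85 as Excellent). The legend does not say how shared boundaries are handled, and the function name `score_band` is hypothetical.

```python
def score_band(score: float) -> str:
    """Map a 0-100 benchmark score to the legend's band name.

    Assumption: lower bounds are inclusive (85 -> Excellent, 70 -> Good, 50 -> Average).
    """
    if score >= 85:
        return "Excellent"
    if score >= 70:
        return "Good"
    if score >= 50:
        return "Average"
    return "Below"


# Scores copied from the benchmark list on this page.
phi2_scores = {
    "BBH": 45.9,
    "MUSR": 13.8,
    "MATH Level 5": 3.0,
    "ARC AI2": 67.9,
    "OpenBookQA": 64.8,
    "TriviaQA": 45.2,
}

bands = {name: score_band(score) for name, score in phi2_scores.items()}
for name, band in bands.items():
    print(f"{name}: {phi2_scores[name]} ({band})")
```

With these thresholds, ARC AI2 and OpenBookQA fall in the Average band and the other four listed benchmarks fall in Below.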
BenchGecko API
microsoft-phi-2
Specifications
  • Type: text-generation
  • Context: N/A
  • Released: Dec 2023
  • License: Open Source
  • Status: Active
Available On
Microsoft: TBD
Phi 2 is an open-source text-generation AI model by Microsoft, released in December 2023. It has an average benchmark score of 31.0.