Falcon-180B

by TII · Released Jan 2024

Open Source
Avg score: 49.4
Rank: #110
Better than 53% of all models
Context: N/A
Input $/1M: TBD
Output $/1M: TBD
Type: text
License: Open Source
Benchmarks: 17 tested
About

Tested on 17 benchmarks with 44.4% average. Top scores: HellaSwag (85.3%), TriviaQA (79.9%), LAMBADA (79.8%).

Capabilities
  • reasoning: 11.8 (#124 globally)
  • math: 28.6 (#136 globally)
  • knowledge: 56.3 (#72 globally)
  • language: 32.6 (#121 globally)
  • general: 21.9 (#39 globally)
Benchmark Scores
Tested on 17 benchmarks · Ranked across 5 categories
[Figure: Score Distribution (all 233 models), score axis from 0 to 100, with this model's position marked.]
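The "Better than 53% of all models" figure appears consistent with the rank (#110) and the model count (233) quoted on this page. The sketch below checks that arithmetic; the formula (share of models ranked below this one) is an assumption about how the site derives the number, not something the page states, and `share_outranked` is a hypothetical helper name.

```python
def share_outranked(rank: int, total: int) -> float:
    """Percentage of models ranked below the given rank (assumed formula)."""
    return 100 * (total - rank) / total


# Rank and model count as quoted on the page.
pct = share_outranked(rank=110, total=233)
print(f"Better than {round(pct)}% of all models")
```

Under this assumption the result rounds to the 53% shown above.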
BBH

BIG-Bench Hard. 23 challenging tasks from BIG-Bench where prior language models fell below average human performance.

16.1
MUSR

HuggingFace MuSR (Multi-Step Reasoning). Tests multi-hop reasoning requiring chaining multiple facts together.

7.5
GSM8K

Grade school math word problems. 8,500 problems testing multi-step arithmetic reasoning. A foundational math benchmark.

54.4
MATH Level 5

HuggingFace evaluation of MATH Level 5 problems. Competition math requiring advanced reasoning and proof construction.

2.8
HellaSwag

Sentence completion requiring commonsense reasoning about physical and social situations. Tests real-world understanding.

85.3
TriviaQA

Trivia questions sourced from trivia enthusiasts and quiz websites. Tests breadth of general knowledge.

79.9
LAMBADA

Language modeling benchmark testing ability to predict the last word of passages requiring long-range context understanding.

79.8
Excellent (85+) · Good (70-85) · Average (50-70) · Below (<50)
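As a rough illustration, the legend can be applied to the seven scores listed above. This is a minimal sketch: the legend does not say which band a score of exactly 70 or 85 belongs to, so the code assumes each boundary value falls in the higher band, and `band` is a hypothetical helper name.

```python
def band(score: float) -> str:
    """Map a 0-100 score to a legend band.

    Assumption: boundary values (50, 70, 85) belong to the higher band.
    """
    if score >= 85:
        return "Excellent"
    if score >= 70:
        return "Good"
    if score >= 50:
        return "Average"
    return "Below"


# Scores copied from the benchmark list on this page.
scores = {
    "BBH": 16.1,
    "MUSR": 7.5,
    "GSM8K": 54.4,
    "MATH Level 5": 2.8,
    "HellaSwag": 85.3,
    "TriviaQA": 79.9,
    "LAMBADA": 79.8,
}

bands = {name: band(value) for name, value in scores.items()}
for name, label in bands.items():
    print(f"{name}: {scores[name]} ({label})")
```

With these thresholds, HellaSwag lands in Excellent, TriviaQA and LAMBADA in Good, GSM8K in Average, and BBH, MUSR, and MATH Level 5 in Below.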
BenchGecko API: falcon-180b
Specifications
  • Type: text
  • Context: N/A
  • Released: Jan 2024
  • License: Open Source
  • Status: benchmark-only
Available On
  • TII: TBD
Falcon-180B is an open-source text AI model by TII, released in January 2024. It has an average benchmark score of 49.4.