GPT-3.5 Turbo is OpenAI's fastest model. It can understand and generate natural language or code, and is optimized for chat and traditional completion tasks. Its training data extends to September 2021.
Tested on 13 benchmarks, with an average score of 45.8%. Top scores: TriviaQA (85.8%), ARC AI2 (83.2%), OpenBookQA (81.3%).
MiniMax M2.5 scores 61.7 (100% as good) at $0.15/1M input · 85% cheaper
Code editing benchmark from the Aider project. Measures ability to apply targeted code changes while maintaining correctness and style.
Unusual and adversarial machine learning challenges. Tests robustness of reasoning about edge cases in ML systems.
BIG-Bench Hard. 23 challenging tasks from BIG-Bench where prior language models fell below average human performance.
Grade school math word problems. 8,500 problems testing multi-step arithmetic reasoning. A foundational math benchmark.
Competition-level math from AMC, AIME, and olympiad problems. Level 5 is the hardest tier, requiring creative problem-solving.
- Type: text
- Context: 4K tokens (roughly 3,000 words of English text)
- Released: Jan 2024
- License: Proprietary
- Status: Active
- Cost / Message: ~$0.004
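
The per-message cost above is an estimate, and the page does not say which token counts or prices it assumes. As a rough sketch of how such a figure can be computed, the Python snippet below multiplies token counts by per-million-token prices. The function name `message_cost_usd`, the prices ($0.50 input and $1.50 output per 1M tokens), and the 2,000-token message sizes are all illustrative assumptions, not values taken from this page.

```python
def message_cost_usd(input_tokens, output_tokens,
                     input_price_per_1m, output_price_per_1m):
    """Estimate the cost of one message in USD.

    Prices are expressed in dollars per one million tokens.
    """
    total = (input_tokens * input_price_per_1m
             + output_tokens * output_price_per_1m)
    return total / 1_000_000


# Hypothetical prices and message sizes, chosen only for illustration.
ASSUMED_INPUT_PRICE = 0.50   # USD per 1M input tokens (assumption)
ASSUMED_OUTPUT_PRICE = 1.50  # USD per 1M output tokens (assumption)

cost = message_cost_usd(2_000, 2_000, ASSUMED_INPUT_PRICE, ASSUMED_OUTPUT_PRICE)
print(f"${cost:.4f}")  # prints $0.0040
```

Under these assumed inputs the estimate happens to land near the listed ~$0.004, but other combinations of message length and price would give the same total, so this should not be read as the page's actual method.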