Inference
What models are used for after training, and from then on.
TL;DR
Running a trained model to generate predictions. Every API call to a hosted model is inference.
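To make the definition concrete, here is a minimal sketch of inference in plain Python. The weights, the bias, and the `predict` helper are hypothetical stand-ins for whatever a real training run would produce; the point is only that inference applies fixed parameters to new input and never updates them.

```python
import math

# Hypothetical parameters, standing in for the output of a finished training run.
WEIGHTS = [0.8, -0.4]
BIAS = 0.1


def predict(features):
    """Inference: apply fixed, already-trained parameters to a new input."""
    # Weighted sum of the input features plus the bias term.
    z = sum(w * x for w, x in zip(WEIGHTS, features)) + BIAS
    # Logistic function squashes the result into a score between 0 and 1.
    return 1.0 / (1.0 + math.exp(-z))


score = predict([1.0, 2.0])
print(f"{score:.3f}")
```

A hosted model does the same thing at a much larger scale: each request runs the stored parameters forward over the caller's input and returns the result.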
“Inference optimization is where the next 10× cost reduction lives. Every frontier lab is racing to ship the best serving stack.”