Category · 34 terms · BenchGecko glossary

Concepts

MoE, RAG, reasoning, quantization, tokens, agents.

Most-read in Concepts
Mixture of Experts (MoE)

A model architecture that routes each token to a subset of specialized experts, so only a fraction of parameters activate per forward pass.
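
As a rough illustration of the routing idea, here is a minimal sketch of top-k expert selection. The function names, the scalar toy experts, and the choice of k=2 are assumptions made for this example, not any particular model's implementation.

```python
import math

def top_k_route(router_logits, k=2):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    # Pick the k experts with the largest router scores (ties keep index order).
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    # Softmax over only the selected experts so their weights sum to 1.
    peak = max(router_logits[i] for i in top)
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]

def moe_forward(x, experts, router_logits, k=2):
    """Weighted sum of the selected experts' outputs; the others are never called."""
    return sum(w * experts[i](x) for i, w in top_k_route(router_logits, k))

# Four hypothetical scalar "experts" standing in for feed-forward blocks.
experts = [lambda x: x + 1, lambda x: 2 * x, lambda x: x * x, lambda x: -x]
logits = [0.1, 2.0, 2.0, -1.0]   # assumed router scores for one token
routes = top_k_route(logits, k=2)
output = moe_forward(3.0, experts, logits, k=2)
```

Only two of the four experts run for this token, which is the sense in which a fraction of parameters activates per forward pass.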

Reasoning Model

An LLM trained to spend extra compute at inference thinking before it answers, trading latency and cost for accuracy on hard tasks.

RAG

Retrieval-Augmented Generation · grounds model responses in external data retrieved at query time.

MCP

Model Context Protocol · an open standard for connecting AI models to external tools, data, and services.

Subagent

A delegated sub-task agent spawned by a parent orchestrator · each runs in isolated context with specialized tools.

Sliding Window Attention

Attention pattern where each token attends only to a local window of nearby tokens · used in models such as Mistral and Gemma for long-context efficiency.
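
The local-window pattern can be pictured as a boolean mask. The sketch below assumes a causal (decoder-style) window that includes the current token; `sliding_window_mask` and its parameters are illustrative names, not taken from any library.

```python
def sliding_window_mask(seq_len, window):
    """mask[i][j] is True when query position i may attend to key position j.

    Assumes a causal window: position i sees itself and the window - 1
    positions before it, and nothing in the future.
    """
    return [[0 <= i - j < window for j in range(seq_len)]
            for i in range(seq_len)]

mask = sliding_window_mask(seq_len=5, window=3)
for row in mask:
    # "x" marks an allowed query/key pair, "." a masked one.
    print("".join("x" if allowed else "." for allowed in row))
```

Each row has at most `window` allowed positions, so attention cost grows linearly with sequence length rather than quadratically.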

ALiBi

Attention with Linear Biases · position encoding via attention-score penalties instead of positional embeddings.
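
A small sketch of the attention-score penalty, assuming the commonly described geometric head slopes for a power-of-two head count. The helper names are hypothetical, and the bias is meant to be added to raw attention scores before the softmax.

```python
def alibi_slopes(num_heads):
    """Per-head slopes 2^(-8/n), 2^(-16/n), ... for n heads (n a power of two)."""
    return [2.0 ** (-8.0 * (h + 1) / num_heads) for h in range(num_heads)]

def alibi_bias(seq_len, slope):
    """bias[i][j] = -slope * (i - j) for keys at or before the query position.

    Future positions get 0.0 here on the assumption that a separate
    causal mask removes them.
    """
    return [[-slope * (i - j) if j <= i else 0.0 for j in range(seq_len)]
            for i in range(seq_len)]

slopes = alibi_slopes(8)           # one slope per attention head
bias = alibi_bias(4, slopes[0])    # penalty matrix for the first head
```

The penalty grows linearly with the distance between query and key, so more distant tokens are down-weighted without any learned positional embedding.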

Mamba

State-space sequence model · linear-time alternative to transformers · foundation of the Mamba-2, Jamba, and Zamba models.

Hyena

Implicit-convolution alternative to attention · linear-time long-range modeling · explored by Stanford and Together.

KV Cache Compression

Techniques to shrink the key-value cache during inference · sliding windows, quantization, eviction, sparsity.

Fine-tuning

Further training a pre-trained model on domain data to adapt behavior without retraining from scratch.

Chain of Thought (CoT)

A prompting technique (and trained behavior) where the model shows step-by-step reasoning before the final answer.

Everything in this category
The Concepts category covers 34 terms, including MoE, RAG, reasoning, quantization, tokens, and agents. Every term has four depth levels (TL;DR, Basic, Deep, Expert), role-based takeaways, FAQs, and live BenchGecko data where available.