All terms · A-Z
Every tracked term. Filter by letter below. Click a term for the 10-module detail page.
#
2 terms
A
32 terms
NVIDIA's ? TFLOPS AI accelerator with ? GB HBM.
An AI system that plans, uses tools, and takes multi-step actions to accomplish goals · not just a chat turn.
The research field focused on making AI systems pursue intended goals safely and reliably.
BenchGecko's 0-1000% composite score measuring AI sector valuation vs fundamentals.
The capital expenditure AI labs and hyperscalers spend on chips, datacenters, and training clusters.
model_provider · valuation $NaNB · international.
AI researcher.
Paul Gauthier's terminal-based git-aware AI pair programmer · commits with context of the whole repo.
Aider polyglot is a coding benchmark tracked by BenchGecko across every frontier and open-weight model.
model_provider · valuation $NaNB · international.
Attention with Linear Biases · position encoding via attention-score penalties instead of positional embeddings.
public_big_tech · valuation undisclosed · international.
public_big_tech · valuation undisclosed · international.
infrastructure · valuation $NaNB · international.
AMD's next-gen data-center GPU · late 2025 announcement · targets H200/Blackwell competition with HBM4.
AI researcher.
ANLI is a knowledge benchmark tracked by BenchGecko across every frontier and open-weight model.
model_provider · valuation $NaNB · international.
infrastructure · valuation $NaNB · international.
developer_tools · valuation $NaNB · international.
APEX-Agents is an agentic benchmark tracked by BenchGecko across every frontier and open-weight model.
infrastructure · valuation $NaNB · international.
Running the same open-weight model on multiple providers to exploit price differences · up to 30× savings.
ARC AI2 is a knowledge benchmark tracked by BenchGecko across every frontier and open-weight model.
ARC-AGI is a reasoning benchmark tracked by BenchGecko across every frontier and open-weight model.
ARC-AGI-2 is a reasoning benchmark tracked by BenchGecko across every frontier and open-weight model.
Annual Recurring Revenue · the standardized metric SaaS and AI companies report to investors.
AI researcher.
Huawei's ? TFLOPS AI accelerator with ? GB HBM.
infrastructure · valuation $NaNB · international.
AutoGen is Microsoft's open-source framework for building multi-agent AI applications.
AutoGPT is the open-source autonomous agent that kicked off the entire category.
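For the Attention with Linear Biases entry in this section, the "attention-score penalty" can be sketched as follows. This is a minimal illustration, assuming a causal model and a power-of-two head count; the function names `alibi_slopes` and `alibi_bias` are hypothetical, chosen only for this example.

```python
def alibi_slopes(num_heads):
    # Geometric slope schedule for power-of-two head counts:
    # head h (1-indexed) gets slope 2 ** (-8 * h / num_heads).
    return [2 ** (-8 * h / num_heads) for h in range(1, num_heads + 1)]


def alibi_bias(seq_len, slope):
    # bias[i][j] is added to the raw attention score of query i and key j.
    # The penalty grows linearly with the distance i - j.
    # Assumption: causal attention, so future keys (j > i) get -inf.
    return [
        [-slope * (i - j) if j <= i else float("-inf") for j in range(seq_len)]
        for i in range(seq_len)
    ]


slopes = alibi_slopes(8)          # one slope per head
bias = alibi_bias(4, slopes[0])   # bias matrix for the first head
```

No positional embedding is added to the token inputs in this scheme; position information enters only through the bias added to each score.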
B
11 terms
NVIDIA's ? TFLOPS AI accelerator with ? GB HBM.
NVIDIA's ? TFLOPS AI accelerator with ? GB HBM.
model_provider · valuation $NaNB · international.
Balrog is a knowledge benchmark tracked by BenchGecko across every frontier and open-weight model.
infrastructure · valuation $NaNB · international.
Submitting many prompts at once in a single batch job · providers discount 50% on delayed batch completion.
BBH is a reasoning benchmark tracked by BenchGecko across every frontier and open-weight model.
infrastructure · valuation $NaNB · international.
Net cash burned divided by net new ARR · < 1× is healthy · AI startups often run 2-5× pre-scale.
Monthly cash outflow · the speed at which an AI company spends investor capital before hitting breakeven.
Bring Your Own Key · you pay the model provider directly, the app just routes your requests.
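The burn-multiple entry in this section (net cash burned divided by net new ARR) reduces to a single division. A minimal sketch, with made-up figures and a hypothetical function name:

```python
def burn_multiple(net_burn, net_new_arr):
    # Net cash burned in a period divided by net new ARR added in the same period.
    if net_new_arr <= 0:
        raise ValueError("net new ARR must be positive for the ratio to be meaningful")
    return net_burn / net_new_arr


# Illustrative numbers only: $12M burned to add $4M of net new ARR.
example_multiple = burn_multiple(12_000_000, 4_000_000)
```

Under the thresholds quoted in the entry, a result below 1 would count as healthy, while this illustrative 3× falls in the 2-5× pre-scale range.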
C
29 terms
The fraction of input tokens served from provider-side prompt cache · directly impacts effective pricing.
Discounted input tokens for prompts the provider has already processed · 50-90% off on repeat prefix.
CadEval is a coding benchmark tracked by BenchGecko across every frontier and open-weight model.
infrastructure · valuation $NaNB · international.
A prompting technique (and trained behavior) where the model shows step-by-step reasoning before the final answer.
application · valuation $NaNB · international.
Crowdsourced head-to-head AI model comparison · humans vote on anonymous outputs and Elo ratings rank the models.
Chess Puzzles is a knowledge benchmark tracked by BenchGecko across every frontier and open-weight model.
Anthropic's foundation-model family · Opus / Sonnet / Haiku tiers plus Extended Thinking variants.
Claude Code is a command-line coding agent built by Anthropic that lives in your terminal and edits files, runs commands, and searches your codebase with direct Claude model access.
AI researcher.
Open-source VSCode coding agent · forked as Roo Code, built on user-supplied API keys.
OpenAI's open-source terminal coding agent · runs GPT models against your repo from the shell.
Sourcegraph's enterprise coding AI · deep codebase context from graph-aware search.
model_provider · valuation $NaNB · international.
The maximum number of tokens a model can process in one request · input prompt plus generated output.
Continue is an open-source IDE extension for building your own AI code assistant with custom models, context providers, and slash commands.
Open-source IDE-native coding assistant · inline autocomplete + chat across VSCode and JetBrains.
application · valuation $NaNB · international.
infrastructure · valuation $NaNB · international.
TSMC's next-gen advanced packaging · enables 12+ HBM stacks and multi-die GPUs · used on B300, MI400.
TSMC's previous-gen advanced packaging · used on H100 / A100 · being phased out for CoWoS-L.
CrewAI is a framework for orchestrating autonomous AI agents with roles, goals, and tools.
infrastructure · valuation $NaNB · international.
Cerebras appliance AI system.
CSQA2 is a knowledge benchmark tracked by BenchGecko across every frontier and open-weight model.
Cursor is a fork of VS Code rebuilt around an AI pair programmer.
Share of revenue from top customers · AI labs often get 40-60% from their top 10 vs a SaaS median below 20%.
Cybench is a coding benchmark tracked by BenchGecko across every frontier and open-weight model.
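The two cache entries at the top of this section (cache hit rate and discounted cached input tokens) combine into one blended input price. The sketch below assumes a single flat discount on cached tokens and no separate cache-write fee; real provider price sheets may differ, and the function name is hypothetical.

```python
def effective_input_price(list_price_per_m, cache_hit_rate, cache_discount):
    # list_price_per_m: undiscounted price per million input tokens
    # cache_hit_rate:   fraction of input tokens served from the prompt cache (0..1)
    # cache_discount:   fractional discount on cached tokens (0.5 means 50% off)
    if not (0.0 <= cache_hit_rate <= 1.0 and 0.0 <= cache_discount <= 1.0):
        raise ValueError("hit rate and discount must both be between 0 and 1")
    cached_part = cache_hit_rate * list_price_per_m * (1 - cache_discount)
    uncached_part = (1 - cache_hit_rate) * list_price_per_m
    return cached_part + uncached_part


# Illustrative numbers only: $2.00 per million tokens, half the tokens cached at 50% off.
blended = effective_input_price(2.00, 0.5, 0.5)
```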
D
14 terms
AI researcher.
The requirement that data be stored and processed in a specific geographic region.
infrastructure · valuation $NaNB · international.
? GB/s memory at ? GB per stack · from major vendors.
Next-gen DRAM standard · 8.8-17.6 Gbps · used in servers + CPUs feeding GPU clusters · JEDEC finalized 2025.
DeepResearch Bench is a knowledge benchmark tracked by BenchGecko across every frontier and open-weight model.
Hangzhou-based lab behind V3 and R1 · the MoE + RL-reasoning recipe that crashed open-weight prices.
AI researcher.
Cognition AI's autonomous software engineer agent · browses, writes, and deploys code end-to-end.
NVIDIA node AI system.
NVIDIA rack AI system.
NVIDIA rack AI system.
Reduction in existing shareholder ownership % after new equity issuance · typical 15-25% per round.
Training a small model to mimic a large one, preserving most quality at a fraction of the size.
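The dilution entry in this section can be worked through with simple arithmetic. This is a minimal sketch that assumes a plain priced round with no option-pool top-up or convertible instruments; the function name and figures are illustrative.

```python
def post_round_ownership(pre_money, investment, existing_pct):
    # New investors receive investment / post-money of the company.
    post_money = pre_money + investment
    new_investor_pct = investment / post_money
    # Every existing holder is scaled down by the same factor.
    diluted_pct = existing_pct * (1 - new_investor_pct)
    return diluted_pct, new_investor_pct


# Illustrative numbers only: $80M pre-money, $20M raised, a holder who owned 50%.
founder_after, investor_stake = post_round_ownership(80_000_000, 20_000_000, 0.50)
```

Here the round sells 20% of the company, inside the 15-25% range the entry calls typical, and the 50% holder ends at 40%.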
E
5 terms
application · valuation $NaNB · international.
AI researcher.
A dense vector representation of text, image, or audio in high-dimensional space · the backbone of RAG and search.
Employee Stock Ownership Plan · pool of shares set aside for employees · typical 10-20% of cap table.
EU regulation classifying AI systems by risk level and imposing compliance obligations on developers and deployers.
F
9 terms
US federal government authorization for cloud and AI services handling agency data.
Fiction.
Further training a pre-trained model on domain data to adapt behavior without retraining from scratch.
infrastructure · valuation $NaNB · international.
The no-cost tier of an AI API · usually rate-limited, quota-capped, or branded-output-only.
FrontierMath-2025-02-28-Private is a math benchmark tracked by BenchGecko across every frontier and open-weight model.
FrontierMath-Tier-4-2025-07-01-Private is a math benchmark tracked by BenchGecko across every frontier and open-weight model.
Tool-use / function-call responses count as output tokens · structured JSON tool calls are billed normally.
The API mechanism letting models emit structured JSON to invoke external functions.
G
25 terms
Intel's ? TFLOPS AI accelerator with ? GB HBM.
NVIDIA's ? TFLOPS AI accelerator with ? GB HBM.
NVIDIA's ? TFLOPS AI accelerator with ? GB HBM.
? GB/s memory at ? GB per stack · from major vendors.
Next-gen graphics DRAM · 32-40 Gbps · powers consumer + pro GPUs like RTX 5090, RTX 6000 Ada next-gen.
EU data protection regulation · imposes strict consent, residency, and rights requirements on any service processing EU data.
Google DeepMind's foundation-model family · multimodal from day one, long context leader.
GeoBench is a knowledge benchmark tracked by BenchGecko across every frontier and open-weight model.
AI researcher.
GitHub Copilot is the AI pair programmer from GitHub and Microsoft.
application · valuation $NaNB · international.
United States chip foundry · advanced nodes · $7.4B revenue.
Gross Merchandise Value · total dollar value of transactions · used for marketplace AI products.
model_provider · valuation $NaNB · international.
Block's open-source local coding agent · MCP-first, offline-capable, built by the company behind Square.
GPQA diamond is a knowledge benchmark tracked by BenchGecko across every frontier and open-weight model.
OpenAI's Generative Pre-trained Transformer lineage · GPT-1 through GPT-5 and onward.
GPT Engineer generates an entire codebase from a natural language spec.
infrastructure · valuation $NaNB · international.
Revenue minus inference + training + data costs · AI labs at 50-70%, vs pure SaaS at 80-90%.
Tethering AI responses to verified source material · RAG, citations, tool calls, retrieval.
GSM8K is a math benchmark tracked by BenchGecko across every frontier and open-weight model.
GSO-Bench is a coding benchmark tracked by BenchGecko across every frontier and open-weight model.
Runtime filters and policies enforced around model input and output · the safety layer outside the model itself.
AI researcher.
H
17 terms
NVIDIA's ? TFLOPS AI accelerator with ? GB HBM.
NVIDIA's ? TFLOPS AI accelerator with ? GB HBM.
When an AI generates plausible-sounding but factually incorrect or fabricated content.
AI researcher.
application · valuation $NaNB · international.
? GB/s memory at ? GB per stack · from major vendors.
? GB/s memory at ? GB per stack · from major vendors.
? GB/s memory at ? GB per stack · from major vendors.
? GB/s memory at ? GB per stack · from major vendors.
HellaSwag is a knowledge benchmark tracked by BenchGecko across every frontier and open-weight model.
NVIDIA node AI system.
US law governing the handling of Protected Health Information (PHI) by healthcare providers and their vendors.
HLE is a knowledge benchmark tracked by BenchGecko across every frontier and open-weight model.
developer_tools · valuation $NaNB · international.
A 164-problem Python benchmark where the model writes a function from its docstring and passes unit tests.
Wafer-to-wafer bonding technique that skips bumps · enables 3D stacking for HBM4 and advanced packaging.
Implicit-convolution alternative to attention · linear-time long-range modeling · explored by Stanford + Together.
I
8 terms
Running a trained model to generate outputs from new inputs · what happens every time you call an AI API.
AWS's third-gen inference chip · optimized for serving LLMs at scale with HBM3 and NeuronCore v3.
model_provider · valuation $NaNB · international.
Tokens you send to the model · priced separately from output tokens, usually cheaper.
infrastructure · valuation $NaNB · international.
United States chip foundry · advanced nodes · $0.9B revenue.
The network of capital flowing into AI · which funds + corporates dominate which tier of deals.
infrastructure · valuation $NaNB · international.
J
3 terms
K
1 term
L
12 terms
developer_tools · valuation $NaNB · international.
LAMBADA is a knowledge benchmark tracked by BenchGecko across every frontier and open-weight model.
infrastructure · valuation $NaNB · international.
developer_tools · valuation $NaNB · international.
LangGraph is LangChain's framework for building stateful, multi-actor agents as graphs.
Time to first token · how fast the AI starts responding. Critical for interactive UX.
Lech Mazur Writing is a knowledge benchmark tracked by BenchGecko across every frontier and open-weight model.
A contamination-resistant benchmark that refreshes tasks monthly to prevent models from memorizing answers.
Meta's open-weight foundation-model family · the dominant open-source LLM lineage.
? GB/s memory at ? GB per stack · from major vendors.
Groq's ? TFLOPS AI accelerator with ? GB HBM.
Lifetime Value divided by Customer Acquisition Cost · >3× is healthy · AI consumer apps tracking 2-4×.
M
29 terms
Microsoft's ? TFLOPS AI accelerator with ? GB HBM.
Microsoft rack AI system.
State-space sequence model · linear-time alternative to transformers · foundation of Mamba-2, Jamba, Zamba models.
Butterfly Effect Inc's autonomous general-purpose agent · went viral March 2025 for multi-step research and web automation.
infrastructure · valuation $NaNB · international.
MATH level 5 is a math benchmark tracked by BenchGecko across every frontier and open-weight model.
Model Context Protocol · an open standard for connecting AI models to external tools, data, and services.
model_provider · valuation $NaNB · international.
public_big_tech · valuation undisclosed · international.
MetaGPT models a software company with product managers, architects, engineers and QAs.
AMD's ? TFLOPS AI accelerator with ? GB HBM.
AMD's ? TFLOPS AI accelerator with ? GB HBM.
AMD node AI system.
AMD's ? TFLOPS AI accelerator with ? GB HBM.
infrastructure · valuation $NaNB · international.
public_big_tech · valuation undisclosed · international.
Microsoft's second-gen custom AI chip · successor to Maia 100 · targets Azure OpenAI inference at scale.
application · valuation $NaNB · international.
model_provider · valuation $NaNB · international.
Paris-based AI lab · open-weight (Mistral, Mixtral) and closed (Large) tiers, EU data sovereignty focus.
model_provider · valuation $NaNB · international.
A model architecture that routes each token to a subset of specialized experts, so only a fraction of parameters activate per forward pass.
MMLU is a knowledge benchmark tracked by BenchGecko across every frontier and open-weight model.
A harder version of MMLU with 10 answer choices, filtered noise, and more reasoning-heavy questions.
infrastructure · valuation $NaNB · international.
model_provider · valuation $NaNB · international.
Meta's ? TFLOPS AI accelerator with ? GB HBM.
Per-token pricing differs by input type · text tokens vs image tokens vs audio seconds vs video frames.
Models that process and generate multiple types of data · text, image, audio, video.
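The mixture-of-experts entry in this section (each token routed to a subset of experts) can be illustrated with top-k routing. This is a toy sketch on scalar inputs: the experts are hypothetical stand-in functions, and renormalizing the softmax over only the selected experts is one common choice, assumed here rather than taken from the entry.

```python
import math


def top_k_route(router_logits, k):
    # Pick the k experts with the highest router logits.
    ranked = sorted(range(len(router_logits)), key=lambda e: router_logits[e], reverse=True)
    chosen = ranked[:k]
    # Softmax over the chosen experts only, so their weights sum to 1.
    peak = max(router_logits[e] for e in chosen)
    exps = {e: math.exp(router_logits[e] - peak) for e in chosen}
    total = sum(exps.values())
    return {e: exps[e] / total for e in chosen}


def moe_forward(x, experts, router_logits, k=2):
    # Only the chosen experts are evaluated; the rest stay inactive for this token.
    weights = top_k_route(router_logits, k)
    return sum(w * experts[e](x) for e, w in weights.items())


# Hypothetical toy experts standing in for feed-forward blocks.
toy_experts = [lambda x: x, lambda x: 2 * x, lambda x: -x, lambda x: x * x]
toy_logits = [0.1, 2.0, -1.0, 2.0]
routing = top_k_route(toy_logits, 2)
output = moe_forward(3.0, toy_experts, toy_logits, k=2)
```

With four experts and k=2, half of the expert parameters stay idle for this token, which is the "only a fraction of parameters activate" property the entry describes.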
N
2 terms
O
7 terms
model_provider · valuation $NaNB · international.
OpenBookQA is a knowledge benchmark tracked by BenchGecko across every frontier and open-weight model.
OpenHands (formerly OpenDevin) is a community-built autonomous software development agent that matches frontier closed-source agents on SWE-bench Verified.
OpenAI's computer-use agent · clicks, types, and browses websites on the user's behalf.
OSWorld is an agentic benchmark tracked by BenchGecko across every frontier and open-weight model.
OTIS Mock AIME 2024-2025 is a math benchmark tracked by BenchGecko across every frontier and open-weight model.
Tokens the model generates · priced 3-5× higher than input tokens due to sequential decoding.
P
9 terms
Price-to-Sales ratio · valuation divided by annual revenue. The AI sector premium metric.
A flat fee per API call regardless of tokens · used for image generation, search, and some agent products.
application · valuation $NaNB · international.
developer_tools · valuation $NaNB · international.
PIQA is a knowledge benchmark tracked by BenchGecko across every frontier and open-weight model.
Plandex is an open-source terminal AI coding agent optimized for large, multi-file tasks.
PostTrainBench is a knowledge benchmark tracked by BenchGecko across every frontier and open-weight model.
The first and most expensive phase of LLM creation · training on trillions of tokens of internet text.
The practice of crafting model inputs to elicit better outputs · cheaper than fine-tuning, more reliable than hope.
Q
3 terms
R
12 terms
Retrieval-Augmented Generation · grounds model responses in external data retrieved at query time.
Anti-dilution provision that adjusts conversion price if a later round is at lower valuation · full vs weighted-average.
An LLM trained to spend extra compute at inference thinking before it answers, trading latency and cost for accuracy on hard tasks.
Reasoning models bill thinking tokens separately from output · this can raise effective cost per query 2-10×.
model_provider · valuation $NaNB · international.
infrastructure · valuation $NaNB · international.
developer_tools · valuation $NaNB · international.
Replit's in-browser AI coding agent · turns a prompt into a deployed app without leaving the browser.
Pre-purchased dedicated throughput · flat hourly fee for guaranteed tokens-per-second from a specific model.
Annual revenue divided by headcount · leverage metric for AI labs where the ratio hits $3M+ vs SaaS at $300K.
VSCode coding agent forked from Cline · adds auto-mode, multi-tab orchestration, and Orchestrator agent pattern.
application · valuation $NaNB · international.
S
26 terms
AI researcher.
infrastructure · valuation $NaNB · international.
infrastructure · valuation $NaNB · international.
South Korea chip foundry · advanced nodes · $12.5B revenue.
developer_tools · valuation $NaNB · international.
The empirical relationship between compute, data, parameters, and model capability.
ScienceQA is a knowledge benchmark tracked by BenchGecko across every frontier and open-weight model.
AI researcher.
SimpleBench is a reasoning benchmark tracked by BenchGecko across every frontier and open-weight model.
SimpleQA Verified is a knowledge benchmark tracked by BenchGecko across every frontier and open-weight model.
infrastructure · valuation $NaNB · international.
Attention pattern where each token attends only to a local window · used in Mistral, Gemma for long-context efficiency.
China chip foundry · advanced nodes · $7.8B revenue.
Smolagents is Hugging Face's minimalist agent library.
developer_tools · valuation $NaNB · international.
An audit framework certifying that a provider handles customer data with defined security controls.
Preemptible GPU capacity at deep discount · AWS Spot, GCP Spot, Azure Spot · 60-90% off on-demand but can be evicted.
model_provider · valuation $NaNB · international.
model_provider · valuation $NaNB · international.
A delegated sub-task agent spawned by a parent orchestrator · each runs in isolated context with specialized tools.
SWE-agent is the research agent from the Princeton NLP group behind SWE-bench.
The real-world coding benchmark · AI resolves actual GitHub issues in open-source Python repos.
SWE-Bench verified is a coding benchmark tracked by BenchGecko across every frontier and open-weight model.
Sweep is an AI junior developer that turns bug reports and feature requests into code changes as pull requests, running entirely from GitHub issues.
AI researcher.
application · valuation $NaNB · international.
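The local-window attention entry in this section (each token attends only to a nearby window) comes down to a boolean mask over query and key positions. A minimal sketch, assuming causal attention and a window that counts the current token; the function name is hypothetical.

```python
def sliding_window_mask(seq_len, window):
    # mask[i][j] is True when query position i may attend to key position j.
    # Assumption: causal attention, with the window covering the current token
    # and the window - 1 tokens immediately before it.
    return [
        [0 <= i - j < window for j in range(seq_len)]
        for i in range(seq_len)
    ]


mask = sliding_window_mask(5, 2)
```

Each row holds at most `window` allowed positions, so per-token attention cost stays constant as the sequence grows instead of scaling with its length.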
T
21 terms
Self-hosted AI coding assistant · Copilot-style autocomplete you can run on your own GPU.
An inference parameter controlling output randomness · 0 = deterministic, higher = more varied.
Terminal Bench is a coding benchmark tracked by BenchGecko across every frontier and open-weight model.
The Agent Company is an agentic benchmark tracked by BenchGecko across every frontier and open-weight model.
Total tokens per second a serving cluster can handle across all concurrent requests.
Volume discounts at defined usage thresholds · $X/M tokens below 1B tokens, $Y/M above.
infrastructure · valuation $NaNB · international.
The algorithm that splits text into tokens before a model reads it · BPE, SentencePiece, Tiktoken.
The fundamental units language models read and generate · roughly 3/4 of an English word per token.
A model invoking external functions or APIs during generation · the foundation of AI agents.
Google's ? TFLOPS AI accelerator with ? GB HBM.
Google pod AI system.
Google's ? TFLOPS AI accelerator with ? GB HBM.
Google pod AI system.
Google's ? TFLOPS AI accelerator with ? GB HBM.
AWS's third-gen custom training chip · 2× perf/watt vs Trainium 2 · powering Anthropic training clusters on AWS.
AWS's ? TFLOPS AI accelerator with ? GB HBM.
The neural network architecture (Vaswani et al., 2017) behind every modern large language model.
TriviaQA is a knowledge benchmark tracked by BenchGecko across every frontier and open-weight model.
AWS node AI system.
infrastructure · valuation $NaNB · international.
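The temperature entry in this section (0 = deterministic, higher = more varied) can be made concrete by showing how temperature rescales logits before the softmax. This is a minimal sketch of the usual formulation; treating temperature 0 as a greedy argmax is a convention assumed here, and the function name is hypothetical.

```python
import math


def softmax_with_temperature(logits, temperature):
    if temperature < 0:
        raise ValueError("temperature must be non-negative")
    if temperature == 0:
        # Greedy decoding: all probability mass on the highest logit.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    # Divide logits by the temperature, then apply a numerically stable softmax.
    scaled = [value / temperature for value in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


example_logits = [2.0, 1.0, 0.0]
greedy = softmax_with_temperature(example_logits, 0)
sharp = softmax_with_temperature(example_logits, 0.5)
flat = softmax_with_temperature(example_logits, 2.0)
```

Lower temperatures concentrate probability on the top token and higher ones spread it across alternatives, which is why sampled output becomes more varied as the value rises.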
V
3 terms
VideoMME is a multimodal benchmark tracked by BenchGecko across every frontier and open-weight model.
Pre-negotiated flat rate applied to all consumption · typical enterprise AI contract shape.
VPCT is a knowledge benchmark tracked by BenchGecko across every frontier and open-weight model.
W
7 terms
developer_tools · valuation $NaNB · international.
developer_tools · valuation $NaNB · international.
WeirdML is a coding benchmark tracked by BenchGecko across every frontier and open-weight model.
Codeium's VSCode-fork IDE with "Cascade" multi-file agent · acquired by Cognition in 2025.
Windsurf is Codeium's purpose-built agentic IDE with Cascade, a flow-state collaborative agent that keeps context across your whole project.
Winogrande is a knowledge benchmark tracked by BenchGecko across every frontier and open-weight model.
Cerebras's ? TFLOPS AI accelerator with ? GB HBM.