
Embedding

A dense numerical vector (256 to 8192 dimensions) that captures the semantic meaning of text, images, or audio.

Level 1

An embedding turns text into a fixed-size vector of numbers. Similar meanings produce nearby vectors · "dog" and "puppy" map to nearby points, "dog" and "car" to distant ones. Embeddings are the foundation of RAG, semantic search, clustering, and recommendations. Production embedding models include text-embedding-3-large (OpenAI), Voyage 3, Cohere Embed, and NV-Embed (NVIDIA).
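The "nearby vs. distant vectors" idea can be sketched with cosine similarity over toy vectors. The 4-dimensional vectors below are made up for illustration; real models emit 256 to 8192 dimensions:

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot product divided by the product of magnitudes.
    # 1.0 means identical direction, 0.0 means orthogonal (unrelated).
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 4D "embeddings" -- illustrative values only.
dog   = [0.8, 0.6, 0.1, 0.0]
puppy = [0.7, 0.7, 0.2, 0.1]
car   = [0.0, 0.1, 0.9, 0.8]

print(cosine_similarity(dog, puppy))  # high: similar meaning
print(cosine_similarity(dog, car))    # low: distant meaning
```

Semantic search is just this comparison at scale: embed the query, then rank documents by cosine similarity to it.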

Level 2

Modern embedding models are trained with contrastive learning: pairs of semantically related text (e.g., query and relevant doc) are pulled close in embedding space while random pairs are pushed apart. Dimensions range from 256 (fast) to 8192 (highest quality). Cosine similarity is the dominant distance metric. Good embedding models hit 70%+ MTEB leaderboard scores. Costs: $0.05-0.20/M tokens on the major APIs. Embedding a large corpus is a one-time cost; serving queries is fast and cheap.
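The contrastive objective can be sketched as an InfoNCE loss in plain Python. This is a minimal sketch with illustrative 2D vectors; real training runs this over large GPU batches of learned embeddings:

```python
import math

def info_nce(query, positive, negatives, temperature=0.07):
    # InfoNCE: softmax cross-entropy over the query's similarity to one
    # positive (index 0) and many negatives. Minimizing it pulls the
    # positive close and pushes negatives apart.
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    sims = [dot(query, positive)] + [dot(query, n) for n in negatives]
    logits = [s / temperature for s in sims]
    # Numerically stable log-softmax of the positive's logit.
    m = max(logits)
    log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
    return -(logits[0] - log_denom)

query = [1.0, 0.0]
good_positive = [0.9, 0.1]   # semantically related pair: low loss
bad_positive  = [0.0, 1.0]   # unrelated pair: high loss
print(info_nce(query, good_positive, negatives=[[0.0, 1.0]]))
print(info_nce(query, bad_positive, negatives=[[0.9, 0.1]]))
```

Hard negative mining simply chooses the `negatives` list adversarially: near-miss documents rather than random ones, which makes the loss gradient far more informative.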

Level 3

Training objective: contrastive NT-Xent or InfoNCE loss. Hard negative mining (sampling confusingly similar negatives) improves retrieval quality. Dimensionality trade-offs: 1536D gives strong retrieval on MTEB but costs ~6 GB per million items at float32. Matryoshka embeddings (truncate to shorter dimensions at query time) give flexibility. Domain adaptation via in-domain fine-tuning improves retrieval 10-30% on specialized corpora. Evaluated with MTEB (Massive Text Embedding Benchmark) across 56 tasks. Top 2026: NV-Embed-v2 (73.7), Voyage-3 (~73), text-embedding-3-large (~70).
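A minimal sketch of Matryoshka-style truncation and the float32 memory arithmetic behind the dimensionality trade-off (toy vector; real Matryoshka models are trained so that prefixes of the vector remain useful on their own):

```python
import math

def truncate_and_renormalize(vec, dim):
    # Matryoshka-style truncation: keep the first `dim` coordinates,
    # then re-normalize so cosine similarity stays well defined.
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]

short = truncate_and_renormalize([3.0, 4.0, 1.0, 2.0], dim=2)

# Memory arithmetic: float32 = 4 bytes per dimension.
full_dim, short_dim, items = 1536, 256, 1_000_000
print(full_dim * 4 * items / 1e9)   # ~6.1 GB per million items at 1536D
print(short_dim * 4 * items / 1e9)  # ~1.0 GB per million items at 256D
```

Truncating 1536D down to 256D cuts index memory roughly 6x, at some cost in retrieval quality.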

The takeaway for you
If you are a
Researcher
  • Contrastive learning with hard negatives
  • MTEB is the canonical benchmark · 56 task suite
  • Matryoshka embeddings enable dimensionality flexibility
If you are a
Builder
  • Use OpenAI text-embedding-3-large or Voyage 3 for most RAG
  • Rerank top-20 retrieved results with a cross-encoder for quality
  • Domain fine-tuning pays off for specialized corpora
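The retrieve-then-rerank pattern above can be sketched as two stages. Here `embed` and `cross_encoder_score` are hypothetical stand-ins you would replace with real model calls (an embedding API and a cross-encoder, e.g. from the sentence-transformers library):

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def retrieve_then_rerank(query, corpus, embed, cross_encoder_score, k=20):
    # Stage 1: cheap vector retrieval -- keep only the top-k candidates
    # by cosine similarity between query and document embeddings.
    q = embed(query)
    candidates = sorted(corpus, key=lambda d: cosine(q, embed(d)),
                        reverse=True)[:k]
    # Stage 2: expensive cross-encoder reranking over just k candidates.
    # The cross-encoder reads query and document together, so it is far
    # more accurate than vector similarity but too slow for the full corpus.
    return sorted(candidates,
                  key=lambda d: cross_encoder_score(query, d), reverse=True)
```

The design point: embedding retrieval bounds the candidate set cheaply, so the quadratic-cost cross-encoder only ever sees ~20 pairs per query.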
If you are a
Investor
  • Embedding infra is commoditizing · pricing dropped 10× since 2023
  • Vector DB market consolidating · pgvector + MongoDB Atlas winning
  • Rerankers and hybrid search are the new premium tier
If you are a
Curious · Normie
  • Turns words into numbers so computers can find similar meanings
  • How AI understands "cheap phone" and matches you to an iPhone SE
  • The foundation of AI search
Gecko's take

Embeddings are commodity infrastructure in 2026. Pick any top-5 provider and move on · the real moat is retrieval quality and reranking.

NV-Embed-v2 and Voyage-3 lead MTEB. OpenAI text-embedding-3-large is the default production choice for most teams.