
Gemini


TL;DR

Google DeepMind's LLM family · multimodal, long-context (2M+ tokens), runs on TPU infrastructure.

Level 1

Gemini launched in December 2023 as Google's response to GPT-4. Tiers: Ultra (flagship), Pro (balanced workhorse), Flash (fast and cheap), Flash Lite (cheapest). Version cadence: Gemini 1.0 (2023), 1.5 (2024, introduced the 1M+ token context), 2.0 (late 2024), 2.5 (2025), 3.0 (late 2025). Key differentiators: native multimodality (image, audio, video), the longest production context windows (2M+ tokens), and TPU-based serving economics.

Level 2

Gemini is a mixture-of-experts transformer trained on Google's TPU infrastructure. Multimodal training is native · text, image, audio, and video tokens share a unified embedding space. Gemini 1.5 Pro was the first production model with a 1M+ token context (with 10M demonstrated in internal experiments). Context handling reportedly uses ring attention, which shards the sequence across devices so per-device memory stays bounded as context grows. Post-training · Constitutional AI-style rule-based methods plus Google's proprietary RLHF stack. Gemini integrates tightly with Google's product ecosystem · Workspace, Android, Chrome, Search.
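Google has not published Gemini's gating details, but the mixture-of-experts idea itself is simple to sketch: a small gating network scores every expert per token, only the top-k experts run, and their outputs are mixed with renormalized gate weights. The NumPy sketch below is generic and illustrative · every name and dimension in it is an assumption, not Gemini's actual architecture.

```python
import numpy as np

def moe_layer(x, w_gate, experts, top_k=2):
    """Generic top-k mixture-of-experts routing for a batch of token vectors.

    A gating network scores every expert per token; only the top_k experts
    actually run, and their outputs are mixed with renormalized gate
    weights. Total parameters grow with the expert count while per-token
    compute stays roughly constant: the core MoE trade-off.
    """
    logits = x @ w_gate                            # (n_tokens, n_experts)
    top = np.argsort(logits, axis=1)[:, -top_k:]   # top_k expert ids per token
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        sel = logits[t, top[t]]
        gates = np.exp(sel - sel.max())
        gates /= gates.sum()                       # softmax over selected experts only
        for g, e in zip(gates, top[t]):
            out[t] += g * experts[e](x[t])
    return out

# Toy setup: 4 experts, each a plain linear map over d_model = 8.
rng = np.random.default_rng(0)
d_model, n_experts = 8, 4
w_gate = rng.normal(size=(d_model, n_experts))
expert_ws = [rng.normal(size=(d_model, d_model)) for _ in range(n_experts)]
experts = [lambda h, w=w: h @ w for w in expert_ws]
tokens = rng.normal(size=(5, d_model))
y = moe_layer(tokens, w_gate, experts)
assert y.shape == (5, d_model)
```

Production MoE implementations batch tokens per expert instead of looping token by token, but the routing logic is the same.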

Level 3

Gemini 1.5 Pro used a MoE architecture with ring attention for long context. Ring attention splits the sequence across multiple devices and rotates key/value blocks around the ring, so per-device memory stays fixed and context scales with the number of devices · enough to reach 10M+ tokens in research settings. Training ran on Google's TPU v5p and TPU v6 (Trillium) clusters · vertical integration gives Google a compute-cost advantage at scale. Multimodal pretraining uses mixed-modality batches · each step sees text, image, audio, and video tokens together. Gemini 2.5 Pro introduced a built-in thinking mode (comparable to OpenAI's o-series and Claude's extended thinking). Gemini models consistently lead on benchmarks that require combined multimodal and long-context reasoning.
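The rotate-keys/values mechanic can be sketched in a few lines. This is a single-process NumPy simulation, assuming the standard ring-attention formulation: the chunk loop stands in for devices, and an online softmax lets each "device" accumulate exact attention without ever holding the full (seq × seq) score matrix. It is not Gemini's implementation.

```python
import numpy as np

def ring_attention(q, k, v, n_chunks):
    """Single-host simulation of ring attention.

    The sequence is split into n_chunks "devices". Each device keeps its
    query chunk fixed while key/value chunks rotate one hop per step; an
    online softmax accumulates the result, so no device ever materializes
    the full attention matrix.
    """
    d = q.shape[1]
    qs, ks, vs = (np.split(t, n_chunks) for t in (q, k, v))
    outputs = []
    for i in range(n_chunks):
        qi = qs[i]
        m = np.full((qi.shape[0], 1), -np.inf)  # running row-max
        l = np.zeros((qi.shape[0], 1))          # running softmax normalizer
        acc = np.zeros_like(qi)                 # unnormalized weighted sum of V
        for step in range(n_chunks):
            j = (i + step) % n_chunks           # K/V block arriving on this step
            s = qi @ ks[j].T / np.sqrt(d)
            m_new = np.maximum(m, s.max(axis=1, keepdims=True))
            p = np.exp(s - m_new)
            scale = np.exp(m - m_new)           # rescale earlier accumulators
            l = l * scale + p.sum(axis=1, keepdims=True)
            acc = acc * scale + p @ vs[j]
            m = m_new
        outputs.append(acc / l)
    return np.concatenate(outputs)

# Sanity check against dense (all-at-once) attention.
rng = np.random.default_rng(0)
q, k, v = (rng.normal(size=(16, 8)) for _ in range(3))
s = q @ k.T / np.sqrt(8)
w = np.exp(s - s.max(axis=1, keepdims=True))
dense = (w / w.sum(axis=1, keepdims=True)) @ v
assert np.allclose(ring_attention(q, k, v, n_chunks=4), dense)
```

Because the online softmax is exact, the ring result matches dense attention; the win is that each device only ever sees one key/value block at a time.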

Why this matters now

Gemini 3 Pro launched with 2M token context as default and the lowest enterprise tier pricing in the frontier class, pressuring GPT-5 and Claude pricing.

The takeaway for you
If you are a Researcher
  • MoE architecture + ring attention for long context
  • Native multimodal pretraining
  • TPU-trained · v5p and v6 (Trillium)
If you are a Builder
  • Use Gemini 2.5 Pro for long-document / long-video understanding
  • Gemini 2.5 Flash for cheap and fast · best price-performance on most tasks
  • Integration with Google Workspace is the enterprise differentiator
If you are an Investor
  • Google's TPU vertical integration provides a compute-cost moat
  • Workspace + Android + Chrome distribution is unmatched
  • Gemini pricing pressure reshapes the frontier market · benefits Google's ad and cloud businesses
If you are a Curious Normie
  • Google's answer to ChatGPT
  • Handles images, audio, and video · not just text
  • Has a much bigger memory than other AIs
Gecko's take

Gemini's long-context and multimodal defaults are the reason every other frontier model scrambled to match. Google's TPU moat is real.

Is it the best frontier model overall? Depends on workload. Gemini leads on multimodal and long-context tasks; GPT-5 and Claude 4.5 Opus lead on reasoning and coding. Arena rankings swap month to month.