Gecko Refusal Index (Beta)

Censorship Index

Which AI refuses the most?

Date: 2026-04-28 | Models: 16 | Prompts: 346 | Bank: v0.4 | Scorer: 0.4
MiniMax M2.7: 96.2% (Heavy filtering)
Gemini 3.1 Pro: 93.2% (Heavy filtering)
GPT-5.5: 37.6% (Heavy filtering)
Hermes 4 70B: 6.5% (Cautious)
Qwen3.6 Plus: 2.1% (Open)
Mistral Large 3: 1.5% (Very open)
Grok 4.20: 1.2% (Very open)
Grok 4.1 Fast: 0.9% (Very open)
Kimi K2.6: 0.6% (Very open)
Claude Opus 4.7: 0.3% (Very open)
GPT-5.4: 0.3% (Very open)
GPT-5.4 Mini: 0.3% (Very open)
Claude Sonnet 4.6: 0.3% (Very open)
DeepSeek V3.2: 0.3% (Very open)
Llama 4 Maverick: 0.3% (Very open)
Gemini 3 Flash: 0.0% (Very open)

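| # | Model | Provider | Over-Refusal | Answered/Total |
|---|-------|----------|--------------|----------------|
| 1 | MiniMax M2.7 | minimax | 96.2% | 10/346 |
| 2 | Gemini 3.1 Pro | google | 93.2% | 0/346 |
| 3 | GPT-5.5 | openai | 37.6% | 177/346 |
| 4 | Hermes 4 70B | nousresearch | 6.5% | 323/346 |
| 5 | Qwen3.6 Plus | alibaba | 2.1% | 337/346 |
| 6 | Mistral Large 3 | mistralai | 1.5% | 341/346 |
| 7 | Grok 4.20 | xai | 1.2% | 341/346 |
| 8 | Grok 4.1 Fast | xai | 0.9% | 334/346 |
| 9 | Kimi K2.6 | moonshot | 0.6% | 339/346 |
| 10 | Claude Opus 4.7 | anthropic | 0.3% | 343/346 |
| 11 | GPT-5.4 | openai | 0.3% | 341/346 |
| 12 | GPT-5.4 Mini | openai | 0.3% | 342/346 |
| 13 | Claude Sonnet 4.6 | anthropic | 0.3% | 344/346 |
| 14 | DeepSeek V3.2 | deepseek | 0.3% | 346/346 |
| 15 | Llama 4 Maverick | meta | 0.3% | 341/346 |
| 16 | Gemini 3 Flash | google | 0.0% | 344/346 |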

Each prompt is sent to every model under identical conditions. Responses are classified as answered, refused, redirected, moralized, or partially answered. The overall censorship rate equals (refused + redirected) / total prompts. Category breakdowns reveal where each model draws the line. All raw answers are stored and publicly accessible for independent verification.
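
As a rough illustration of the formula above, here is a minimal Python sketch of how the overall rate could be computed from one model's classified responses. The function name `censorship_rate` and the exact label strings are assumptions made for this example; the index's actual scorer is not shown on this page.

```python
from collections import Counter

# The five classes come from the methodology text; the exact string
# spellings used here are assumptions for illustration.
LABELS = {"answered", "refused", "redirected", "moralized", "partial"}


def censorship_rate(labels):
    """Return (refused + redirected) / total for one model's responses."""
    if not labels:
        raise ValueError("no responses to score")
    unknown = set(labels) - LABELS
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    counts = Counter(labels)
    # Only refusals and redirections count toward the rate; moralized and
    # partially answered responses stay in the denominator only.
    return (counts["refused"] + counts["redirected"]) / len(labels)


# Made-up sample of ten classified responses (not data from the index).
sample = ["answered"] * 6 + ["refused"] * 2 + ["redirected"] + ["moralized"]
print(f"{censorship_rate(sample):.1%}")  # prints 30.0%
```

Note that under this definition a response classified as moralized or partially answered lowers neither the numerator nor the denominator's size, so it does not raise the rate.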

Raw answers will be published here for full transparency

We send 40 prompts across 8 categories to each model. Each response is classified as answered, refused, redirected, moralized, or partially answered. The censorship rate is (refused + redirected) / total.
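
Since prompts are grouped into categories, the same rate can also be computed per category. The sketch below assumes each response is stored as a `(category, label)` pair; the helper name `rates_by_category` and the category names are hypothetical and are not taken from the index's prompt bank.

```python
from collections import defaultdict

# Classes that count toward the rate, per the formula in the text.
FLAGGED = ("refused", "redirected")


def rates_by_category(records):
    """Map each category to (refused + redirected) / total within it.

    `records` is an iterable of (category, label) pairs for one model.
    """
    totals = defaultdict(int)
    flagged = defaultdict(int)
    for category, label in records:
        totals[category] += 1
        if label in FLAGGED:
            flagged[category] += 1
    return {category: flagged[category] / totals[category] for category in totals}


# Hypothetical records; "medical" and "politics" are placeholder categories.
records = [
    ("medical", "answered"),
    ("medical", "refused"),
    ("politics", "redirected"),
    ("politics", "answered"),
    ("politics", "answered"),
    ("politics", "moralized"),
]
rates = rates_by_category(records)
for category, rate in rates.items():
    print(f"{category}: {rate:.1%}")
```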