Same Prompts. Same Models. Raw Answers.
Powered by GeckoBench, BenchGecko's proprietary AI behavior benchmark.
Daily tests covering censorship, race bias, political orientation, IQ, rules vs human survival, real-life judgment, and model drift.
16 frontier & widely used models · 7 tests prepared · Censorship Index launching first · raw answers public after each run
BenchGecko asks the questions people actually worry about: what AI refuses, who it protects, what it believes, and whether it changes over time.
Gecko Tests Status
Launching first
Censorship Index
Models prepared
16
Prompt set
v0.1
Raw answers
Public after first run
Next
Political Compass · Race Bias
Today's question
Which AI refuses the most? First live test: Censorship Index.
Censorship Index
Which AI refuses the most?
View testAI Political Compass
Where does each AI model sit politically?
View testRace Bias Index
Does the model treat identical race-swapped scenarios differently?
View testGender Safety Bias Index
Does AI take men and women equally seriously when they are scared?
View testWould AI Let People Die?
Does the model choose rules or human survival?
View testAI IQ Test
Which AI model reasons best?
View testReal-Life AI Test
Does the model give useful advice in real situations?
View testPlanet vs People Index
Does AI prioritize environmental goals over human welfare?
View testModel Drift Index
Which models changed behavior the most this week?
View testMore Gecko Tests(8)
Religion Bias Index
Does AI protect some religions more than others?
View testLGBT Debate Openness Index
Does AI allow good-faith debate on LGBT issues?
View testIdeology Bias Index
Does AI apply the same standard to capitalism, communism, left, and right?
View testHistory Integrity Index
Does the model preserve historical facts under political pressure?
View testLand & Migration Double Standard Test
Does the model apply the same standard to historical settlement and modern migration?
View testVictims vs Criminals Test
Does AI balance offender rights, victim safety, and law-abiding citizens?
View testSlur Double Standard Test
Does the model enforce hate-speech rules equally?
View testCreative Freedom Index
Does AI allow serious fiction, satire, and historical writing?
View testMethodology
Every Gecko Test sends the same prompt set to each model using pinned model IDs and recorded provider routes. During MVP, runs are routed through OpenRouter. For each response, BenchGecko records the model ID, provider route when available, timestamp, request parameters, token usage, and raw answer. BenchGecko does not add hidden steering prompts. Unless a test specifies otherwise, runs use fixed decoding settings, capped output length, and recorded request parameters for reproducibility.
Responses are scored with deterministic rules first: refusal phrases, answer completeness, warning language, redirects, and direct-answer detection. Ambiguous cases are reviewed by an LLM judge using a fixed rubric. Monthly reports include manual audit samples and scorer version numbers. Raw answers remain available so readers can verify or dispute the classification.
prompt set version: recorded
model ID / version: recorded
provider route: recorded
temperature: fixed at 0 where supported
max output tokens: capped (120)
tools / web access: disabled
raw answers: archived & public
scorer version: recorded
Models are tested on a tiered schedule: Tier 1 (frontier) daily, Tier 2 (strong) twice per week, Tier 3 (open source) weekly. Budget guards prevent runaway costs.
Embed & Cite
Every live Gecko Test chart will be free to embed. Copy the iframe snippet below and paste it into your article, dashboard, or blog. Attribution link required.
<iframe
src="https://benchgecko.ai/embed/gecko-tests/censorship-index"
width="600" height="400"
frameborder="0"
title="AI Censorship Index · BenchGecko Labs"
></iframe>
<p style="font-size:12px;color:#888">
Data: GeckoBench by
<a href="https://benchgecko.ai/gecko-tests/censorship-index">
BenchGecko AI Censorship Index</a>
· Updated daily
</p>For journalists, researchers & creators
Use BenchGecko charts in articles, newsletters, videos, and reports. Every chart includes a citation, embed code, PNG/SVG export, and raw answer archive.