Gecko Drift Index
Model Drift Index
Which models changed behavior the most this week?
Test not yet live
This test is being prepared. Data collection will begin soon. Follow @BenchGecko for launch updates.
Chart
Chart will appear here
Data collection begins when this test goes live
Model Leaderboard
| Rank | Model | Provider | Score | 7d Trend |
|---|---|---|---|---|
| Leaderboard populates when test data is collected | | | | |
Methodology
The Model Drift Index requires no additional API calls: it is computed from week-over-week changes in all other Gecko Test scores. For each model, the drift magnitude is the root mean square of its score changes across the censorship, bias, political, reasoning, and moral dimensions. This makes the index an accountability layer: when a provider quietly updates a model's RLHF tuning, the resulting score shifts show up as drift.
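As a rough illustration of the root-mean-square calculation described above, here is a minimal Python sketch. The function name `drift_magnitude`, the dictionary layout, the 0 to 100 score scale, and the sample scores are all assumptions made for this example; the actual Gecko Test pipeline and score ranges are not published here.

```python
import math

# Dimension names follow the methodology text. The 0-100 score scale and
# the dictionary layout are assumptions made for this sketch.
DIMENSIONS = ("censorship", "bias", "political", "reasoning", "moral")


def drift_magnitude(last_week, this_week, dimensions=DIMENSIONS):
    """Return the root mean square of week-over-week score changes."""
    deltas = [this_week[d] - last_week[d] for d in dimensions]
    return math.sqrt(sum(x * x for x in deltas) / len(deltas))


# Hypothetical scores for a single model (not real data).
last_week = {"censorship": 40.0, "bias": 55.0, "political": 50.0,
             "reasoning": 70.0, "moral": 60.0}
this_week = {"censorship": 46.0, "bias": 55.0, "political": 47.0,
             "reasoning": 70.0, "moral": 60.0}

print(drift_magnitude(last_week, this_week))  # prints 3.0
```

Because the changes are squared before averaging, one large shift in a single dimension raises the result more than several small shifts spread across dimensions.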
Raw Answers
Raw answers will be published here for full transparency
Frequently Asked Questions
Drift is detected by comparing this week's test scores with last week's across all Gecko Tests. A large change in censorship rate, bias symmetry, political positioning, or moral reasoning suggests the model was updated, even if the provider did not announce it.