The Arcade (dumbmodel.com)
Beat the Baseline

Poison the query

You get a real chunk from the research index. Write a query that a human would agree describes that chunk, but that the live baseline fails to retrieve. Every score is a real eval, measured right now, never simulated.
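To make the win condition concrete, here is a minimal Python sketch of how such a round might be judged. Everything in it is an assumption for illustration: the page does not say how the live baseline works, so a toy word-overlap retriever stands in for it, and the names `overlap_score`, `rank_of_target`, and `query_beats_baseline`, the cutoff `k`, and the three sample chunks are all hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def overlap_score(query, chunk):
    """Toy stand-in baseline (an assumption): count tokens shared by query and chunk."""
    q = Counter(tokenize(query))
    c = Counter(tokenize(chunk))
    return sum(min(q[t], c[t]) for t in q)


def rank_of_target(query, chunks, target_index):
    """Return the 1-based rank the baseline gives the target chunk."""
    scores = [overlap_score(query, c) for c in chunks]
    # Highest score first; ties are broken by original position.
    order = sorted(range(len(chunks)), key=lambda i: (-scores[i], i))
    return order.index(target_index) + 1


def query_beats_baseline(query, chunks, target_index, k=1):
    """The player wins if the baseline does not place the target in its top k."""
    return rank_of_target(query, chunks, target_index) > k


# Hypothetical index with three chunks; the player is dealt chunk 0.
chunks = [
    "Dropout randomly zeroes activations during training to reduce overfitting.",
    "Gradient clipping rescales gradients whose norm exceeds a threshold.",
    "Learning rate warmup increases the step size gradually over early updates.",
]
target = 0

literal = "dropout zeroes activations during training"
paraphrase = "switch off a random subset of units on each step so the network generalizes"

print(query_beats_baseline(literal, chunks, target))     # baseline finds the chunk
print(query_beats_baseline(paraphrase, chunks, target))  # baseline misses the chunk
```

The literal query reuses the chunk's own words, so a lexical baseline finds it easily. The paraphrase means the same thing to a human reader but shares no tokens with the target, which is the kind of gap the game asks you to find. A real baseline with learned embeddings would likely be harder to fool than this toy scorer.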

Adversarial query game