Agents are shipping faster than the evals that gate them.
/simulate_ tests products the way a user would — catching the failures evals miss.
/simulate_

🔋
Page screenshot