PromptTriage
Research
Data-driven studies on prompt engineering and LLM behavior. All evaluations are scored by a multi-model jury on a 100-point scale.
Study E · Read →
AI Format Wars
Does the shape of your prompt matter?
1,080 evals · 5 models · 3-judge jury
Study C · Coming soon
The Prompt Invariance Illusion
Does your system prompt actually matter?
104 evals · 2 models · 3-judge jury
Analysis · Coming soon
170 System Prompts from Top AI Companies
3 critical anti-patterns found in production systems.
170 prompts analyzed
Datasets and scripts are open-source on GitHub.