🤖 What happens when you hand the same legal case to five different AIs?
They disagree, at almost exactly the same rate human jurors do.
That's one of the findings in the new chapter
@federicoast, @williamhwgeorge, and @robertgdean just published in AI and Arbitration (Wolters Kluwer, 2026), "When Decentralised Justice Meets Artificial Intelligence."
63 real Kleros disputes, judged by five frontier LLMs: ChatGPT, Claude, Gemini, DeepSeek, and Mistral. The takeaway isn't which model judges best. It's that you shouldn't trust a single, monolithic AI.
Round 2 of the experiment is already in flight: the team is re-running the test on real-world consumer cases from Argentina's Junín pilot and Lemon, where early evidence suggests the AIs and human jurors come to different conclusions on the same cases.
Book details below ↓