🏟️ Model Arena

Nine AI models were given the same prompt: build a Game Hub with Tic-Tac-Toe and Rock-Paper-Scissors, complete with multiplayer rooms and a dark theme. These are the actual apps they built, unedited.

| Rank | Model | Vendor | Code | Tests | Cost |
|------|-------|--------|------|-------|------|
| 🥇 1st | Claude Sonnet 4.6 | Anthropic | 37/40 | 11/11 | $1.42 |
| 🥈 2nd | Claude Opus 4.6 | Anthropic | 36/40 | 10/11 | $5.06 |
| 🥉 3rd (tie) | GPT-5.3 Codex xHigh | OpenAI | 35/40 | 8/11 | $0.28 |
| 🥉 3rd (tie) | Kimi K2.5 Thinking | Moonshot AI | 35/40 | 11/11 | $0.50 |
| 5th | MiniMax M2.5 | MiniMax | 33/40 | — | $0.20 |
| 6th | GLM-5 | Zhipu AI | 30/40 | 7/11 | — |
| 7th | Qwen 3.5-Plus | Alibaba | 29/40 | 10/11 | — |
| 8th | Gemini 3.1 Pro | Google | — (build failed in R2) | — | — |
| 9th | DeepSeek V3.2 Exp | DeepSeek | 22/40 | — (build failed in R2) | — |