🏟️ Model Arena

Nine AI models were given the same prompt: build a Game Hub with Tic-Tac-Toe and Rock-Paper-Scissors, complete with multiplayer rooms and a dark theme. These are the actual apps they built, unedited.

| Rank | Model | Vendor | Code | Tests | Cost |
|------|-------|--------|------|-------|------|
| 🥇 1st | Claude Sonnet 4.6 | Anthropic | 37/40 | 11/11 | $1.42 |
| 🥈 2nd | Claude Opus 4.6 | Anthropic | 36/40 | 10/11 | $5.06 |
| 🥉 3rd (tie) | GPT-5.3 Codex xHigh | OpenAI | 35/40 | 8/11 | $0.28 |
| 🥉 3rd (tie) | Kimi K2.5 Thinking | Moonshot AI | 35/40 | 11/11 | $0.50 |
| 5th | MiniMax M2.5 | MiniMax | 33/40 | — | $0.20 |
| 6th | GLM-5 | Zhipu AI | 30/40 | 7/11 | — |
| 7th | Qwen 3.5-Plus | Alibaba | 29/40 | 10/11 | — |
| 8th | Gemini 3.1 Pro | Google | — (build failed in R2) | — | — |
| 9th | DeepSeek V3.2 Exp | DeepSeek | 22/40 | — (build failed in R2) | — |