Benchmark results

Raw OGS-Bench result documents, one per model. Each file conforms to benchmark-result.schema.json and is the source data from which the leaderboard is built.

File                        Size
baseline.json                976 KB
claude-haiku-4-5.json       1417 KB
claude-opus-4-7.json        1373 KB
claude-sonnet-4-6.json      1468 KB
gemini-2.5-flash-lite.json  1362 KB
gemini-2.5-flash.json       1374 KB
gemini-2.5-pro.json         1361 KB
gpt-4.1-mini.json           1370 KB
gpt-4.1-nano.json           1327 KB
gpt-4.1.json                1395 KB
gpt-5.4-mini.json           1354 KB
gpt-5.4-nano.json           1417 KB
gpt-5.4.json                1396 KB
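
A listing like the one above can be regenerated from a local checkout with a short script. The following is a minimal sketch rather than the project's own tooling: the function name summarize_results is hypothetical, it assumes 1 KB = 1024 bytes, and it assumes any file ending in .schema.json is a schema rather than a result document. It only checks that each file parses as JSON; it does not validate against benchmark-result.schema.json.

```python
import json
from pathlib import Path


def summarize_results(results_dir):
    """Return (filename, size_kb, parsed_ok) tuples for each result file.

    Hypothetical helper: files ending in ".schema.json" are skipped on the
    assumption that they are schemas, not results. Sizes assume 1 KB = 1024 bytes.
    """
    rows = []
    for path in sorted(Path(results_dir).glob("*.json")):
        if path.name.endswith(".schema.json"):
            continue  # assumed to be a schema, not a result document
        size_kb = round(path.stat().st_size / 1024)
        try:
            with path.open(encoding="utf-8") as fh:
                json.load(fh)  # syntax check only, no schema validation
            parsed_ok = True
        except (json.JSONDecodeError, UnicodeDecodeError):
            parsed_ok = False
        rows.append((path.name, size_kb, parsed_ok))
    return rows
```

Calling summarize_results(".") from the results directory returns one tuple per file, sorted by filename, which can then be printed in whatever column layout is convenient.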