THE AI RACE
Leaderboard
Compare
Benchmarks
Methodology
Changelog
Movers
Time Machine
GPQA Diamond
PhD-level science questions (198 expert-validated)
Category: Language & Knowledge
Unit: %
Max: 100
Rankings (13 organizations)
Rank  Organization      Score
1     Google DeepMind   94.3%
2     Anthropic         91.3%
3     OpenAI            78%
4     DeepSeek          73.2%
5     xAI               68.5%
6     Meta AI           67%
7     Mistral           64.5%
8     Alibaba Qwen      63%
9     Zhipu AI          58%
10    Cohere            56%
11    ByteDance Seed    55%
12    Baidu Ernie       52%
13    Tencent Hunyuan   48%
Other Benchmarks in Language & Knowledge
Chatbot Arena ELO
MMLU-Pro
SimpleQA
Humanity's Last Exam
IFEval
All Categories
Language & Knowledge
Coding
Reasoning & Math
Image Generation
Video Generation
Multimodal
Agents & Tools