Leaderboard Compare Benchmarks Methodology Changelog Movers Time Machine

THE AI RACE: Tracking the global AI race

Methodology Changelog Benchmarks About

Scores reflect editorial assessment and automated benchmark data. Not investment advice. All trademarks belong to their respective owners.

GPQA Diamond

PhD-level science questions (198 expert-validated)

Category: Language & Knowledge | Unit: % | Max: 100

Rankings (13 organizations)

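1. Google DeepMind: 94.3%
2. (organization name missing): 91.3%
3. (organization name missing): 78%
4. (organization name missing): 73.2%
5. (organization name missing): 68.5%
6. (organization name missing): 67%
7. (organization name missing): 64.5%
8. (organization name missing): 63%
9. (organization name missing): 58%
10. (organization name missing): 56%
11. ByteDance Seed: 55%
12. (organization name missing): 52%
13. Tencent Hunyuan: 48%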

Other Benchmarks in Language & Knowledge

Chatbot Arena ELO, MMLU-Pro, SimpleQA, Humanity's Last Exam, IFEval

All Categories

Language & Knowledge, Coding, Reasoning & Math, Image Generation, Video Generation, Multimodal, Agents & Tools