Leaderboard Compare Benchmarks Methodology Changelog Movers Time Machine

THE AI RACE
Tracking the global AI race

Methodology Changelog Benchmarks About

Scores reflect editorial assessment and automated benchmark data. Not investment advice. All trademarks belong to their respective owners.

HumanEval+

Functional code correctness from docstrings (164 problems)

Category: Coding
Unit: %
Max: 100
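The page does not say how its percentages are derived. As a rough sketch, assuming a score is simply the share of the 164 problems solved (an assumption, not something the page confirms), a solved count maps to a percentage as follows. The function name `pass_rate_percent` is hypothetical and used only for illustration.

```python
TOTAL_PROBLEMS = 164  # problem count stated on this page


def pass_rate_percent(solved: int, total: int = TOTAL_PROBLEMS) -> float:
    """Return solved/total as a percentage, rounded to one decimal place.

    Assumption: a score is the fraction of problems solved. The page
    itself does not document how its scores are computed.
    """
    if not 0 <= solved <= total:
        raise ValueError("solved must be between 0 and total")
    return round(100 * solved / total, 1)


if __name__ == "__main__":
    print(pass_rate_percent(130))  # prints 79.3
```

Under this assumption one problem is worth about 0.6 percentage points, so small gaps between neighboring scores correspond to only a few problems.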

Rankings (10 organizations)

1. 89%
2. 87.2%
3. 86.6%
4. 85%
5. 83%
6. Google DeepMind: 79.3%
7. 77.4%
8. 73.8%
9. 72%
10. 68%

Other Benchmarks in Coding

SWE-bench Verified, LiveCodeBench, Aider Polyglot, BigCodeBench

All Categories

Language & Knowledge, Coding, Reasoning & Math, Image Generation, Video Generation, Multimodal, Agents & Tools
