THE AI RACE
Leaderboard
Compare
Benchmarks
Methodology
Changelog
Movers
Time Machine
HumanEval+
Functional code correctness from docstrings (164 problems)
Category: Coding
Unit: %
Max: 100
Rankings (0 organizations)
No benchmark data available for HumanEval+ yet.
Other Benchmarks in Coding
SWE-bench Verified
LiveCodeBench
Aider Polyglot
BigCodeBench
All Categories
Language & Knowledge
Coding
Reasoning & Math
Image Generation
Video Generation
Multimodal
Agents & Tools