THE AI RACE
Leaderboard
Compare
Benchmarks
Methodology
Changelog
Movers
Time Machine
TAU2-bench
Conversational AI agent task completion (retail, airline, telecom)
Agents & Tools
Unit: %
Max: 100
Rankings (7 organizations)
Rank  Organization       Score
1     Anthropic          65%
2     OpenAI             62%
3     Google DeepMind    58%
4     DeepSeek           42%
5     xAI                38%
6     Meta AI            35%
7     Cohere             32%
Other Benchmarks in Agents & Tools
WebArena
Terminal-Bench
All Categories
Language & Knowledge
Coding
Reasoning & Math
Image Generation
Video Generation
Multimodal
Agents & Tools