GPQA Leaderboard
Compare models on the GPQA benchmark.
Rank | Model | Score | Organization | License |
---|---|---|---|---|
1 | Gemini 2.5 Pro Preview | 84% | Google | Proprietary |
2 | o3-2025-04-16 | 83.3% | OpenAI | Proprietary |
3 | o3 | 83.3% | OpenAI | Proprietary |
4 | o4-mini | 81.4% | OpenAI | Proprietary |
5 | GPT-4o mini | 79.7% | OpenAI | Proprietary |
6 | o3-mini | 79.7% | OpenAI | Proprietary |
7 | Gemini 2.5 Flash Preview | 78.3% | Google | Proprietary |
8 | o1 | 75.7% | OpenAI | Proprietary |
9 | Claude 3.7 Sonnet | 68% | Anthropic | Proprietary |
10 | GPT-4.1 | 66.3% | OpenAI | Proprietary |
11 | Claude 3.5 Sonnet | 65% | Anthropic | Proprietary |
12 | GPT-4.1 mini | 65% | OpenAI | Proprietary |
13 | Gemini 2.0 Flash | 62.1% | Google | Proprietary |
14 | Gemini 2.0 Flash | 62.1% | Google | Proprietary |
15 | o1-mini | 60% | OpenAI | Proprietary |
16 | GPT-4o | 56.1% | OpenAI | Proprietary |
17 | GPT-4o | 56.1% | OpenAI | Proprietary |
18 | ChatGPT-4o | 56.1% | OpenAI | Proprietary |
19 | GPT-4o | 56.1% | OpenAI | Proprietary |
20 | GPT-4o | 56.1% | OpenAI | Proprietary |
21 | GPT-4.1 nano | 50.3% | OpenAI | Proprietary |
22 | Claude 3.5 Haiku | 41.6% | Anthropic | Proprietary |
23 | Claude 3 Haiku | 41.6% | Anthropic | Proprietary |