LiveBench IF Leaderboard
Compare models on the LiveBench IF (instruction following) benchmark.
Rank | Model | Score | Organization | License |
---|---|---|---|---|
1 | o3 | 86.17 | OpenAI | Proprietary |
2 | o3 | 86.17 | OpenAI | Proprietary |
3 | Gemini 2.0 Flash | 85.79 | Google | Proprietary |
4 | Gemini 2.0 Flash | 85.79 | Google | Proprietary |
5 | o4-mini | 84.96 | OpenAI | Proprietary |
6 | Gemini 2.5 Pro Preview | 83.50 | Google | Proprietary |
7 | o1 | 79.88 | OpenAI | Proprietary |
8 | Gemini 2.5 Flash Preview | 79.02 | Google | Proprietary |
9 | GPT-4.1 | 77.05 | OpenAI | Proprietary |
10 | Gemini 2.0 Flash-Lite | 76.63 | Google | Proprietary |
11 | Claude 3.7 Sonnet | 76.49 | Anthropic | Proprietary |
12 | ChatGPT-4o | 71.92 | OpenAI | Proprietary |
13 | GPT-4.1 mini | 70.31 | OpenAI | Proprietary |
14 | Claude 3.5 Sonnet | 69.30 | Anthropic | Proprietary |
15 | GPT-4 Turbo | 64.94 | OpenAI | Proprietary |
16 | GPT-4o 2024-05-13 | 64.94 | OpenAI | Proprietary |
17 | GPT-4o | 64.94 | OpenAI | Proprietary |
18 | GPT-4o | 64.94 | OpenAI | Proprietary |
19 | GPT-4o | 64.94 | OpenAI | Proprietary |
20 | GPT-3.5 Turbo (0125) | 64.94 | OpenAI | Proprietary |
21 | Claude 3.5 Haiku | 61.88 | Anthropic | Proprietary |
22 | Claude 3 Haiku | 61.88 | Anthropic | Proprietary |
23 | Claude 3 Haiku | 61.88 | Anthropic | Proprietary |
24 | GPT-4.1 nano | 57.54 | OpenAI | Proprietary |
25 | GPT-4o mini | 56.80 | OpenAI | Proprietary |