Get all the details on o3, an AI model from OpenAI. This page covers its token limits, pricing, key capabilities (such as function calling, long context, and multimodal input), sample API code, and performance strengths.
Key Metrics
Input Limit
200K tokens
Output Limit
100K tokens
Input Cost
$10.00 per 1M tokens
Output Cost
$40.00 per 1M tokens
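The limits and prices above can be combined into a quick per-request estimate. The following is a minimal sketch, not an official calculator: it assumes "200K" and "100K" mean 200,000 and 100,000 tokens, that the listed rates apply uniformly to every token, and the function name `estimate_cost_usd` is hypothetical.

```python
# Figures copied from the Key Metrics above; everything else is illustrative.
INPUT_LIMIT_TOKENS = 200_000      # assumes "200K" means 200,000
OUTPUT_LIMIT_TOKENS = 100_000     # assumes "100K" means 100,000
INPUT_USD_PER_MILLION = 10.00
OUTPUT_USD_PER_MILLION = 40.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request at the listed rates."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if input_tokens > INPUT_LIMIT_TOKENS:
        raise ValueError("input exceeds the listed 200K-token limit")
    if output_tokens > OUTPUT_LIMIT_TOKENS:
        raise ValueError("output exceeds the listed 100K-token limit")
    return (input_tokens * INPUT_USD_PER_MILLION
            + output_tokens * OUTPUT_USD_PER_MILLION) / 1_000_000


# 50K input tokens and 10K output tokens: 0.50 + 0.40 = 0.90 USD
print(f"{estimate_cost_usd(50_000, 10_000):.2f}")
```

Because output tokens cost four times as much as input tokens at these rates, long responses tend to dominate the bill even when the prompt is much larger.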
Sample API Code
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="o3",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"}
    ]
)

print(response.choices[0].message.content)
Required Libraries
openai
Benchmarks
Benchmark | Score | Source | Notes |
---|---|---|---|
(name missing) | 1413 | lmarena.ai | - |
(name missing) | 1187 | lmarena.ai | - |
(name missing) | 1302 | lmarena.ai | - |
(name missing) | 83.3 | Vellum LLM Leaderboard | Score in percentage (%) |
(name missing) | 91.6 | Vellum LLM Leaderboard | Score in percentage (%) |
(name missing) | 69.1 | Vellum LLM Leaderboard | Score in percentage (%) |
(name missing) | 81.3 | Vellum LLM Leaderboard | Score in percentage (%) |
(name missing) | 80.71 | LiveBench | Score for 'o3 High' |
(name missing) | 93.33 | LiveBench | Score for 'o3 High' |
(name missing) | 76.71 | LiveBench | Score for 'o3 High' |
(name missing) | 85.00 | LiveBench | Score for 'o3 High' |
(name missing) | 67.02 | LiveBench | Score for 'o3 High' |
(name missing) | 76.00 | LiveBench | Score for 'o3 High' |
(name missing) | 86.17 | LiveBench | Score for 'o3 High' |
(name missing) | 79.25 | LiveBench | Score for 'o3 Medium' |
(name missing) | 91.00 | LiveBench | Score for 'o3 Medium' |
(name missing) | 77.86 | LiveBench | Score for 'o3 Medium' |
(name missing) | 80.66 | LiveBench | Score for 'o3 Medium' |
(name missing) | 68.19 | LiveBench | Score for 'o3 Medium' |
(name missing) | 73.48 | LiveBench | Score for 'o3 Medium' |
(name missing) | 84.32 | LiveBench | Score for 'o3 Medium' |
(name missing) | n/a | Vellum LLM Leaderboard | - |
(name missing) | n/a | Vellum LLM Leaderboard | - |
(name missing) | n/a | Vellum LLM Leaderboard | - |
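The LiveBench rows above do not carry benchmark names, so what each score measures cannot be read off the table. One thing the numbers themselves do show is that, in each LiveBench group, the first score equals the mean of the six that follow, which suggests (as an assumption, not a confirmed fact) that the first row is an overall average. The sketch below checks that arithmetic; the helper name `mean_of_rest` is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

# Scores copied from the LiveBench rows above, in table order.
O3_HIGH = ["80.71", "93.33", "76.71", "85.00", "67.02", "76.00", "86.17"]
O3_MEDIUM = ["79.25", "91.00", "77.86", "80.66", "68.19", "73.48", "84.32"]


def mean_of_rest(scores: list[str]) -> Decimal:
    """Mean of every score after the first, rounded half-up to 2 decimals."""
    rest = [Decimal(s) for s in scores[1:]]
    mean = sum(rest) / Decimal(len(rest))
    return mean.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


for label, scores in (("o3 High", O3_HIGH), ("o3 Medium", O3_MEDIUM)):
    # Prints the listed first score next to the recomputed mean.
    print(label, scores[0], mean_of_rest(scores))
```

`Decimal` is used instead of `float` so that the half-way case (80.705 for 'o3 High') rounds predictably.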
Notes
o3 is a well-rounded and powerful model across domains. It sets a new standard for math, science, coding, and visual reasoning tasks. It also excels at technical writing and instruction-following. Use it to think through multi-step problems that involve analysis across text, code, and images. Supports Flex Processing pricing.
Capabilities
function calling
long context
multimodal input
reasoning token support
streaming
structured outputs
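Several of these capabilities map onto request parameters of the Chat Completions API. The sketch below only builds the keyword arguments and makes no network call. The tool `get_weather`, the schema name `city_report`, the helper names, and the token budget are all hypothetical, and whether `reasoning_effort` corresponds to the 'o3 High' / 'o3 Medium' labels in the benchmark notes is an assumption.

```python
import json

# Hypothetical tool: name, description and schema are invented for illustration.
WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
            "additionalProperties": False,
        },
    },
}

# Hypothetical JSON schema for a structured-output response.
CITY_REPORT_FORMAT = {
    "type": "json_schema",
    "json_schema": {
        "name": "city_report",
        "strict": True,
        "schema": {
            "type": "object",
            "properties": {"city": {"type": "string"}, "summary": {"type": "string"}},
            "required": ["city", "summary"],
            "additionalProperties": False,
        },
    },
}


def build_tool_request(prompt: str, *, stream: bool = False) -> dict:
    """Keyword arguments for a function-calling request (optionally streamed)."""
    return {
        "model": "o3",
        "messages": [{"role": "user", "content": prompt}],
        "tools": [WEATHER_TOOL],
        "reasoning_effort": "medium",
        "max_completion_tokens": 4_000,  # arbitrary budget below the 100K output limit
        "stream": stream,
    }


def build_structured_request(prompt: str) -> dict:
    """Keyword arguments for a request whose reply must match CITY_REPORT_FORMAT."""
    return {
        "model": "o3",
        "messages": [{"role": "user", "content": prompt}],
        "response_format": CITY_REPORT_FORMAT,
    }


request = build_tool_request("What's the weather in Paris?")
print(json.dumps(request, indent=2))
# To send it (requires the openai package and an API key):
#   from openai import OpenAI
#   response = OpenAI().chat.completions.create(**request)
```

Keeping request construction separate from the API call makes the payload easy to inspect and test before any tokens are spent.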
Supported Data Types
Input Types
text
image
Output Types
text
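Since the model accepts both text and images as input, a single user message can carry both. The following is a minimal sketch of one way to build such a message for the Chat Completions API, using an inline base64 data URL; the helper name `image_message` and the placeholder bytes are hypothetical, and a real call would read the bytes from an actual image file.

```python
import base64


def image_message(question: str, image_bytes: bytes, mime: str = "image/png") -> dict:
    """Return a user message pairing a text question with an inline image."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": question},
            {"type": "image_url", "image_url": {"url": f"data:{mime};base64,{encoded}"}},
        ],
    }


# Placeholder bytes stand in for a real image read with open(path, "rb").read().
message = image_message("What does this chart show?", b"\x89PNG placeholder")
print([part["type"] for part in message["content"]])
```

The resulting dictionary can be placed in the `messages` list of a request in the same way as the plain-text messages in the sample code.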
Strengths & Weaknesses
Exceptional at
math
science
coding
visual reasoning
technical writing
instruction following
multi-step problem solving across modalities
Good at
general reasoning
well-rounded performance
Additional Information
Latest Update
Apr 16, 2025
Knowledge Cutoff
Jun 1, 2024