Back to all models

GPT-4o Realtime

OpenAI · GPT-4o Realtime

Preview
Flagship
Latest in family

Key Metrics

Input Limit

128K tokens

Output Limit

4.1K tokens

Input Cost

$5.00/1M

Output Cost

$20.00/1M

Sample API Code

Required Libraries

Notes

This is a preview release of the GPT-4o Realtime model, the default and current version, capable of responding to audio and text inputs in realtime over WebRTC or a WebSocket interface. It supports function calling. Snapshots are available for version locking.

Capabilities

realtime text processing
realtime audio processing
function calling
WebRTC interface support
WebSocket interface support

Supported Data Types

Input Types

text
audio

Output Types

text
audio

Strengths & Weaknesses

Exceptional at

realtime multimodal interaction (text and audio)
complex understanding

Good at

function calling
fast response times

Poor at

structured outputs (not supported)
fine-tuning (not supported)
distillation (not supported)
predicted outputs (not supported)
image input/output (not supported)

Additional Information

Latest Update

Dec 17, 2024

Knowledge Cutoff

Oct 01, 2023