GPT-4o Realtime

OpenAI · GPT-4o Realtime

Preview

Flagship

Latest in family

Key Metrics

Input Limit

128K tokens

Output Limit

4.1K tokens

Input Cost

$5.00/1M

Output Cost

$20.00/1M

Sample API Code

Required Libraries

Notes

This is a preview release of the GPT-4o Realtime model, the default and current version, capable of responding to audio and text inputs in realtime over WebRTC or a WebSocket interface. It supports function calling. Snapshots are available for version locking.

Capabilities

realtime text processing

realtime audio processing

function calling

WebRTC interface support

WebSocket interface support

Supported Data Types

Input Types

text

audio

Output Types

text

audio

Strengths & Weaknesses

Exceptional at

realtime multimodal interaction (text and audio)

complex understanding

Good at

function calling

fast response times

Poor at

structured outputs (not supported)

fine-tuning (not supported)

distillation (not supported)

predicted outputs (not supported)

image input/output (not supported)

Additional Information

Latest Update

Dec 17, 2024

Knowledge Cutoff

Oct 01, 2023

Similar Models

babbage-002

OpenAI

available

ChatGPT-4o

OpenAI

Available

computer-use-preview

OpenAI

Preview

Similar Capabilities

Multimodal input

13 models

Long context

13 models

Function calling

23 models