AI Models with Multimodal Output Support

This page lists large language models that support multimodal output. Compare the models, see how each one implements the feature, and choose the one that best fits a project that depends on multimodal output.

Providers

OpenAI

Models with this Capability

GPT-4o mini Realtime

OpenAI · GPT-4o

ID: gpt-4o-mini-realtime-preview

preview

Input: 128K tokens

Output: 4.1K tokens

Input Cost: $0.60 per 1M tokens

Output Cost: $2.40 per 1M tokens
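
The per-token rates above translate into a per-request cost with simple arithmetic. The following is a minimal sketch, not an official calculator: the function name `estimate_cost_usd` and the example token counts are hypothetical, and it assumes the listed rates apply uniformly to every input and output token, which may not hold if audio tokens are billed differently.

```python
# Rates copied from the listing above, in USD per 1M tokens.
INPUT_COST_PER_MILLION = 0.60
OUTPUT_COST_PER_MILLION = 2.40


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in USD.

    Assumption: a single flat rate per direction (input vs. output),
    as shown on this page. Real billing may distinguish modalities.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = input_tokens / 1_000_000 * INPUT_COST_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_COST_PER_MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical request: 10,000 input tokens and 2,000 output tokens.
    cost = estimate_cost_usd(10_000, 2_000)
    print(f"Estimated cost: ${cost:.4f}")
```

At these rates an output token costs four times as much as an input token, so long responses are likely to dominate the bill for this model.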

Exceptional at:

realtime text processing
realtime audio processing
