AI Models with Audio Input Support

This page lists large language models that support audio input. Compare the models, see how each one implements the feature, and find the best option for projects that depend on audio input.

Providers

OpenAI

Models with this Capability

GPT-4o mini Realtime
OpenAI · GPT-4o
ID: gpt-4o-mini-realtime
Status: preview
Input: 128K tokens
Output: 4.1K tokens
Input Cost: $0.60/1M tokens
Output Cost: $2.40/1M tokens
Exceptional at: realtime processing, audio input
Capabilities: Audio Input, Audio Output, Text Generation

GPT-4o Audio Preview
OpenAI · GPT-4o
ID: gpt-4o-audio-preview-2024-10-01
Status: outdated
Input: 0 tokens
Output: 0 tokens
Input Cost: $40.00/1M tokens
Output Cost: $80.00/1M tokens
Exceptional at: audio input processing, audio output generation
Capabilities: Advanced Reasoning, Function Calling, Audio Input

GPT-4o Audio Preview
OpenAI · GPT-4o
ID: gpt-4o-audio-preview-2024-12-17
Status: preview
Input: 0 tokens
Output: 0 tokens
Input Cost: $40.00/1M tokens
Output Cost: $80.00/1M tokens
Exceptional at: realtime audio processing, realtime text processing
Capabilities: Realtime Text, Multimodal Input, Audio Input

GPT-4o mini Realtime
OpenAI · GPT-4o
ID: gpt-4o-mini-realtime-preview-2024-12-17
Status: preview
Input: 128K tokens
Output: 4.1K tokens
Input Cost: $0.60/1M tokens
Output Cost: $2.40/1M tokens
Exceptional at: realtime processing, audio input
Capabilities: Multimodal Input, Realtime Processing, Audio Input
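
Estimating cost from the listed prices

As a rough illustration of how the per-million-token prices listed above translate into the cost of a single request, the sketch below multiplies token counts by the listed rates. The function name `estimate_cost` and the example token counts are hypothetical, chosen only for illustration; the prices are the ones listed here for GPT-4o mini Realtime. It assumes input and output tokens are billed linearly at the listed rates, which may not match how a provider actually bills audio tokens or cached input.

```python
def estimate_cost(input_tokens, output_tokens,
                  input_price_per_1m, output_price_per_1m):
    """Estimate the USD cost of one request.

    Assumption: billing is linear in tokens at the listed per-1M rates.
    """
    input_cost = input_tokens / 1_000_000 * input_price_per_1m
    output_cost = output_tokens / 1_000_000 * output_price_per_1m
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical request: 10,000 input tokens and 2,000 output tokens,
    # priced at the rates listed for GPT-4o mini Realtime ($0.60 and $2.40 per 1M).
    cost = estimate_cost(10_000, 2_000, 0.60, 2.40)
    print(f"Estimated cost: ${cost:.4f}")  # prints "Estimated cost: $0.0108"
```

Running the same function with the GPT-4o Audio Preview rates ($40.00 and $80.00 per 1M) gives a quick side-by-side comparison for a given workload.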
