AI Models with Audio Support

This page lists large language models that support audio. Compare the models, see how each one implements the capability, and choose the best fit for projects that depend on reliable audio support.

Providers

Google

Models with this Capability

Gemini 1.5 Flash-8B

Google · Gemini

ID: google-gemini-1.5-flash-8b

Status: Current
Input: 1M tokens
Output: 0 tokens
Input Cost: $0.04/1M tokens
Output Cost: $0.15/1M tokens
Capabilities: Multimodal Input (+4 more)
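The listed prices are per 1M tokens, so a rough cost for a single request can be worked out from its token counts. The sketch below is only an illustration: the helper name `estimate_cost_usd` and the example token counts are hypothetical, the rates are copied from this listing, and it assumes simple linear per-token billing, which may not match the provider's actual pricing rules.

```python
from decimal import Decimal

# Rates copied from the listing above, in USD per 1M tokens.
INPUT_USD_PER_MTOK = Decimal("0.04")
OUTPUT_USD_PER_MTOK = Decimal("0.15")
TOKENS_PER_MTOK = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Hypothetical helper: linear per-token cost at the listed rates.

    Assumes no tiering, caching discounts, or minimum charges.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = INPUT_USD_PER_MTOK * input_tokens / TOKENS_PER_MTOK
    output_cost = OUTPUT_USD_PER_MTOK * output_tokens / TOKENS_PER_MTOK
    return input_cost + output_cost


if __name__ == "__main__":
    # Example: 500,000 input tokens and 2,000 output tokens (made-up figures).
    print(estimate_cost_usd(500_000, 2_000))
```

`Decimal` is used instead of floats so that small per-request amounts add up without rounding drift when many requests are summed.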

Similar Capabilities