Back to all models
Get all the details on GPT-4o mini Audio, an AI model from OpenAI. This page covers its token limits, pricing structure, key capabilities such as streaming, multimodal_input, function_calling, available API code samples, and performance strengths.
Key Metrics
Input Limit
128K tokens
Output Limit
16.4K tokens
Input Cost
$0.15/1M
Output Cost
$0.60/1M
Sample API Code
from openai import OpenAI
client = OpenAI()
# Example: Text-to-Speech
speech_file_path = "speech.mp3"
response = client.audio.speech.create(
model="gpt-4o-mini-audio-preview",
voice="alloy",
input="Hello, this is a test of the GPT-4o mini audio preview model."
)
response.stream_to_file(speech_file_path)
# Example: Speech-to-Text (Transcription)
audio_file= open("audio.mp3", "rb")
transcript = client.audio.transcriptions.create(
model="gpt-4o-mini-audio-preview",
file=audio_file
)
print(transcript.text)
Required Libraries
openai
openai
Notes
A smaller model capable of audio inputs and outputs, designed to input audio or create audio outputs via the REST API. It is a preview release.
Capabilities
Supported Data Types
Input Types
text
audio
Output Types
text
audio
Strengths & Weaknesses
Exceptional at
audio input processing
audio output generation
Good at
realtime conversations
transcription
speech synthesis
Poor at
structured outputs
fine tuning
Additional Information
Latest Update
Dec 17, 2024
Knowledge Cutoff
Oct 1, 2023