Back to all models
GPT-4o Audio Preview - In-Depth Overview
OpenAI · GPT-4o
preview
Model ID: gpt-4o-audio-preview-2024-12-17
Get all the details on GPT-4o Audio Preview, an AI model from OpenAI. This page covers its token limits, pricing structure, key capabilities such as realtime_text, multimodal_input, audio_input, available API code samples, and performance strengths.
Key Metrics
Input Limit
No data tokens
Output Limit
No data tokens
Input Cost
$40.00/1M
Output Cost
$80.00/1M
Sample API Code
from openai import OpenAI
client = OpenAI()
speech_file_path = "speech.mp3"
response = client.audio.speech.create(
model="gpt-4o-audio-preview-2024-12-17",
voice="alloy",
input="Hello, this is a test of the GPT-4o audio preview model."
)
response.stream_to_file(speech_file_path)
Required Libraries
openai
openai
Notes
GPT-4o model capable of audio inputs and outputs, supporting realtime text and audio interactions.
Supported Data Types
Input Types
audio
text
Output Types
audio
text
Strengths & Weaknesses
Exceptional at
realtime audio processing
realtime text processing
Good at
audio input
audio output
multimodal understanding
Additional Information
Latest Update
Dec 17, 2024
Knowledge Cutoff
No data