Logo
Back to all models

GPT-4o Audio Preview - In-Depth Overview

OpenAI · GPT-4o

preview

Model ID: gpt-4o-audio-preview-2024-12-17

Get all the details on GPT-4o Audio Preview, an AI model from OpenAI. This page covers its token limits, pricing structure, key capabilities such as realtime_text, multimodal_input, audio_input, available API code samples, and performance strengths.

Key Metrics

Input Limit

No data tokens

Output Limit

No data tokens

Input Cost

$40.00/1M

Output Cost

$80.00/1M

Sample API Code

from openai import OpenAI
client = OpenAI()
speech_file_path = "speech.mp3"
response = client.audio.speech.create(
  model="gpt-4o-audio-preview-2024-12-17",
  voice="alloy",
  input="Hello, this is a test of the GPT-4o audio preview model."
)
response.stream_to_file(speech_file_path)

Required Libraries

openai
openai

Notes

GPT-4o model capable of audio inputs and outputs, supporting realtime text and audio interactions.

Supported Data Types

Input Types

audio
text

Output Types

audio
text

Strengths & Weaknesses

Exceptional at

realtime audio processing
realtime text processing

Good at

audio input
audio output
multimodal understanding

Additional Information

Latest Update

Dec 17, 2024

Knowledge Cutoff

No data