GPT-4o mini Audio - In-Depth Overview

OpenAI · GPT-4o

preview

Get all the details on GPT-4o mini Audio, an AI model from OpenAI. This page covers its token limits, pricing, key capabilities (audio processing, function calling, multimodal input), a sample API call, and performance strengths.

Key Metrics

Input Limit

128K tokens

Output Limit

16.4K tokens

Input Cost

$0.15 per 1M tokens

Output Cost

$0.60 per 1M tokens
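
As a rough illustration of how the per-token rates above turn into a per-request cost, the sketch below multiplies token counts by the listed prices. It assumes the listed rates apply uniformly to every input and output token, which may not hold for audio tokens. The function name `estimate_cost_usd` and the token counts are made up for illustration.

```python
# Rates copied from the Key Metrics on this page (USD per 1M tokens).
INPUT_USD_PER_MILLION = 0.15
OUTPUT_USD_PER_MILLION = 0.60


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate request cost, assuming one flat rate per direction."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = input_tokens / 1_000_000 * INPUT_USD_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_USD_PER_MILLION
    return input_cost + output_cost


# Hypothetical request: 10,000 input tokens and 300 output tokens.
print(f"${estimate_cost_usd(10_000, 300):.6f}")
```

Treat the result as an estimate only; the provider's current pricing page is the authority on what is actually billed.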

Sample API Code

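import base64
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Audio is sent inline as base64; "speech.wav" is a placeholder path.
with open("speech.wav", "rb") as f:
    audio_b64 = base64.b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
  model="gpt-4o-mini-audio-preview",
  modalities=["text"],  # request a text-only reply
  messages=[
    {
      "role": "user",
      "content": [
        {"type": "text", "text": "What's in this audio file?"},
        {
          # Audio content parts use "input_audio" with base64 data and a format.
          "type": "input_audio",
          "input_audio": {"data": audio_b64, "format": "wav"},
        },
      ],
    }
  ],
  max_tokens=300
)
print(response.choices[0].message.content)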

Required Libraries

openai

Notes

This is a preview release of GPT-4o mini Audio, the smaller GPT-4o audio model. It accepts audio input and can produce audio output through the REST API.
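
The note above mentions audio output as well as audio input. A minimal sketch of requesting a spoken reply through the Chat Completions endpoint might look like the following. It assumes the `openai` Python package is installed and `OPENAI_API_KEY` is set. The helper names, the `alloy` voice choice, and the `reply.wav` path are illustrative choices, not something this page specifies.

```python
import base64


def build_audio_request(prompt: str, voice: str = "alloy") -> dict:
    """Build keyword arguments for a chat completion that returns speech."""
    return {
        "model": "gpt-4o-mini-audio-preview",
        "modalities": ["text", "audio"],  # ask for text plus audio
        "audio": {"voice": voice, "format": "wav"},
        "messages": [{"role": "user", "content": prompt}],
    }


def save_audio(b64_audio: str, path: str) -> int:
    """Decode base64 audio and write it to disk; return bytes written."""
    raw = base64.b64decode(b64_audio)
    with open(path, "wb") as f:
        f.write(raw)
    return len(raw)


def run_example() -> None:
    """Send the request and save the spoken reply (needs network access)."""
    from openai import OpenAI  # imported here so the helpers work offline

    client = OpenAI()
    response = client.chat.completions.create(**build_audio_request("Say hello."))
    audio = response.choices[0].message.audio
    print(audio.transcript)  # text transcript of the spoken reply
    save_audio(audio.data, "reply.wav")  # audio.data is base64-encoded
```

Calling `run_example()` performs the actual request; the two helpers can be exercised on their own without network access.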

Capabilities

audio processing
function calling
multimodal input
multimodal output
streaming
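
Function calling, listed above, generally works by passing tool definitions with the request and then executing whatever call the model asks for. The sketch below shows only the local half: a tool definition in the Chat Completions `tools` format and a dispatcher that runs a requested call. The `get_weather` function, its schema, and the hard-coded reply are hypothetical; in a real exchange the name and JSON arguments would come from `response.choices[0].message.tool_calls`.

```python
import json

# Hypothetical tool definition, passed as tools=[WEATHER_TOOL] in a request.
WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Return the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}


def get_weather(city: str) -> str:
    """Stand-in implementation; a real one would query a weather service."""
    return f"Sunny in {city}"


# Maps tool names the model may request to local Python callables.
HANDLERS = {"get_weather": get_weather}


def dispatch_tool_call(name: str, arguments_json: str) -> str:
    """Run the local function named in a tool call.

    The API delivers arguments as a JSON string, so decode them first.
    """
    if name not in HANDLERS:
        raise KeyError(f"unknown tool: {name}")
    kwargs = json.loads(arguments_json)
    return HANDLERS[name](**kwargs)


print(dispatch_tool_call("get_weather", '{"city": "Paris"}'))
```

Keeping the handlers in an explicit dictionary means the model can only trigger functions that were deliberately registered.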

Supported Data Types

Input Types

text
audio

Output Types

text
audio

Strengths & Weaknesses

Exceptional at

audio processing

Additional Information

Latest Update

Dec 17, 2024

Knowledge Cutoff

Oct 1, 2023