GPT-4o Transcribe - In-Depth Overview
OpenAI · GPT-4o
Current
Latest in family
Model ID: gpt-4o-transcribe
This page summarizes GPT-4o Transcribe, a speech-to-text model from OpenAI. It covers the model's token limits, pricing, key capabilities (speech-to-text, improved accuracy, language recognition), a sample API call, and its strengths.
Key Metrics
Input Limit: 16K tokens
Output Limit: 2K tokens
Input Cost: $6.00 per 1M tokens
Output Cost: $10.00 per 1M tokens
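To make the pricing concrete, here is a minimal sketch of a per-request cost estimate based on the figures listed above. It assumes both prices are per 1M tokens and that "16K" and "2K" mean 16,000 and 2,000 tokens; the function and constant names are hypothetical and not part of any OpenAI library.

```python
# Figures taken from the Key Metrics listed on this page.
# Assumption: prices are per 1,000,000 tokens, and "16K"/"2K" mean 16,000/2,000.
INPUT_COST_PER_M = 6.00    # USD per 1M input tokens
OUTPUT_COST_PER_M = 10.00  # USD per 1M output tokens
INPUT_LIMIT = 16_000       # tokens
OUTPUT_LIMIT = 2_000       # tokens


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one request (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_COST_PER_M + output_tokens * OUTPUT_COST_PER_M
    return total / 1_000_000


def within_limits(input_tokens: int, output_tokens: int) -> bool:
    """Check a request against the listed input and output limits."""
    return input_tokens <= INPUT_LIMIT and output_tokens <= OUTPUT_LIMIT


if __name__ == "__main__":
    # A request that uses the full listed limits.
    print(f"${estimate_cost(INPUT_LIMIT, OUTPUT_LIMIT):.3f}")
```

Under these assumptions, a request that fills both limits costs (16,000 × $6.00 + 2,000 × $10.00) / 1,000,000 = $0.116.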
Sample API Code
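from openai import OpenAI

# The client reads the API key from the OPENAI_API_KEY environment variable.
client = OpenAI()

# Open the audio in binary mode; the context manager closes the file afterwards.
with open("/path/to/audio.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="gpt-4o-transcribe",
        file=audio_file,
    )

print(transcript.text)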
Required Libraries
openai
Notes
GPT-4o Transcribe is a speech-to-text model that uses GPT-4o to transcribe audio. Compared with the original Whisper models, it offers a lower word error rate and better language recognition. Use it when you need more accurate transcripts.
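Word error rate (WER), the metric mentioned above, is conventionally the word-level edit distance between a reference transcript and the model's output, divided by the number of reference words. The sketch below is one common way to compute it and may help when comparing transcripts from different models. The function name and the simple lowercase, whitespace-based tokenization are assumptions for illustration, not how OpenAI evaluates the model.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: words are compared case-insensitively after splitting on
    whitespace; punctuation is not normalized.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One word dropped out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

A lower value means a closer match to the reference. Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.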
Supported Data Types
Input Types
audio
text
Output Types
text
Strengths & Weaknesses
Exceptional at
speech-to-text accuracy
language recognition
Additional Information
Latest Update
Jun 1, 2024
Knowledge Cutoff
Jun 1, 2024