Details on GPT-4o Transcribe, a speech-to-text model from OpenAI. This page covers its token limits, pricing, key capabilities (audio transcription, language recognition, speech to text), a sample API call, and performance strengths.
Key Metrics
Input Limit
16K tokens
Output Limit
2K tokens
Input Cost
$6.00 / 1M tokens
Output Cost
$10.00 / 1M tokens
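The rates above can be turned into a rough per-request estimate. The sketch below assumes the listed prices are per one million tokens and that input and output tokens are billed separately; `estimate_cost` and the constant names are illustrative helpers, not part of any OpenAI library.

```python
# Rates copied from the Key Metrics above (assumed to be USD per 1M tokens).
INPUT_COST_PER_MILLION = 6.00
OUTPUT_COST_PER_MILLION = 10.00


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one transcription request."""
    input_cost = input_tokens * INPUT_COST_PER_MILLION
    output_cost = output_tokens * OUTPUT_COST_PER_MILLION
    return (input_cost + output_cost) / 1_000_000


# A request that uses the full listed limits: 16K tokens in, 2K tokens out.
print(f"${estimate_cost(16_000, 2_000):.3f}")
```

At the listed limits, the input side contributes most of the cost (about $0.096 of roughly $0.116), so audio length is likely the main cost driver.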
Sample API Code
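import openai

# The client reads the API key from the OPENAI_API_KEY environment variable
client = openai.OpenAI()

# Open the audio in binary mode; the with-block closes the file afterwards
with open("/path/to/audio/file", "rb") as audio_file:
    transcription = client.audio.transcriptions.create(
        model="gpt-4o-transcribe",
        file=audio_file,
    )

print(transcription.text)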
Required Libraries
openai
Notes
Speech-to-text model powered by GPT-4o. Offers a lower word error rate and better language recognition and accuracy than the original Whisper models. Use it when you need more accurate transcripts.
Capabilities
audio transcription
language recognition
speech to text
Supported Data Types
Input Types
audio
text
Output Types
text
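The input types above list text alongside audio. One plausible reading is that the text input corresponds to the optional `prompt` argument of the transcription endpoint, with `language` as a second optional hint. The sketch below works under that assumption; `build_transcription_request` and `transcribe` are hypothetical helper names, and whether this model honors both hints should be checked against the OpenAI API reference.

```python
def build_transcription_request(audio_file, language=None, prompt=None):
    """Assemble keyword arguments for client.audio.transcriptions.create."""
    request = {"model": "gpt-4o-transcribe", "file": audio_file}
    if language is not None:
        # Assumption: an ISO-639-1 code such as "en" for the spoken language.
        request["language"] = language
    if prompt is not None:
        # Assumption: short text context, e.g. names or jargon in the audio.
        request["prompt"] = prompt
    return request


def transcribe(path, language=None, prompt=None):
    """Send the audio file at `path` to the API and return the transcript."""
    from openai import OpenAI  # imported here so the helper above has no dependency

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    with open(path, "rb") as audio_file:
        request = build_transcription_request(audio_file, language, prompt)
        return client.audio.transcriptions.create(**request).text


# Example call (requires network access and an API key):
# print(transcribe("/path/to/audio/file", language="en", prompt="GPT-4o, Whisper"))
```

Keeping the request assembly separate from the network call makes the optional arguments easy to inspect without sending any audio.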
Strengths & Weaknesses
Exceptional at
speech to text accuracy
language recognition
Additional Information
Latest Update
Jun 1, 2024
Knowledge Cutoff
Jun 1, 2024