GPT-4o Transcribe - In-Depth Overview

OpenAI · GPT-4o

Current

Latest in family

Model ID: gpt-4o-transcribe

Get all the details on GPT-4o Transcribe, an AI model from OpenAI. This page covers its token limits, pricing structure, key capabilities (speech-to-text, improved accuracy, language recognition), sample API code, and performance strengths.

Key Metrics

Input Limit

16K tokens

Output Limit

2K tokens

Input Cost

$6.00 per 1M tokens

Output Cost

$10.00 per 1M tokens
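The rates above can be turned into a rough per-request estimate. The following is a minimal sketch, assuming the listed prices are per 1M tokens and that "16K" and "2K" mean 16,000 and 2,000 tokens; `estimate_cost` and the two constants are hypothetical names introduced for illustration, not part of the OpenAI library.

```python
# Rates copied from the Key Metrics above, assumed to be US dollars per 1M tokens.
INPUT_COST_PER_1M = 6.00
OUTPUT_COST_PER_1M = 10.00


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in US dollars for one request (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = input_tokens / 1_000_000 * INPUT_COST_PER_1M
    output_cost = output_tokens / 1_000_000 * OUTPUT_COST_PER_1M
    return input_cost + output_cost


if __name__ == "__main__":
    # A request that uses the full input and output limits listed above.
    print(f"${estimate_cost(16_000, 2_000):.4f}")
```

Actual billing depends on how the service counts tokens for a given audio file, so treat the result as an upper-level estimate rather than an invoice figure.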

Sample API Code

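from openai import OpenAI

# The client reads the API key from the OPENAI_API_KEY environment variable.
client = OpenAI()

# Open the audio in binary mode; the context manager closes the file afterwards.
with open("/path/to/audio.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="gpt-4o-transcribe",
        file=audio_file,
    )

print(transcript.text)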

Required Libraries

openai

Notes

GPT-4o Transcribe is a speech-to-text model that uses GPT-4o to transcribe audio. Compared to the original Whisper models, it offers a lower word error rate and better language recognition and accuracy. Use it when transcript accuracy is the priority.
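Word error rate (WER) is commonly defined as the number of word substitutions, deletions, and insertions needed to turn the transcript into a reference text, divided by the number of words in the reference. The sketch below computes it with a word-level edit distance so you can compare transcripts against your own reference text. The function name and the normalization (lowercasing, splitting on whitespace) are assumptions made for illustration; this is not necessarily how OpenAI measures the model.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length (hypothetical helper)."""
    # Assumed normalization: lowercase and split on whitespace only.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn the first i reference words into nothing
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    reference = "the cat sat on the mat"
    hypothesis = "the cat sit on mat"
    # One substitution (sat -> sit) and one deletion (the) over 6 reference words.
    print(round(word_error_rate(reference, hypothesis), 3))
```

A lower value means a closer match; punctuation and number formatting can inflate the score, so a stricter normalization step may be worth adding before comparing models.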