
GPT-4o Transcribe - In-Depth Overview

OpenAI · GPT-4o

Current
Latest in family

This page covers GPT-4o Transcribe, a speech-to-text model from OpenAI: its token limits, pricing, key capabilities (audio transcription, language recognition, speech to text), a sample API call, and performance strengths.

Key Metrics

Input Limit

16K tokens

Output Limit

2K tokens

Input Cost

$6.00 / 1M tokens

Output Cost

$10.00 / 1M tokens
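
As a rough illustration of how the rates above combine, the sketch below estimates the cost of a single request from its token counts. The function name `estimate_cost_usd` is hypothetical, and the example assumes "16K" and "2K" mean 16,000 and 2,000 tokens; actual billing may count tokens differently.

```python
# Rates copied from the Key Metrics on this page (USD per 1M tokens).
INPUT_USD_PER_MILLION = 6.00
OUTPUT_USD_PER_MILLION = 10.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its token counts (hypothetical helper)."""
    input_cost = input_tokens / 1_000_000 * INPUT_USD_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_USD_PER_MILLION
    return input_cost + output_cost


# A request at the listed limits, assuming 16,000 input and 2,000 output tokens.
print(f"${estimate_cost_usd(16_000, 2_000):.4f}")
```

Under these assumptions, a request that uses both limits in full costs about $0.116, most of it from the input side.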

Sample API Code

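import openai

# The client reads the API key from the OPENAI_API_KEY environment variable.
client = openai.OpenAI()

# Open the audio in binary mode; the with block closes the file afterwards.
with open("/path/to/audio/file", "rb") as audio_file:
    transcription = client.audio.transcriptions.create(
        model="gpt-4o-transcribe",
        file=audio_file,
    )

print(transcription.text)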
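
When the spoken language is known in advance, the request can carry a language hint. The following is a minimal sketch rather than an official sample: the wrapper `transcribe` is a hypothetical helper, and it assumes the transcription endpoint accepts an optional `language` argument as an ISO-639-1 code, which is worth checking against the current API reference.

```python
from typing import Any, Optional


def transcribe(client: Any, path: str, language: Optional[str] = None) -> str:
    """Transcribe one audio file and return its text (hypothetical helper)."""
    kwargs = {"model": "gpt-4o-transcribe"}
    if language is not None:
        # Assumption: an ISO-639-1 code such as "en" is accepted as a hint.
        kwargs["language"] = language
    # Binary mode is required; the with block closes the file afterwards.
    with open(path, "rb") as audio_file:
        result = client.audio.transcriptions.create(file=audio_file, **kwargs)
    return result.text
```

With the `openai` package installed, a call might look like `transcribe(openai.OpenAI(), "/path/to/audio/file", language="en")`. Passing the client in as a parameter keeps the helper easy to test with a stub.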

Required Libraries

openai

Notes

Speech-to-text model powered by GPT-4o. It offers a lower word error rate and better language recognition and accuracy than the original Whisper models. Use it when you need more accurate transcripts.
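
Word error rate (WER) is conventionally the number of word substitutions, deletions, and insertions needed to turn a transcript into the reference text, divided by the number of reference words. The sketch below is one simple way to compute it with a word-level edit distance. The function name `word_error_rate` and the normalization (lowercasing and whitespace splitting only) are assumptions for illustration; published evaluations typically normalize punctuation and numbers as well.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# One substituted word out of six reference words.
print(word_error_rate("the cat sat on the mat", "the cat sat on a mat"))
```

A lower value means a more accurate transcript; identical texts score 0.0, and the value can exceed 1.0 when the hypothesis contains many extra words.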

Capabilities

audio transcription
language recognition
speech to text

Supported Data Types

Input Types

audio
text

Output Types

text

Strengths & Weaknesses

Exceptional at

speech to text accuracy
language recognition

Additional Information

Latest Update

Jun 1, 2024

Knowledge Cutoff

Jun 1, 2024