Get all the details on Whisper, a speech recognition model from OpenAI. This page covers its token limits, pricing, key capabilities (language identification, multilingual speech recognition, speech to text, and speech translation), a sample API call, and its strengths.
Key Metrics
Input Limit
No data
Output Limit
No data
Input Cost
N/A (per 1M tokens)
Output Cost
N/A (per 1M tokens)
Sample API Code
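from openai import OpenAI

client = OpenAI()  # reads the API key from the OPENAI_API_KEY environment variable

# Open the audio in binary mode; the context manager closes the file afterwards.
with open("audio.mp3", "rb") as audio_file:
    transcription = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
    )

print(transcription.text)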
Required Libraries
openai
Notes
Whisper is a general-purpose speech recognition model, trained on a large dataset of diverse audio. It is also a multitask model: besides multilingual speech recognition, it can perform speech translation and language identification.
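The translation and language identification tasks mentioned above can be sketched with the same openai client. This is a minimal illustration, not an official recipe: the helper names translate_to_english and detect_language and the file name "audio.mp3" are hypothetical, and the sketch assumes that the translations endpoint returns English text and that a transcription requested with response_format="verbose_json" carries a language field.

```python
def translate_to_english(client, path, model="whisper-1"):
    """Hypothetical helper: translate the speech in `path` into English text."""
    with open(path, "rb") as audio_file:
        result = client.audio.translations.create(model=model, file=audio_file)
    return result.text


def detect_language(client, path, model="whisper-1"):
    """Hypothetical helper: return (language, transcript) for the audio in `path`.

    Assumes the verbose JSON response exposes the detected language.
    """
    with open(path, "rb") as audio_file:
        result = client.audio.transcriptions.create(
            model=model,
            file=audio_file,
            response_format="verbose_json",
        )
    return result.language, result.text


def main():
    # Imported here so the helpers above can be exercised without the SDK installed.
    from openai import OpenAI

    client = OpenAI()  # expects OPENAI_API_KEY in the environment
    print(translate_to_english(client, "audio.mp3"))  # placeholder file name
    language, text = detect_language(client, "audio.mp3")
    print(language, text)


if __name__ == "__main__":
    main()
```

Passing the client in as an argument keeps the helpers independent of how credentials are configured and makes them easy to try against a stub object.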
Capabilities
language identification
multilingual speech recognition
speech to text
speech translation
Supported Data Types
Input Types
audio
Output Types
text
Strengths & Weaknesses
Good at
general speech recognition
multilingual transcription
speech translation
language identification
Additional Information
Latest Update
Mar 1, 2023
Knowledge Cutoff
No data