Whisper - In-Depth Overview

OpenAI · Whisper

Current

Latest in family

Model ID: whisper-1

Get all the details on Whisper, a speech recognition model from OpenAI. This page covers its token limits, pricing, key capabilities (speech recognition, multilingual speech recognition, and speech translation), sample API code, and performance strengths.

Key Metrics

Input Limit: No data (tokens)

Output Limit: No data (tokens)

Input Cost: N/A (per 1M tokens)

Output Cost: N/A (per 1M tokens)

Sample API Code

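from openai import OpenAI

# The client reads the API key from the OPENAI_API_KEY environment variable.
client = OpenAI()

# Open the audio in binary mode; the context manager closes the file afterwards.
with open("/path/to/audio.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
    )

print(transcript.text)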

Required Libraries

openai

Notes

Whisper is a general-purpose speech recognition model trained on a large dataset of diverse audio. It is also a multitask model that can perform multilingual speech recognition, speech translation, and language identification.
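
The sample above covers transcription only. As a rough sketch of the other two tasks named in these notes, the helpers below call the translations endpoint (speech in another language to English text) and request a verbose JSON transcription, which includes a detected-language field. The helper names translate_to_english and detect_language are hypothetical, chosen for illustration; they assume a client created with OpenAI() as in the sample code and a placeholder audio path.

```python
from pathlib import Path


def translate_to_english(client, audio_path, model="whisper-1"):
    """Send audio to the translations endpoint and return the English text.

    `client` is assumed to be an openai.OpenAI() instance.
    """
    with Path(audio_path).open("rb") as audio_file:
        result = client.audio.translations.create(model=model, file=audio_file)
    return result.text


def detect_language(client, audio_path, model="whisper-1"):
    """Transcribe audio and return (detected_language, text).

    Requests the verbose JSON response format, which carries a
    `language` field alongside the transcript text.
    """
    with Path(audio_path).open("rb") as audio_file:
        result = client.audio.transcriptions.create(
            model=model,
            file=audio_file,
            response_format="verbose_json",
        )
    return result.language, result.text
```

With the client from the sample code, a call would look like translate_to_english(client, "/path/to/audio.mp3"). Passing the client in as an argument, rather than creating it inside each helper, keeps the helpers easy to exercise with a stub object when no API key is available.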