Whisper - In-Depth Overview

OpenAI · Whisper

Current
Latest in family

This page covers Whisper, a speech recognition model from OpenAI: its token limits, pricing, key capabilities (language identification, multilingual speech recognition, speech-to-text, and speech translation), sample API code, and strengths.

Key Metrics

Input Limit

No data

Output Limit

No data

Input Cost

N/A (per 1M tokens)

Output Cost

N/A (per 1M tokens)

Sample API Code

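from openai import OpenAI

# The client reads the API key from the OPENAI_API_KEY environment variable.
client = OpenAI()

# Open the audio in binary mode; the context manager closes the file afterwards.
with open("audio.mp3", "rb") as audio_file:
    transcription = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
    )

print(transcription.text)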

Required Libraries

openai

Notes

Whisper is a general-purpose speech recognition model trained on a large dataset of diverse audio. It is also a multitask model: in addition to transcription, it can perform multilingual speech recognition, speech translation, and language identification.
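The speech translation task mentioned above can be sketched with the translations endpoint of the openai Python library, which returns English text for non-English audio. This is a minimal sketch, not a definitive implementation: the helper name translate_to_english and the file path are hypothetical, and client is assumed to be an openai.OpenAI instance with a valid API key.

```python
def translate_to_english(client, path):
    """Return an English translation of the speech in the audio file at `path`.

    Assumptions: `client` is an `openai.OpenAI` instance and `path` points to
    a local audio file in a format the API accepts (for example MP3).
    """
    # Open in binary mode; the context manager closes the file after upload.
    with open(path, "rb") as audio_file:
        translation = client.audio.translations.create(
            model="whisper-1",
            file=audio_file,
        )
    return translation.text
```

A call would look like translate_to_english(OpenAI(), "audio.mp3"). Passing the client in as an argument, rather than creating it inside the helper, keeps the function easy to exercise with a stub in tests.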

Capabilities

language identification
multilingual speech recognition
speech to text
speech translation
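As a rough illustration of language identification through the API, a transcription requested with response_format="verbose_json" is expected to carry a language field alongside the text. The sketch below relies on that assumption; the helper name detect_language and the file path are hypothetical, and client is assumed to be an openai.OpenAI instance.

```python
def detect_language(client, path):
    """Return (language, text) for the audio file at `path`.

    Assumptions: `client` is an `openai.OpenAI` instance, and the
    verbose JSON response exposes `language` and `text` attributes.
    """
    with open(path, "rb") as audio_file:
        result = client.audio.transcriptions.create(
            model="whisper-1",
            file=audio_file,
            response_format="verbose_json",  # request metadata, not just text
        )
    return result.language, result.text
```

The language value comes back as the API reports it, so any mapping to ISO codes would need to be handled by the caller.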

Supported Data Types

Input Types

audio

Output Types

text

Strengths & Weaknesses

Good at

general speech recognition
multilingual transcription
speech translation
language identification

Additional Information

Latest Update

Mar 1, 2023

Knowledge Cutoff

No data