OpenAI offers new voice models that reason, translate and transcribe as you speak

OpenAI has just released three new real-time voice models that it says will “unlock a new class of voice applications for developers.”

Developers can create new app experiences with OpenAI’s 3 new voice models

Each of the three models has its own specialty: reasoning, translation, or transcription.

Here’s what the company announced today:

GPT‑Realtime‑2, our first voice model with GPT-5 class reasoning, capable of handling more difficult requests and moving the conversation forward naturally.
GPT‑Realtime‑Translate, a new live translation model that translates speech from over 70 input languages into 13 output languages while keeping pace with the speaker.
GPT‑Realtime‑Whisper, a new streaming speech-to-text model that transcribes speech live as the speaker talks.

OpenAI explains in more detail what is new in GPT-Realtime-2, its voice model with GPT-5 class reasoning:

GPT‑Realtime‑2 is designed for live voice interactions where the model keeps the conversation moving while it reasons about a request, invokes tools, handles corrections or interruptions, and responds in a way appropriate to the moment.

Meanwhile, the new voice translation model supports “70 input languages and 13 output languages,” the company says.

Finally, there is the real-time transcription model:

GPT‑Realtime‑Whisper is a new streaming transcription model designed for low-latency speech-to-text. It transcribes audio as people speak, so live products can feel faster, more responsive, and more natural, from the captions that appear in the moment to the meeting notes that follow the conversation.

All three new voice models are available through OpenAI’s Realtime API, the company says, at the following prices:

GPT‑Realtime‑2 is priced at $32 per 1 million audio input tokens ($0.40 for cached input tokens) and $64 per 1 million audio output tokens.
GPT‑Realtime‑Translate costs $0.034 per minute.
GPT‑Realtime‑Whisper is priced at $0.017 per minute.
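
To get a feel for what these prices mean in practice, here is a minimal Python sketch of a cost estimate. The rates come from the list above; the function names and the token and minute counts are hypothetical, and the sketch assumes that the $0.40 cached rate is also per 1 million tokens and that cached tokens are counted separately from regular input tokens. OpenAI's billing details may differ.

```python
# Rates quoted in the article, in US dollars.
REALTIME_2_INPUT_PER_M = 32.00         # per 1 million audio input tokens
REALTIME_2_CACHED_INPUT_PER_M = 0.40   # assumed to be per 1 million cached input tokens
REALTIME_2_OUTPUT_PER_M = 64.00        # per 1 million audio output tokens
TRANSLATE_PER_MINUTE = 0.034
WHISPER_PER_MINUTE = 0.017


def realtime_2_cost(input_tokens, output_tokens, cached_input_tokens=0):
    """Estimate GPT-Realtime-2 cost for token-based billing.

    Assumption: input_tokens excludes cached tokens, which are billed
    at their own rate.
    """
    total = (
        input_tokens * REALTIME_2_INPUT_PER_M
        + cached_input_tokens * REALTIME_2_CACHED_INPUT_PER_M
        + output_tokens * REALTIME_2_OUTPUT_PER_M
    )
    return total / 1_000_000


def per_minute_cost(minutes, rate_per_minute):
    """Estimate cost for the per-minute models (translation, transcription)."""
    return minutes * rate_per_minute


if __name__ == "__main__":
    # Hypothetical workload: 200k input tokens, 100k output tokens.
    print(f"GPT-Realtime-2: ${realtime_2_cost(200_000, 100_000):.2f}")
    print(f"1 hour of translation: ${per_minute_cost(60, TRANSLATE_PER_MINUTE):.2f}")
    print(f"1 hour of transcription: ${per_minute_cost(60, WHISPER_PER_MINUTE):.2f}")
```

At these rates, an hour of live translation comes to about $2.04 and an hour of transcription to about $1.02.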

You can test the new voice models in real time in the Playground. OpenAI also provides a Codex prompt for adding GPT‑Realtime‑2 to an existing application or creating a new application with it.

OpenAI’s announcement has more on its latest voice models and on how businesses are already using the new technology.
