Replit: Audio Models Added to AI Integrations
Replit added four GPT-4o audio models to its AI Integrations platform, enabling developers to build speech-to-text and text-to-speech applications without setting up separate API keys. The two transcription models (gpt-4o-transcribe and gpt-4o-transcribe-mini) handle speech-to-text conversion, while two audio output models (gpt-4o-audio and gpt-4o-audio-mini) generate spoken content. Use cases include podcast transcription tools, voice assistants, and speech-enabled app interfaces.
Four GPT-4o Audio Models Now Available
Replit's January 23, 2026 update brought audio capabilities to its AI Integrations platform, adding four OpenAI GPT-4o audio models that developers can use immediately without managing separate API credentials. The addition extends Replit's zero-setup AI tooling into the voice and audio space, an area of growing developer interest following widespread adoption of voice-enabled AI assistants.
The Four Models
The release adds two input models and two output models:
- gpt-4o-transcribe: full-quality speech-to-text transcription
- gpt-4o-transcribe-mini: a smaller, faster transcription model for latency-sensitive applications
- gpt-4o-audio: full audio generation for text-to-speech use cases
- gpt-4o-audio-mini: a compact audio output model optimized for performance
Developers can access all four through the same AI Integrations interface that already provides access to over 300 models, meaning no new accounts, no API key configuration, and no infrastructure setup.
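To make the zero-setup claim concrete, here is a minimal transcription sketch. It assumes Replit routes these models through an OpenAI-compatible client with credentials injected into the workspace; the helper function `pick_transcribe_model` and the lazy-import pattern are illustrative choices, not documented Replit API, so check the platform docs for the real base URL and auth details.

```python
# Sketch: speech-to-text with the new transcription models.
# Assumption: an OpenAI-compatible SDK works inside a Replit workspace
# with no explicit API key configuration.

def pick_transcribe_model(latency_sensitive: bool) -> str:
    """Choose between the full and mini transcription models."""
    return "gpt-4o-transcribe-mini" if latency_sensitive else "gpt-4o-transcribe"

def transcribe(path: str, latency_sensitive: bool = False) -> str:
    """Transcribe an audio file and return plain text."""
    # Imported lazily so the sketch loads even without the SDK installed.
    from openai import OpenAI
    client = OpenAI()  # in a Replit workspace, credentials are assumed injected
    with open(path, "rb") as audio_file:
        result = client.audio.transcriptions.create(
            model=pick_transcribe_model(latency_sensitive),
            file=audio_file,
        )
    return result.text
```

The mini model trades some accuracy for speed, so a meeting-notes tool might default to the full model while a live captioning feature opts into the mini variant.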
What Developers Can Build
The transcription models open up use cases like podcast and meeting transcription tools, audio file processors, and accessibility features for existing apps. The audio output models support voice assistant interfaces, text-to-speech narration engines, and speech-enabled interaction layers for web and mobile applications.
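For the output direction, a text-to-speech sketch under the same assumptions: OpenAI's GPT-4o audio models generate speech through the chat completions audio modality, so this example follows that shape. The voice name, output format, and the premise that Replit exposes this path unchanged are all unverified assumptions; `build_tts_request` is a hypothetical helper for illustration.

```python
# Sketch: text-to-speech narration with an audio output model.
# Assumption: the model is reached via chat completions with
# modalities=["text", "audio"], as in OpenAI's audio-preview API.
import base64

def build_tts_request(text: str, voice: str = "alloy") -> dict:
    """Assemble request parameters for an audio-generating completion."""
    return {
        "model": "gpt-4o-audio-mini",
        "modalities": ["text", "audio"],
        "audio": {"voice": voice, "format": "wav"},
        "messages": [{"role": "user", "content": f"Read this aloud: {text}"}],
    }

def narrate(text: str, out_path: str = "narration.wav") -> None:
    """Generate spoken audio for `text` and write it to a WAV file."""
    from openai import OpenAI
    client = OpenAI()
    completion = client.chat.completions.create(**build_tts_request(text))
    # The audio payload comes back base64-encoded on the message.
    wav_bytes = base64.b64decode(completion.choices[0].message.audio.data)
    with open(out_path, "wb") as f:
        f.write(wav_bytes)
```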
For vibe coding specifically, this is notable: a developer can now describe a podcast transcription app or a voice-controlled to-do list in natural language, and Replit's Agent can scaffold the entire application, including the audio API calls, without the developer needing to know the details of the OpenAI audio API.
Also in the January 23 Release
The same release also added web search capabilities to Agent Automations (allowing scheduled agents to search the web natively without separate API keys) and added RulesSync support for replit.md configuration files, enabling synchronization of AI agent instructions across multiple projects.