GPT-5.4 Mini Arrives in Codex: Faster Coding at One-Third the Cost
OpenAI introduced GPT-5.4 mini to Codex on March 17, 2026, a fast and efficient model that improves over GPT-5 mini across coding, reasoning, image understanding, and tool use while running more than 2x faster. In Codex, GPT-5.4 mini consumes just 30% of the GPT-5.4 quota, so developers can complete approximately 3.3x more comparable tasks within included limits. The model is available across the Codex app, CLI, IDE extension, web interface, and the API, and is recommended for codebase exploration, large-file review, processing supporting documents, and less reasoning-intensive subagent work.
GPT-5.4 Mini Comes to Codex
OpenAI introduced GPT-5.4 mini to Codex on March 17, 2026, bringing a fast and capable model to every surface where Codex is available: the desktop app, CLI, IDE extension, web interface, and the API. The release positions GPT-5.4 mini as the recommended choice for high-throughput and parallel workloads within Codex, complementing GPT-5.4 for tasks that demand deeper reasoning.
Quota Efficiency: 3.3x More Work Within Included Limits
One of the most immediately practical aspects of GPT-5.4 mini in Codex is its quota consumption. The model uses just 30% of the GPT-5.4 quota, which translates to approximately 3.3x more tasks within a developer's included Codex limits. For teams running continuous background agents or processing large volumes of files, this efficiency gain meaningfully extends how much work can be done without hitting quota ceilings.
At the API level, GPT-5.4 mini is priced at $0.75 per million input tokens and $4.50 per million output tokens, roughly one-third the cost of GPT-5.4. This pricing opens the door to large-scale batch processing pipelines that would previously have been cost-prohibitive.
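The quota and pricing arithmetic above can be sketched in a few lines. This is a back-of-the-envelope estimate using only the figures quoted in this article; the example token volumes are illustrative, and no official GPT-5.4 rates are assumed beyond "roughly one-third the cost."

```python
# Per-million-token rates for GPT-5.4 mini, as quoted above.
MINI_INPUT = 0.75   # $ per 1M input tokens
MINI_OUTPUT = 4.50  # $ per 1M output tokens

def cost_usd(input_tokens: int, output_tokens: int,
             in_rate: float, out_rate: float) -> float:
    """Dollar cost of a job, given per-million-token rates."""
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000

# Example: a batch job reading 50M tokens and emitting 5M tokens.
mini_cost = cost_usd(50_000_000, 5_000_000, MINI_INPUT, MINI_OUTPUT)
print(f"GPT-5.4 mini batch cost: ${mini_cost:.2f}")  # 50*0.75 + 5*4.50 = $60.00

# Quota: mini consumes 30% of the GPT-5.4 quota per comparable task,
# so the task multiplier within included limits is 1 / 0.30.
print(f"Task multiplier: {1 / 0.30:.1f}x")  # ~3.3x
```

The 3.3x figure is simply the reciprocal of the 30% quota consumption; the dollar savings compound on top of that for API users.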
Performance: 2x Faster Than GPT-5 Mini
GPT-5.4 mini runs more than 2x faster than GPT-5 mini, making it the practical default for workflows where response latency matters. The speed improvement is consistent across the capabilities that matter most in a coding context: code generation, multi-step reasoning, image understanding, and tool use. For agentic workloads where many subtasks execute in sequence or in parallel, the throughput difference compounds quickly.
Recommended Use Cases in Codex
OpenAI recommends GPT-5.4 mini for the following use cases within Codex:
- Codebase exploration: navigating large repositories, understanding project structure, and answering questions about existing code
- Large-file review: reading and summarizing lengthy source files or configuration documents without the overhead of a heavier model
- Processing supporting documents: ingesting changelogs, specifications, or reference materials that inform but don't require deep synthesis
- Less reasoning-intensive subagent work: executing narrow, well-defined subtasks within a multi-agent pipeline
Subagent Architecture
The recommended deployment pattern for GPT-5.4 mini in Codex is as a subagent model: a larger, more capable model (such as GPT-5.4) handles orchestration and high-reasoning tasks, while GPT-5.4 mini handles the parallel leaf-node work: file reads, search queries, document summaries, and targeted code edits. This architecture maximizes both quality and throughput, using the heavier model only where it is strictly necessary.
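The orchestrator/subagent split can be sketched as follows. Here `run_model` is a placeholder for a real Codex or API invocation, and the model names, task strings, and routing logic are illustrative assumptions, not an actual SDK interface.

```python
# Sketch of the subagent pattern: cheap leaf tasks fan out in parallel,
# and only the final synthesis step uses the heavier model.
from concurrent.futures import ThreadPoolExecutor

def run_model(model: str, task: str) -> str:
    """Stand-in for a model invocation; replace with a real API call."""
    return f"[{model}] {task}"

LEAF_TASKS = [
    "read src/main.py and summarize it",
    "search the repo for TODO markers",
    "summarize CHANGELOG.md",
]

def orchestrate() -> str:
    # Fan the narrow, well-defined leaf tasks out to the mini model...
    with ThreadPoolExecutor(max_workers=8) as pool:
        leaf_results = list(pool.map(
            lambda t: run_model("gpt-5.4-mini", t), LEAF_TASKS))
    # ...then hand the combined context to the heavier model for the
    # reasoning-intensive synthesis step.
    return run_model("gpt-5.4",
                     "synthesize a plan from:\n" + "\n".join(leaf_results))

print(orchestrate())
```

The design point is that the heavier model sees only the distilled outputs of the leaf work, so quota spend on GPT-5.4 stays proportional to the reasoning actually required.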
The model ships with a 400,000-token context window, large enough to process an entire codebase or a batch of supporting documents in a single pass without chunking. This removes a common source of complexity in agentic pipelines that previously had to manage context window limits manually.
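A quick pre-flight check against that 400,000-token window might look like the sketch below. The 4-characters-per-token heuristic is a rough approximation, not an exact tokenizer, and the output-reserve size is an arbitrary assumption; use a real tokenizer for precise counts.

```python
# Rough single-pass fit check against the 400,000-token context window.
CONTEXT_WINDOW = 400_000
CHARS_PER_TOKEN = 4  # coarse heuristic, not a tokenizer

def fits_in_one_pass(docs: list[str], reserve_for_output: int = 16_000) -> bool:
    """True if the batch of documents likely fits in one request
    without chunking, leaving headroom for the model's output."""
    est_tokens = sum(len(d) for d in docs) // CHARS_PER_TOKEN
    return est_tokens + reserve_for_output <= CONTEXT_WINDOW

print(fits_in_one_pass(["x" * 1_200_000]))  # ~300k tokens -> True
print(fits_in_one_pass(["x" * 2_000_000]))  # ~500k tokens -> False
```

When the check fails, the pipeline still needs a chunking fallback; the point of the large window is that, for most repositories and document batches, that fallback rarely triggers.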
Benchmarks and Capability Improvements
GPT-5.4 mini improves over GPT-5 mini across all major capability benchmarks relevant to development work: coding, reasoning, image understanding, and tool use. The improvements are incremental rather than transformative (this is a refinement of the mini-class model, not a generational leap), but the cumulative effect is a model that handles a broader range of Codex tasks reliably without needing to escalate to GPT-5.4.
Availability
GPT-5.4 mini is available immediately across all Codex surfaces: the Codex desktop app, Codex CLI, the IDE extension, the web interface, and the Codex API. Developers accessing Codex through the API can specify the model directly; those using the app or CLI can select it from the model picker.