Claude Code Raises Output Token Defaults: 64k for Opus 4.6, 128k Upper Bound for Opus and Sonnet

Claude Code · Mar 16, 2026

Claude Code v2.1.77 raises the default maximum output token limit for Claude Opus 4.6 to 64k tokens, doubling the previous 32k cap that had been blocking long-form code generation workflows. The upper bound for both Opus 4.6 and Sonnet 4.6 is now 128k tokens, matching the raw API capability that had been inaccessible through the CLI. This resolves a long-standing pain point for multi-agent pipelines where truncated outputs forced developers to implement manual retry loops and environment variable workarounds.

Sources & Mentions

CLAUDE_CODE_MAX_OUTPUT_TOKENS has no effect on Opus 4.6 (capped at 32K) · Issue #29488 (GitHub)

CLAUDE_CODE_MAX_OUTPUT_TOKENS not applied to subagent (Task tool) API calls — hardcoded 32K limit · Issue #25569 (GitHub)

Claude Opus 4.6 Introduces Adaptive Reasoning and Context Compaction for Long-Running Agents (InfoQ)

Claude Code Token Limits: A Guide for Engineering Leaders (Faros AI)

Claude Code by Anthropic — Release Notes — March 2026 Latest Updates (Releasebot)

Increased Output Token Limits for Opus 4.6 and Sonnet 4.6

Claude Code v2.1.77 delivers a long-awaited change for developers running heavy agentic workloads: the default maximum output token limit for Claude Opus 4.6 has been raised from 32k to 64k tokens, and the configurable upper bound for both Opus 4.6 and Sonnet 4.6 has been lifted to 128k tokens.

The Problem This Solves

Prior to this release, Claude Code enforced a hard 32k output token ceiling on Opus 4.6. Even when developers explicitly set CLAUDE_CODE_MAX_OUTPUT_TOKENS=64000, the environment variable had no effect. This created a particularly painful failure mode in extended thinking mode, where thinking tokens count against the same output limit: with the default MAX_THINKING_TOKENS set to 31,999, the available budget for actual code output shrank to under 1,000 tokens. Developers building multi-agent code generation pipelines were forced to implement --continue retry loops and orchestrator-level chunking to work around the limitation.
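The arithmetic behind that failure mode can be sketched in a few lines. This is an illustration only, assuming the thinking budget is subtracted from the same output limit; the helper name remaining_output_budget is hypothetical, and since the article does not say whether "32k" means 32,000 or 32,768 tokens, both readings are shown.

```python
def remaining_output_budget(max_output_tokens: int, thinking_tokens: int) -> int:
    """Tokens left for visible output once the thinking budget is spent.

    Assumes thinking tokens are counted against the same output limit.
    """
    return max(0, max_output_tokens - thinking_tokens)


DEFAULT_THINKING = 31_999  # default MAX_THINKING_TOKENS cited in the text

# Two possible readings of the old "32k" cap (an assumption, not from the release notes).
for label, cap in (("32,000", 32_000), ("32,768", 32_768)):
    left = remaining_output_budget(cap, DEFAULT_THINKING)
    print(f"cap {label}: {left} tokens left for output")
```

Under either reading, fewer than 1,000 tokens remain for the response itself, which is consistent with the truncation reports.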

The issue was tracked across multiple GitHub reports, with users describing truncated source files and incomplete codebases as a "blocking issue" for production workflows. The CLAUDE_CODE_MAX_OUTPUT_TOKENS environment variable not being honoured for Opus 4.6 was particularly frustrating because the underlying API had always supported higher limits — the bottleneck was in the Claude Code CLI enforcement layer.

What Changes in v2.1.77

Anthropic raised two separate values in this release:

Default maximum for Opus 4.6: 64k tokens (previously 32k). This is the limit used for standard sessions without any explicit override — existing workflows benefit automatically.
Upper bound for both Opus 4.6 and Sonnet 4.6: 128k tokens. Developers can now set CLAUDE_CODE_MAX_OUTPUT_TOKENS up to this ceiling and have the setting respected.
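A minimal sketch of how a wrapper script might prepare the override before launching the CLI. Only the variable name CLAUDE_CODE_MAX_OUTPUT_TOKENS comes from the release; the helper env_with_output_limit is hypothetical, and reading "128k" as exactly 128,000 tokens is an assumption. How the CLI itself treats values above the ceiling is not described here, so the clamp below is a local guard only.

```python
import os

ENV_VAR = "CLAUDE_CODE_MAX_OUTPUT_TOKENS"
UPPER_BOUND = 128_000  # assumption: "128k" interpreted as 128,000 tokens


def env_with_output_limit(requested: int, base_env=None) -> dict:
    """Return a copy of the environment with the output-token override set.

    Values above the assumed upper bound are clamped locally; the base
    environment is never modified in place.
    """
    if requested <= 0:
        raise ValueError("requested output tokens must be positive")
    env = dict(os.environ if base_env is None else base_env)
    env[ENV_VAR] = str(min(requested, UPPER_BOUND))
    return env


env = env_with_output_limit(100_000, base_env={})
print(env[ENV_VAR])
```

The returned dictionary can be passed as the env argument of subprocess.run when starting a Claude Code session from a script.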

Impact on Multi-Agent and Long-Form Workloads

For developers generating full codebases, large refactors, or complex documentation in a single agent turn, this change removes a primary source of truncated output. Sub-agents spawned via the Agent tool now benefit from the same higher default, meaning orchestrated pipelines no longer need to account for the 32k ceiling at the task level.

The higher upper bound is especially relevant in combination with extended thinking: with 128k available for total output, developers can allocate a generous thinking budget while still leaving substantial room for actual response content. The compounding problem — where the 31,999 default thinking token budget consumed almost the entire 32k output window — is now resolved by default.
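To make the trade-off concrete, the sketch below works backwards from the room a response needs to the largest thinking budget that still fits. It assumes thinking and response tokens share one output limit and treats 64k and 128k as 64,000 and 128,000; the function max_thinking_budget is a hypothetical helper, not part of Claude Code.

```python
def max_thinking_budget(output_limit: int, reserved_for_response: int) -> int:
    """Largest thinking budget that still leaves the reserved response room.

    Assumes thinking and response tokens are drawn from the same output limit.
    """
    if reserved_for_response < 0 or reserved_for_response > output_limit:
        raise ValueError("reserved_for_response must be between 0 and output_limit")
    return output_limit - reserved_for_response


# New default: the 31,999-token thinking default no longer crowds out the response.
print(64_000 - 31_999)                       # response room at the new default

# Raised ceiling: reserve half for the response, think with the rest.
print(max_thinking_budget(128_000, 64_000))  # thinking budget at the upper bound
```

With the new default, roughly half of the output window remains for the response even when the full default thinking budget is used.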