Claude Code fallbackModel: Automatic Model Failover When Your Primary Is Overloaded

Claude CodeView original changelog

Claude Code 2.1.166 introduces a fallbackModel setting that lets developers configure a chain of up to three backup models, automatically tried in sequence when the primary model is overloaded or unavailable. The feature applies to both interactive sessions and backgrounded workers, so long-running agents no longer hard-fail during peak load. The update also adds full support for disabling thinking on models that reason by default, giving teams finer control over cost and latency.

Featured Video

A video we selected to help illustrate this changelog


New Features

Automatic Model Failover with fallbackModel

Claude Code 2.1.166 ships the fallbackModel configuration setting, which allows developers to specify a prioritized chain of up to three alternative models that Claude Code will try in sequence whenever the configured primary model is overloaded or unavailable. This brings meaningful resilience to workflows that previously had a single point of failure: a 529 overload response from the API would surface directly to the user or halt an automated pipeline entirely.

The failover chain works across the full range of Claude Code usage modes. For interactive terminal sessions, the --fallback-model flag now applies, and Claude Code will switch to the configured fallback for the remainder of the session when the primary model returns a "model not found" error. For background agents, the change is equally important: sessions launched with --bg or detached with the left-arrow shortcut now preserve the --fallback-model configuration, so workers running overnight or in CI environments degrade gracefully to a backup model instead of hard-failing when the API is under load.

Separately, Claude Code now retries a failing turn once on the fallback model when the API returns an unexpected non-retryable error. Auth failures, rate-limit errors, request-size violations, and transport errors still surface immediately without a retry, preserving fast feedback for conditions the developer must explicitly fix.

The context for this release is direct. Anthropic's services experienced elevated error rates on June 2 and June 5, 2026, affecting claude.ai, the Claude API, Claude Code, and Claude Cowork. Teams running Claude Code in automated workflows had no graceful degradation path at that point. The fallbackModel setting is the intended answer.

Thinking Disable Controls

Claude Code 2.1.166 also consolidates and hardens the controls for disabling extended thinking on models that think by default. Setting MAX_THINKING_TOKENS=0, passing --thinking disabled, or using the per-model thinking toggle in session settings now reliably suppresses thinking output on all applicable models. Previously, these controls had inconsistent coverage depending on which model was active. Teams optimizing for response speed or cost by turning thinking off can now do so with confidence across their entire model configuration, including fallback models.

Quality-of-Life Updates

The claude update command now announces the target version before downloading begins, rather than going silent until the download completes. Users who previously canceled an update mid-way due to no feedback now have a clear confirmation before the download starts.

In the claude agents view, typing a URL into the session list now filters to sessions whose first prompt contained that URL, making it faster to locate a specific background agent started from a known task.

Bug Fixes

This release resolves a collection of reliability issues, including: a recurring "image could not be processed" error that also consumed extra tokens; remote sessions becoming permanently stuck after brief backend disruptions during worker registration; flickering in JetBrains IDE terminals on the 2026.1+ release line; Shift+non-ASCII characters being dropped on terminals using the Kitty keyboard protocol; orphaned --bg-pty-host processes spinning at 100% CPU on macOS after the daemon exited; and claude agents background sessions crash-looping with "No conversation found" when reopened. The /doctor command no longer shows a contradictory failed check when run inside a remote session.