GitHub Copilot CLI: Rubber Duck Now Supports Cross-Family Model Pairing

GitHub Copilot CLI's Rubber Duck cross-family review agent has expanded its model support, so both GPT-based and Claude-based orchestrator sessions can benefit from second-opinion code review. The available pairings were previously more limited. GPT-session users can now activate a Claude-powered Rubber Duck reviewer, while Claude-session users are now paired with GPT-5.5 as the critic model. The feature is accessible via /experimental in the Copilot CLI and triggers automatically at key checkpoints: after planning, after complex implementation, and after test writing.

GitHub Copilot CLI's Rubber Duck Agent Now Works Across More Model Combinations

GitHub has expanded the Rubber Duck feature in GitHub Copilot CLI, extending cross-family model review support to a broader set of session configurations. The update ensures that developers using any major model family as their primary orchestrator can now take advantage of a second-opinion reviewer drawn from a complementary AI family.

What Rubber Duck Does

Rubber Duck is a specialized review agent built into GitHub Copilot CLI that runs on a different AI model family than the one driving the primary session. The core insight behind its design is that a single model reviewing its own output remains constrained by the same training patterns and blind spots that produced the code in the first place. By bringing in a critic from a different family, with distinct training data, reasoning patterns, and architecture, Rubber Duck surfaces issues the orchestrating model might systematically overlook.

The agent focuses on a short, high-value list of concerns: architectural decisions worth reconsidering, subtle bugs, cross-file conflicts, and edge cases. It does not attempt a full re-review of every line; instead, it zeroes in on the highest-leverage feedback.

Expanded Model Pairings

With the May 7 update, the Rubber Duck feature now supports the following combinations:

  • GPT-orchestrated sessions: A Claude-powered Rubber Duck agent is dispatched to review plans and implementations, providing architectural insights and catching issues a GPT model might miss.
  • Claude-orchestrated sessions: Rubber Duck now pairs with GPT-5.5 (previously GPT-5.4) as the critic, offering enhanced feedback quality from OpenAI's latest generation model.

This bidirectional support means that regardless of which model family a developer prefers as their primary driver, a complementary reviewer is available.
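
The pairing rule above can be pictured as a simple lookup from the orchestrator's family to the critic's family. The following is a minimal sketch only: `ModelFamily`, `CRITIC_FOR`, and `pick_critic` are hypothetical names invented for illustration and do not describe how Copilot CLI is actually implemented.

```python
from enum import Enum


class ModelFamily(Enum):
    """Hypothetical labels for the two families named in this changelog."""
    GPT = "gpt"
    CLAUDE = "claude"


# Assumed pairing table, mirroring the two combinations listed above.
CRITIC_FOR = {
    ModelFamily.GPT: ModelFamily.CLAUDE,
    ModelFamily.CLAUDE: ModelFamily.GPT,
}


def pick_critic(orchestrator: ModelFamily) -> ModelFamily:
    """Return the family that would review the orchestrator's work."""
    critic = CRITIC_FOR[orchestrator]
    # Cross-family review only makes sense if the reviewer differs from the author.
    if critic is orchestrator:
        raise ValueError("critic must come from a different model family")
    return critic
```

Under these assumptions, `pick_critic(ModelFamily.GPT)` returns `ModelFamily.CLAUDE`, and the reverse holds for a Claude-orchestrated session.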

When It Activates

Rubber Duck can be triggered manually at any time by the developer, but GitHub Copilot also invokes it automatically at three checkpoints where early feedback delivers the highest return:

  1. After drafting a plan: catching a flawed approach before implementation avoids compounding errors across dozens of steps.
  2. After a complex implementation: a second perspective on multi-file changes can surface edge cases and subtle logic bugs.
  3. After writing tests, before execution: identifying gaps in test coverage or flawed assertions before running a test suite saves significant debugging time.
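
As a rough mental model, the trigger logic described above amounts to "review on a manual request, or when the session reaches one of three checkpoints." The sketch below illustrates that rule; the event names and the `should_request_review` function are hypothetical and are not part of any Copilot CLI interface.

```python
# Hypothetical event names for the three automatic checkpoints listed above.
AUTO_REVIEW_EVENTS = frozenset({
    "plan_drafted",                 # 1. after drafting a plan
    "complex_implementation_done",  # 2. after a complex implementation
    "tests_written",                # 3. after writing tests, before running them
})


def should_request_review(event: str, manual_request: bool = False) -> bool:
    """Return True if a review would be requested for this session event.

    A manual request always triggers a review; otherwise only the
    three checkpoint events do.
    """
    return manual_request or event in AUTO_REVIEW_EVENTS
```

In this sketch, an ordinary event such as a hypothetical `"file_read"` triggers a review only when `manual_request=True` is passed.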

Performance Benchmarks

Evaluations on SWE-Bench Pro demonstrated that Claude Sonnet paired with Rubber Duck (running GPT-5.4) closed 74.7% of the performance gap between Sonnet and Opus running alone. In real-world complex tasks spanning 70+ steps and multiple files, the combination identified bugs that the primary agent had missed, including an infinite loop in a scheduler, a dictionary overwrite in search queries, and Redis key conflicts across services.
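
A "share of the gap closed" figure is usually read as the paired system's improvement over the weaker baseline, divided by the distance between the baseline and the stronger model. The sketch below assumes that interpretation, which the changelog does not spell out. The function name and the input scores are placeholders chosen for easy arithmetic, not the actual benchmark results.

```python
def gap_closed(baseline: float, ceiling: float, paired: float) -> float:
    """Fraction of the baseline-to-ceiling gap recovered by the paired setup.

    baseline: score of the weaker model alone (e.g. Sonnet)
    ceiling:  score of the stronger model alone (e.g. Opus)
    paired:   score of the weaker model plus the reviewer
    """
    if ceiling == baseline:
        raise ValueError("baseline and ceiling are equal; there is no gap to close")
    return (paired - baseline) / (ceiling - baseline)


# Placeholder scores for illustration only, NOT reported benchmark numbers.
example = gap_closed(baseline=40.0, ceiling=60.0, paired=55.0)
print(example)  # 0.75, i.e. 75% of the gap closed
```

On this reading, a value of 0.747 would mean the paired setup recovered roughly three quarters of the difference between the two models running alone.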

How to Access

Rubber Duck remains in experimental mode in GitHub Copilot CLI. Developers can enable it by running the /experimental slash command within a Copilot CLI session.