GitHub Copilot CLI: BYOK and Local Model Support
GitHub Copilot CLI now supports BYOK and local model runtimes such as Ollama and vLLM, enabling offline, air-gapped workflows without GitHub authentication.
GitHub Copilot CLI Gains BYOK and Local Model Support
GitHub has significantly expanded the flexibility of the Copilot CLI by introducing support for Bring Your Own Key (BYOK) and fully local model deployments. Previously, the CLI required routing all requests through GitHub's hosted model infrastructure. With this update, developers can now plug in their own model provider or run models entirely on their own hardware.
Connecting Any Model Provider
The Copilot CLI can now be configured to connect to Azure OpenAI, Anthropic, or any endpoint that implements the OpenAI Chat Completions API. Configuration is done through environment variables set before launching the CLI, making it straightforward to integrate with existing provider accounts. This extends to locally running inference servers as well — Ollama, vLLM, and Microsoft Foundry Local are all supported out of the box.
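As a rough sketch of what such a setup looks like, the snippet below exports provider settings and then launches the CLI. The variable names here are hypothetical placeholders, not Copilot CLI's documented settings; the authoritative names and values come from running copilot help providers. The endpoint shown is Ollama's default OpenAI-compatible address.

```shell
# Hypothetical variable names for illustration only -- consult
# `copilot help providers` for the real configuration keys.
export MY_PROVIDER_BASE_URL="http://localhost:11434/v1"  # Ollama's OpenAI-compatible endpoint
export MY_PROVIDER_API_KEY="ollama"                      # local servers typically accept any placeholder key
export MY_PROVIDER_MODEL="llama3.1"                      # whichever model the server is hosting

# Launch the CLI after the environment is prepared (commented out here):
# copilot
echo "configured endpoint: $MY_PROVIDER_BASE_URL (model: $MY_PROVIDER_MODEL)"
```

The same pattern applies to hosted providers such as Azure OpenAI or Anthropic: only the endpoint URL, key, and model name change.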
Fully Offline and Air-Gapped Workflows
A new COPILOT_OFFLINE=true environment variable instructs the CLI to avoid contacting GitHub's servers entirely. In this mode, all telemetry is disabled and the CLI communicates exclusively with the configured local or remote provider. Combined with a locally hosted model, this enables fully air-gapped development environments — a capability long requested by teams in security-sensitive or regulated industries.
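A minimal air-gapped session might look like the following. COPILOT_OFFLINE is the variable named in this update; the provider variable is a hypothetical placeholder standing in for whatever copilot help providers documents, and the endpoint assumes a locally running Ollama server.

```shell
# Offline mode: no telemetry, no calls to GitHub's servers.
export COPILOT_OFFLINE=true

# Hypothetical provider variable for illustration; point it at a
# locally hosted OpenAI-compatible server (here, Ollama's default).
export MY_PROVIDER_BASE_URL="http://localhost:11434/v1"

# With both set, the CLI talks exclusively to the local provider:
# copilot
echo "offline mode: $COPILOT_OFFLINE"
```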
Optional GitHub Authentication
When a custom model provider is configured, GitHub authentication is no longer required to use the CLI. Developers can start working immediately with just their provider credentials. Signing in to GitHub remains optional and unlocks additional capabilities such as the /delegate command, GitHub Code Search integration, and access to MCP servers.
Model Requirements
Not all models are compatible: the CLI requires models that support both tool calling (function calling) and streaming. GitHub recommends a context window of at least 128k tokens for the best experience. Built-in sub-agents automatically inherit the provider configuration, and invalid or unsupported provider settings produce actionable error messages rather than silent failures. Setup instructions are accessible directly from the terminal by running copilot help providers.
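One way to check those two requirements before pointing the CLI at a local server is to send the endpoint a standard Chat Completions request that asks for both streaming and a tool call. The sketch below builds such a payload; the endpoint URL, model name, and get_weather tool are assumptions for illustration. The actual request line is left commented out since it needs a running server.

```shell
# Assumed local endpoint (Ollama's OpenAI-compatible default) and model.
ENDPOINT="http://localhost:11434/v1/chat/completions"

# A Chat Completions request exercising both required capabilities:
# "stream": true and a function-calling tool definition.
PAYLOAD='{
  "model": "llama3.1",
  "stream": true,
  "messages": [{"role": "user", "content": "What is the weather in Paris?"}],
  "tools": [{
    "type": "function",
    "function": {
      "name": "get_weather",
      "description": "Look up current weather for a city",
      "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"]
      }
    }
  }]
}'

# A compatible model should answer with a tool_calls entry; an
# incompatible one will error or reply in plain text. To try it:
# curl -sN "$ENDPOINT" -H "Content-Type: application/json" -d "$PAYLOAD"

# Sanity-check that the payload is well-formed JSON:
echo "$PAYLOAD" | python3 -c 'import json,sys; json.load(sys.stdin); print("payload OK")'
```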