Mistral Vibe: Leanstral Integration for Formal Theorem Proving

Mistral Vibe

Mistral Vibe v2.5.0 introduced a dedicated theorem proving agent powered by Leanstral, the first open-source code agent designed for Lean 4 formal verification. Leanstral is a 120B sparse mixture-of-experts model activating only 6B parameters per token, yet outperforms open-source models many times its size on formal proof benchmarks β€” achieving a pass@2 score of 26.3 for $36 in compute, compared to Claude Sonnet 4.6's 23.7 score at $549. Mistral Vibe users activate the theorem proving agent via the /leanstall command, which bootstraps the full Lean 4 environment including lean-lsp-mcp integration.


Mistral Vibe Gains a Dedicated Theorem Proving Agent

Mistral Vibe v2.5.0 ships with native integration for Leanstral, Mistral's newly released open-source code agent built specifically for Lean 4 formal verification. The integration represents a significant expansion of Mistral Vibe's capabilities beyond general-purpose coding into verifiable, mathematically rigorous software development.

What Leanstral Does

Leanstral operates within the Lean 4 proof assistant environment, capable of writing Lean code, constructing formal proofs, diagnosing broken proofs, and navigating the tight feedback loop that Lean enforces through its type checker. Rather than generating code and hoping it is correct, Leanstral generates code alongside machine-checkable proofs of correctness. Developers specify requirements formally, Leanstral produces an implementation plus a proof that the implementation satisfies the specification, and Lean 4's type checker automatically verifies the proof. The result is a workflow where correctness is not a matter of test coverage or code review β€” it is mathematically guaranteed.

The model uses a sparse mixture-of-experts architecture with 120B total parameters but only 6B active parameters per token, giving it the inference cost profile of a small model with the capability of a much larger one. Mistral trained it specifically with tool-calling capabilities for lean-lsp-mcp, the Model Context Protocol server for Lean's language server. This means Leanstral can interact with Lean's compiler directly β€” checking types, running tactics, inspecting error messages, and iterating on proofs in real time within the actual Lean development environment.

Performance and Benchmarks

Mistral introduced FLTEval, a new evaluation suite designed to measure proof engineering capability in realistic repository settings rather than isolated competition problems. FLTEval measures an agent's ability to complete formal proofs and correctly define new mathematical concepts within real pull requests to the Fermat's Last Theorem (FLT) formalization project.

On this benchmark, Leanstral-120B demonstrates a striking efficiency advantage:

  • Pass@2: 26.3 score at $36 per task β€” outperforms Claude Sonnet 4.6 (23.7 score, $549) at 93% lower cost
  • Pass@4: 29.3 score β€” surpasses Qwen3.5-397B and Kimi-K2.5-1T, models with up to 150x more active parameters
  • Pass@16: 31.9 score at $290 β€” approaching Claude Opus 4.6's leading 39.6 score at a fraction of the price

Using Leanstral in Mistral Vibe

Access is zero-configuration for Mistral Vibe users. The /leanstall command bootstraps the full theorem proving environment, installing lean-lsp-mcp and configuring the agent for immediate proof work. Three deployment paths are available:

  • Mistral Vibe (recommended): Run /leanstall to get started immediately
  • Labs API: A free endpoint at labs-leanstral-2603 is available for community feedback and exploration
  • Self-hosted: Apache 2.0 licensed weights are available on Hugging Face for local deployment via vLLM

Open-Source Release

Leanstral ships fully open under the Apache 2.0 license. Mistral released the model weights, technical documentation, and the FLTEval benchmark suite, making the full evaluation pipeline reproducible by the research community.