Minstral

Mistral Releases Leanstral, First Open-Source Lean 4 Code Agent

Leanstral targets formal proof verification in software and mathematics, positioning itself as a cost-efficient alternative to closed-source competitors.

Leanstral targets formal proof verification in software and mathematics, positioning itself as a cost-efficient alternative to closed-source competitors.

Mistral AI released Leanstral on March 16, 2026. The first open-source AI agent built for Lean 4, a proof assistant used in mathematical research and software specification. The model is designed to construct and verify formal proofs in real-world code repositories, not isolated problems.

Leanstral uses a Mixture-of-Experts architecture with 6 billion active parameters. Mistral said the design optimises the model for proof engineering while keeping inference costs low. The company published the weights under an Apache 2.0 license and made the model available via a free API endpoint (labs-leanstral-2603) and through agent mode in Mistral Vibe, its coding environment. 

The release addresses a constraint Mistral’s researchers describe as the primary bottleneck in AI-assisted engineering. The cost and expertise required for humans to manually verify machine-generated code. The company said it intends subsequent generations of coding agents to formally prove their own implementations against declared specifications, reducing dependence on manual review.

ALSO READ: Why Europe Leads AI Regulation, Lags in AI Power

To benchmark performance, Mistral introduced FLTEval, an evaluation suite that measures an agent’s ability to complete formal proofs and define new mathematical concepts within pull requests to the Fermat’s Last Theorem project in Lean 4. The suite moves beyond competition mathematics, which Mistral said has dominated prior evaluations. Mistral confirmed the benchmark details in the official blog post linked above.

The cost comparison against Anthropic’s Claude family is where Mistral’s case for the model is clearest. Leanstral costs $36 to run and scores 26.3, compared with Claude Sonnet 4.6 at $549 for a score of 23.7. Claude Opus 4.6 scores highest at 39.6 but costs $1,650 per run, 92 times the cost of a single Leanstral pass, per the benchmark table Mistral published. It is unclear whether Mistral’s cost figures reflect API pricing at the time of evaluation or projected rates.

In one published case study, Mistral fed Leanstral a question from the Proof Assistants Stack Exchange about a script that stopped compiling in Lean 4.29.0-rc6, a release the model had not been trained on. Leanstral identified that a def declaration was blocking a rewrite tactic and proposed replacing it with abbrev. The model had not seen that version of Lean during training. In a second case study, the model converted program definitions from the Rocq proof assistant into Lean 4 and proved properties about those programs from Rocq statements alone, without access to the original proofs.

Mistral said it plans to release a technical report detailing Leanstral’s training approach alongside FLTEval. A timeline for the report was not included in the announcement.

The release continues Mistral’s pattern of open-weight releases paired with commercial API access. The company released Devstral 2, a code-focused model, earlier in 2026, and has expanded Mistral Vibe as its primary developer interface. Leanstral integrates directly into Vibe via the /leanstral command.

Avatar photo
NN Desk

Leave a Reply

Your email address will not be published. Required fields are marked *

Stay updated with NervNow Weekly

Subscribe now