May 04, 2026
Other LLM

Mistral Medium 3.5 Launches: 128B Open Model with 77.6% SWE-Bench and Cloud Coding Agents

Mistral AI releases Medium 3.5, a 128B dense open-weights model scoring 77.6% on SWE-Bench Verified, paired with Vibe remote cloud agents and Work mode for Le Chat.

#Mistral #MistralMedium3.5 #open-source #SWE-Bench #coding-agents

Introduction

On May 2, 2026, Mistral AI launched Mistral Medium 3.5 alongside a major upgrade to its Vibe coding platform, introducing cloud-based remote agents and a new Work mode for Le Chat. The 128B dense model ships as open weights under a modified MIT license, combining instruction-following, coding, and multimodal reasoning in a single unified set of weights, and it posts a notable 77.6% on SWE-Bench Verified.

Feature Overview

Mistral Medium 3.5 Model Architecture

Mistral Medium 3.5 is a dense 128B-parameter model with a 256,000-token context window (approximately 200,000 words). Unlike many competing multimodal models that repurpose pretrained CLIP encoders, Mistral built its vision encoder from scratch, allowing it to process images at variable sizes and integrate more tightly with the language backbone. The model is self-hostable on a minimum of four GPUs and is available on Hugging Face under open weights, making it immediately accessible to researchers and enterprises.
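As a rough sanity check on the four-GPU claim, the weight-only memory footprint of a 128B dense model can be estimated from parameter count and numeric precision. This is a back-of-envelope sketch; real deployments also need memory for the KV cache and activations, so actual requirements will be higher:

```python
def weight_memory_gb(params_b: float, bytes_per_param: float) -> float:
    """Approximate weight-only memory footprint in GB (1 GB = 1e9 bytes)."""
    return params_b * 1e9 * bytes_per_param / 1e9

# 128B parameters at bf16 (2 bytes/param) vs. common quantized precisions.
bf16 = weight_memory_gb(128, 2.0)   # 256 GB total -> ~64 GB per GPU across 4 GPUs
int8 = weight_memory_gb(128, 1.0)   # 128 GB
int4 = weight_memory_gb(128, 0.5)   # 64 GB

print(bf16, bf16 / 4, int8, int4)
```

At bf16, sharding weights across four 80 GB accelerators leaves some headroom per device, which is consistent with the stated four-GPU minimum.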

On coding benchmarks, Medium 3.5 achieves 77.6% on SWE-Bench Verified, outperforming both Devstral 2 and Qwen3.5 397B A17B on coding tasks. On the τ³-Telecom agentic benchmark it scores 91.4, demonstrating reliable multi-tool calling and structured output generation — capabilities critical for autonomous agentic pipelines.
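Agentic benchmarks like τ³-Telecom exercise exactly this loop: the model emits a structured tool call, the harness parses and dispatches it, and the result is fed back. A minimal dispatch sketch follows; the tool names, arguments, and JSON shape here are illustrative assumptions, not Mistral's actual API:

```python
import json

# Hypothetical tool registry; real agent harnesses map tool names to callables similarly.
TOOLS = {
    "lookup_account": lambda account_id: {"account_id": account_id, "status": "active"},
    "reset_router": lambda account_id: {"account_id": account_id, "rebooted": True},
}

def dispatch(tool_call_json: str) -> dict:
    """Parse one structured tool call ({"name": ..., "arguments": {...}}) and run it."""
    call = json.loads(tool_call_json)
    fn = TOOLS.get(call["name"])
    if fn is None:
        raise KeyError(f"unknown tool: {call['name']}")
    return fn(**call["arguments"])

result = dispatch('{"name": "lookup_account", "arguments": {"account_id": "A-42"}}')
print(result)  # {'account_id': 'A-42', 'status': 'active'}
```

The benchmark's premium is on the model reliably producing well-formed JSON that survives `json.loads` and matches a registered tool signature over many consecutive turns.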

Vibe Remote Agents

The Vibe coding platform now supports full cloud-based execution of long-running coding sessions. Developers can launch agents from the Mistral Vibe CLI or directly from Le Chat, with each session running in an isolated cloud sandbox. A key differentiator is session continuity: local development sessions can be "teleported" to the cloud without losing context, file diffs, or task history. When agents complete work, they can automatically open pull requests on GitHub, with visibility into every file change and tool call made during the session.

Vibe integrates natively with Linear, Jira, Sentry, Slack, and Microsoft Teams, enabling it to read tickets, file bugs, and post updates directly without manual handoffs.

Work Mode in Le Chat

Le Chat now includes a Work mode (Preview) powered by Mistral Medium 3.5. This agentic layer handles multi-step research, analysis, and cross-tool tasks — inbox triage, document synthesis, and parallel tool execution — while surfacing agent actions and requiring user approval gates before critical operations. It represents Mistral's entry into the broader productivity-agent market occupied by competitors such as Claude's Projects and ChatGPT Workspace Agents.

API Availability and Pricing

Mistral Medium 3.5 is available through the Mistral API at $1.50 per million input tokens and $7.50 per million output tokens. It is also accessible via NVIDIA Build and NVIDIA NIM containerized inference, as well as Hugging Face open weights for self-hosting.
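At those rates, per-request cost is easy to estimate. The sketch below uses the listed prices; token counts are illustrative and actual billing may differ:

```python
INPUT_PER_M = 1.50   # USD per million input tokens (listed Medium 3.5 API price)
OUTPUT_PER_M = 7.50  # USD per million output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for one API call at the listed per-million-token rates."""
    return input_tokens / 1e6 * INPUT_PER_M + output_tokens / 1e6 * OUTPUT_PER_M

# Example: a coding-agent turn with a 20k-token context and a 2k-token patch.
print(round(request_cost(20_000, 2_000), 4))  # 0.045
```

The 5x input/output price asymmetry means long-context, short-answer workloads (code review, inbox triage) are markedly cheaper per call than generation-heavy ones.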

Usability Analysis

For software engineering teams, the combination of a high SWE-Bench score and cloud-native asynchronous execution addresses a real pain point: running long-horizon coding tasks without tying up a local machine. The teleport-to-cloud feature lowers the barrier to adopting remote agents by preserving existing session state. The open-weights release also enables fine-tuning and deployment on private infrastructure, which will appeal strongly to enterprises with data residency requirements.

The Work mode in Le Chat positions Mistral as a more direct competitor to productivity-oriented AI assistants. However, as a preview feature it lacks the maturity of comparable offerings from Anthropic and OpenAI.

Pros and Cons

Pros:

  • 77.6% SWE-Bench Verified is among the highest scores for an open-weights model
  • True open-weights release with a permissive modified MIT license
  • Seamless local-to-cloud session teleportation for long-running agent tasks
  • Native integrations with GitHub, Linear, Jira, Sentry, Slack, and Teams
  • Self-hostable on four GPUs, enabling private deployment

Cons:

  • Work mode in Le Chat is still in Preview and may lack reliability for production workflows
  • 128B dense model requires significant hardware for self-hosted inference
  • API pricing ($7.50/M output tokens) is at a premium tier compared to smaller alternative models
  • Vibe CLI tooling is newer and less documented than competing platforms

Outlook

Mistral's dual release — a frontier open model plus a cloud agent platform — signals a maturing strategy beyond pure model research. By posting competitive SWE-Bench numbers with an open-weights model and wrapping it in production-grade tooling, Mistral is staking out a position as the developer-first alternative to closed-API incumbents. The teleport-to-cloud architecture in Vibe could become a meaningful workflow differentiator if the reliability holds up under production workloads. Expect rapid iteration on Work mode as Mistral closes the feature gap with ChatGPT Workspace Agents and Claude Projects.

Conclusion

Mistral Medium 3.5 and the Vibe remote agents update represent the most significant release from Mistral AI in 2026 so far. The combination of a high-performing open-weights model and asynchronous cloud coding agents addresses both research and enterprise use cases. Engineers who need a capable, self-hostable coding model or want to explore cloud-native agent workflows should evaluate this release closely.

Editor's Verdict

Mistral Medium 3.5 earns a solid recommendation within the open-model LLM space.

The strongest case for paying attention is the top-tier SWE-Bench Verified score (77.6%) for an open-weights model, which raises the bar for what readers should now expect from peers in this space. Reinforcing that, the genuinely open release under a permissive license enables self-hosting and fine-tuning, adding practical value rather than just headline appeal. The broader signal is straightforward: 77.6% on SWE-Bench Verified places Mistral Medium 3.5 among the top open-weights coding models, competitive with closed-API alternatives. On the other side of the ledger, Le Chat's Work mode is still in Preview and not yet production-ready, a real constraint rather than a marketing footnote, and it should factor into any serious adoption decision. The 128B dense architecture also requires substantial GPU resources for self-hosted inference, which narrows the set of teams for whom this is an obvious yes.

For multi-model deployment teams, cost-conscious operators, and developers willing to evaluate beyond the major labs, this is a serious evaluation candidate, not just a curiosity to bookmark. For everyone else, the safer posture is to monitor coverage and revisit once the use cases that matter to your team are demonstrated in the wild.

Pros

  • Top-tier SWE-Bench Verified score (77.6%) for an open-weights model
  • True open release with permissive license enabling self-hosting and fine-tuning
  • Innovative local-to-cloud session teleportation preserves development context
  • Deep integration ecosystem covering GitHub, project management tools, and messaging platforms
  • Custom vision encoder enables more flexible multimodal image processing

Cons

  • Le Chat Work mode is in Preview and not yet production-ready
  • 128B dense architecture requires substantial GPU resources for self-hosted inference
  • API output token pricing is relatively high compared to smaller open models
  • Vibe CLI documentation and ecosystem maturity lags behind established coding agent platforms


Key Features

1. 128B dense model with a 256k-token context window, open weights under a modified MIT license
2. 77.6% SWE-Bench Verified score, outperforming Devstral 2 and Qwen3.5 397B A17B
3. Vibe remote agents with local-to-cloud session teleportation for async coding tasks
4. Native integrations: GitHub, Linear, Jira, Sentry, Slack, Microsoft Teams
5. Le Chat Work mode (Preview) for multi-step research and cross-tool agentic workflows
6. Custom-built vision encoder for variable-size multimodal image processing
7. Available via the Mistral API, NVIDIA Build/NIM, and Hugging Face open weights

Key Insights

  • 77.6% SWE-Bench Verified places Mistral Medium 3.5 among the top open-weights coding models, competitive with closed-API alternatives
  • The local-to-cloud teleportation feature for Vibe agents solves a real developer pain point: long-running tasks that outlast local machine availability
  • Open-weights release under modified MIT license enables fine-tuning and private deployment, a major advantage for data-sensitive enterprises
  • Building a custom vision encoder from scratch rather than reusing CLIP reflects Mistral's commitment to architectural innovation over integration shortcuts
  • Work mode in Le Chat marks Mistral's entry into the productivity-agent market, competing directly with ChatGPT Workspace Agents and Claude Projects
  • Self-hosting on four GPUs minimum makes the model accessible to mid-size organizations without requiring hyperscaler infrastructure
  • The $1.50/$7.50 per million token pricing positions Medium 3.5 between budget and premium tiers, targeting professional developer workflows
