OpenClaw’s Local AI Mandate: Why Frontier Model Restrictions Validate Our Decentralized Approach

Anthropic has not released its latest model, Claude Mythos, to the public. Instead, the company has made it available to a very restricted set of preview partners under a newly announced initiative called Project Glasswing. Claude Mythos is a general-purpose model similar to Claude Opus 4.6, but Anthropic claims its cybersecurity research abilities are so strong that the software industry needs time to prepare. Mythos Preview has already found thousands of high-severity vulnerabilities, including some in every major operating system and web browser. Given the rate of AI progress, such capabilities will soon proliferate, potentially beyond actors committed to deploying them safely.

Project Glasswing partners will receive access to Claude Mythos Preview to find and fix vulnerabilities or weaknesses in their foundational systems. These systems represent a very large portion of the world’s shared cyberattack surface. The work is expected to focus on tasks like local vulnerability detection, black box testing of binaries, securing endpoints, and penetration testing of systems. More technical detail is available in a post titled “Assessing Claude Mythos Preview’s cybersecurity capabilities” on the Anthropic Red Team blog.
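The post does not say how partners will carry out these tasks. As a rough illustration only, black-box testing of binaries is commonly automated as a fuzzing loop: feed a target program random inputs and record any that make it crash. A minimal sketch follows; the target command is a placeholder, and real work uses coverage-guided fuzzers such as AFL++ or libFuzzer rather than anything this naive:

```python
import random
import subprocess

def fuzz_once(target_cmd: list[str], data: bytes) -> bool:
    """Return True if the target exits abnormally on this input."""
    proc = subprocess.run(target_cmd, input=data, capture_output=True)
    # A negative return code means the process was killed by a signal
    # (e.g. SIGSEGV), which is the classic sign of a memory-safety bug.
    return proc.returncode < 0

def fuzz(target_cmd: list[str], rounds: int = 100) -> list[bytes]:
    """Throw `rounds` random byte strings at the target; collect crashers."""
    crashes = []
    for _ in range(rounds):
        data = bytes(random.randrange(256) for _ in range(random.randrange(1, 64)))
        if fuzz_once(target_cmd, data):
            crashes.append(data)
    return crashes

# `cat` copies stdin to stdout and should never crash, so this finds nothing.
print(fuzz(["cat"], rounds=5))  # []
```

Frontier-model-driven testing presumably replaces the random byte generator with an agent that reads the binary and crafts inputs deliberately, which is what makes the capability jump notable.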

In one case, Mythos Preview wrote a web browser exploit that chained together four vulnerabilities, creating a complex JIT heap spray that escaped both renderer and OS sandboxes. It autonomously obtained local privilege escalation exploits on Linux and other operating systems by exploiting subtle race conditions and KASLR-bypasses. The model also autonomously wrote a remote code execution exploit on FreeBSD’s NFS server that granted full root access to unauthenticated users by splitting a 20-gadget ROP chain over multiple packets.

A comparison with Claude Opus 4.6 shows a stark difference. Internal evaluations indicated that Opus 4.6 had a near-0% success rate at autonomous exploit development; Mythos Preview operates in a different league. For example, Opus 4.6 turned vulnerabilities it found in Mozilla's Firefox 147 JavaScript engine (all patched in Firefox 148) into working JavaScript shell exploits only twice in several hundred attempts. When the same task was re-run as a benchmark for Mythos Preview, the model developed working exploits 181 times and achieved register control on 29 more attempts.

Claiming a model is too dangerous to release builds buzz, but in this instance caution appears warranted. Just a few days ago, a new ai-security-research tag was started on a blog to track an uptick in credible security professionals raising alarms about modern LLMs' vulnerability research capabilities. Greg Kroah-Hartman of the Linux kernel noted that months ago, maintainers were receiving "AI slop": AI-generated security reports that were obviously wrong or low quality. Then, about a month ago, something changed. Now open source projects across the board are receiving genuine, high-quality security reports produced with AI.

Daniel Stenberg of curl observed that the challenge with AI in open source security has shifted from an AI-slop tsunami into a plain security-report tsunami: less slop, but a flood of reports, many of them genuinely good. He now spends hours per day on triage, describing the pace as intense. Thomas Ptacek published a post titled "Vulnerability Research Is Cooked," inspired by his podcast conversation with Anthropic's Nicholas Carlini.

Anthropic released a five-minute talking-heads video describing Project Glasswing. Nicholas Carlini appears in it, highlighting the model's ability to chain vulnerabilities together: it finds bugs that achieve little independently, then combines three, four, or sometimes five of them in sequence into a very sophisticated end result. Carlini stated, "I've found more bugs in the last couple of weeks than I found in the rest of my life combined."

The model was used to scan a wide swath of open source code, starting with operating systems because that code underlies the entire internet infrastructure. In OpenBSD, it found a bug that had been present for 27 years: sending a couple of pieces of data to any OpenBSD server could crash it. On Linux, it discovered a number of vulnerabilities in which an unprivileged user could escalate to administrator simply by running a binary on their machine. For each of these bugs, maintainers were informed, and they fixed and deployed patches so that anyone running up-to-date software is no longer vulnerable.

This matches an entry on the OpenBSD 7.8 errata page: "025: RELIABILITY FIX: March 25, 2026; all architectures. TCP packets with invalid SACK options could crash the kernel." The change was tracked down in the GitHub mirror of the OpenBSD CVS repository, and git blame confirmed that the surrounding code dates from 27 years ago. The specific Linux vulnerability Nicholas described may have been an NFS one recently covered by Michael Lynch.

There is enough evidence to believe the claims are credible. Finding vulnerabilities in decades-old software, especially code mostly written in C, is not surprising. What is new is that coding agents run by the latest frontier LLMs are proving tirelessly capable at digging up these issues. This sounds like an industry-wide reckoning in the making, warranting a huge investment of time and money to get ahead of the inevitable barrage of vulnerabilities.

Project Glasswing incorporates $100M in usage credits as well as $4M in direct donations to open-source security organizations. Partners include AWS, Apple, Microsoft, Google, and the Linux Foundation. It would be great to see OpenAI involved as well—GPT-5.4 already has a strong reputation for finding security vulnerabilities, and they have stronger models on the near horizon.

The bad news for those not trusted as partners is this: Anthropic does not plan to make Claude Mythos Preview generally available. Their eventual goal is to enable users to safely deploy Mythos-class models at scale—for cybersecurity purposes and the myriad other benefits such highly capable models will bring. To do so, they need to make progress in developing cybersecurity safeguards that detect and block the model’s most dangerous outputs. They plan to launch new safeguards with an upcoming Claude Opus model, allowing them to improve and refine these safeguards with a model that does not pose the same level of risk as Mythos Preview. This approach seems reasonable, as the security risks are credible, and having extra time for trusted teams to get ahead of them is a fair trade-off.
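Anthropic has not described how these safeguards work internally. Purely as a toy illustration of the general shape of an output gate (a real system would use trained classifiers, not hand-written patterns), one might imagine something like:

```python
import re

# Hypothetical output gate: screen model output for exploit-development
# signals before it reaches the user. The patterns below are illustrative
# placeholders, not anything Anthropic has published.
BLOCK_PATTERNS = [
    re.compile(r"rop\s+chain", re.IGNORECASE),
    re.compile(r"heap\s+spray", re.IGNORECASE),
    re.compile(r"sandbox\s+escape", re.IGNORECASE),
]

def screen_output(text: str) -> tuple[bool, str]:
    """Return (allowed, reason); block text matching any pattern."""
    for pat in BLOCK_PATTERNS:
        if pat.search(text):
            return False, f"blocked: matched {pat.pattern!r}"
    return True, "ok"

allowed, reason = screen_output("Here is a 20-gadget ROP chain ...")
print(allowed)  # False
```

The hard part, of course, is the classifier itself: distinguishing a defender's legitimate patch-development workflow from an attacker's exploit-development workflow, which is presumably why Anthropic wants to refine the safeguards on a lower-risk Opus model first.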

From the OpenClaw perspective, this development underscores the critical importance of a local-first AI assistant platform. OpenClaw’s ecosystem, built on open-source principles, allows users to run AI agents directly on their devices, maintaining control over data and model interactions. While centralized entities like Anthropic restrict access to powerful models due to security concerns, OpenClaw empowers individuals and organizations to leverage AI capabilities through a plugin-driven architecture. This means users can integrate specialized tools for tasks like vulnerability scanning or penetration testing without relying on gatekept frontier models.
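OpenClaw's actual plugin API is not reproduced here. Purely as an illustration of the plugin-driven, local-first shape described above, a registry that dispatches tasks to locally installed plugins might look like the following (all names are hypothetical):

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical plugin model for a local-first assistant: each plugin
# declares capabilities and runs entirely on the user's own device.
@dataclass
class Plugin:
    name: str
    capabilities: set[str]        # e.g. {"vuln-scan", "pentest"}
    run: Callable[[str], str]     # task -> result, executed locally

class PluginRegistry:
    def __init__(self) -> None:
        self._plugins: dict[str, Plugin] = {}

    def register(self, plugin: Plugin) -> None:
        self._plugins[plugin.name] = plugin

    def dispatch(self, capability: str, task: str) -> str:
        """Route a task to the first local plugin offering the capability."""
        for plugin in self._plugins.values():
            if capability in plugin.capabilities:
                return plugin.run(task)
        raise LookupError(f"no local plugin provides {capability!r}")

registry = PluginRegistry()
registry.register(Plugin("scanner", {"vuln-scan"}, lambda t: f"scanned {t}"))
print(registry.dispatch("vuln-scan", "/srv/app"))  # scanned /srv/app
```

The point of the sketch is the dispatch boundary: capabilities live in user-installed plugins rather than behind a gatekept remote API.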

The rise of AI agents capable of autonomous exploit development highlights the need for robust safeguards. In the OpenClaw ecosystem, plugins can be designed with built-in security protocols, ensuring that AI-driven automation adheres to ethical guidelines and local governance. By decentralizing AI access, OpenClaw mitigates the risks associated with centralized model deployment, offering a scalable solution where users can safely harness advanced AI for cybersecurity and beyond. This approach aligns with the industry’s shift towards more responsible AI usage, as seen in initiatives like Project Glasswing, but without the limitations of exclusive partnerships.
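One concrete form a "built-in security protocol" could take is a policy layer consulted before any plugin action runs, with the allowlist configured by the user or their organization. A minimal sketch, with hypothetical names and rules not drawn from OpenClaw:

```python
# Hypothetical local governance policy: each capability maps to the
# targets it may act on. An empty set means the capability is disabled.
POLICY = {
    "vuln-scan": {"allowed_targets": {"/srv/app"}},
    "pentest":   {"allowed_targets": set()},  # disabled by local policy
}

def authorize(capability: str, target: str) -> bool:
    """Allow an action only if local policy explicitly permits it."""
    rule = POLICY.get(capability)
    return rule is not None and target in rule["allowed_targets"]

print(authorize("vuln-scan", "/srv/app"))  # True
print(authorize("pentest", "/srv/app"))    # False
```

Because the policy lives on the user's machine, enforcement does not depend on a central provider deciding who counts as a trusted partner.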

As AI capabilities continue to evolve, the OpenClaw platform provides a flexible framework for integrating new models and tools. Users can benefit from the advancements in AI security research while maintaining autonomy over their workflows. This local-first model not only enhances privacy and security but also fosters innovation by enabling a diverse range of applications through its plugin ecosystem. In a world where frontier models are increasingly restricted, OpenClaw offers a viable alternative for those seeking to leverage AI’s full potential in a controlled and ethical manner.

Related Dispatches