Anthropic prepares Mythos 1 for Claude Code and Claude Security (2 minute read)
Anthropic appears to be moving Claude Mythos to broader availability, with the model now helping protect a wider range of organizations. Traces of the model have already surfaced on Google Cloud and AWS through vulnerability discovery programs. A release for Mythos 1 seems imminent. Claude Opus 4.8 is also rumored to be in the works.
|
|
Measuring LLMs' ability to develop exploits (17 minute read)
Mythos Preview is capable of turning vulnerabilities into exploit primitives and combining those primitives together into complete end-to-end attack chains. The knowledge and expertise required to develop exploits will drop significantly as the model's capabilities become more widely available. This article looks at the model's performance on ExploitBench and ExploitGym, two newer, more challenging academic benchmarks. Mythos Preview consistently outperforms all other evaluated models on these tests.
|
Exploring Agent-Assisted Qualitative Analysis (28 minute read)
Qualitative analysis is the process of reading a lot of messy, unstructured data and figuring out what is interesting, recurring, surprising, or important. One of the biggest research questions at the moment is, 'What is the right way to do agent-assisted qualitative analysis?' This post looks at the background of the field, proposes an experimental setup, and discusses how the field should move forward. It is still unclear whether AI agents can replace humans in qualitative analysis, but they can currently do the mechanical parts of qualitative analysis fast - just without taste.
|
The neocloud boom (7 minute read)
SpaceX bought xAI, took the two Colossus data centers Elon built, and started renting them to Anthropic for $15 billion a year. SpaceX did $18.7 billion in total revenue last year after 23 years of building rockets, so they almost doubled the company in a quarter by becoming a cloud provider. The bigger math is wilder: the AI buildout adds up to $7.5 trillion of spend over the next 4.5 years, about 5% of US GDP and on par with the 1880s railroad boom. Even if neoclouds only catch 20% of the $13.5 trillion of value that buildout creates, that's $2.5 trillion up for grabs because no handful of hyperscalers can build it all.
|
|
The 2026-07-28 MCP Specification Release Candidate (9 minute read)
The release candidate for the next Model Context Protocol (MCP) specification is now available. It is the largest revision of the protocol since launch. The candidate introduces a stateless core that scales on ordinary HTTP infrastructure, extensions, authorization that aligns more closely with OAuth and OpenID Connect deployments, a formal deprecation policy, and many other changes. It contains breaking changes. The final specification will ship on July 28.
|
Reasonix (Website)
Reasonix is a DeepSeek-native coding agent for the terminal. It is engineered around prefix-cache stability and designed to be left running. Token costs stay low across long sessions.
|
Lance (Hugging Face Repo)
Lance is a lightweight native unified multimodal model that supports image and video understanding, generation, and editing. It delivers strong performance across image generation, image editing, and video generation benchmarks with just 3B active parameters. The model was trained entirely from scratch within a 128-A100-GPU budget. Examples of clips generated by the model are available in the repository.
|
|
David Sacks's 11th-Hour Plea Led to Trump's Backtrack on AI Executive Order (9 minute read)
David Sacks, a venture capitalist, warned President Trump on a call that the long-awaited executive order on the dangers posed by artificial intelligence, which Trump was deliberating on, could lead to mandatory regulations that slow down the industry in its race with Chinese competitors. Trump responded that he shared concerns about China and was worried about hindering AI investment. He then postponed the signing and told reporters he wouldn't sign the order. The incident shows how powerful Sacks's influence is and marks a win for those against strong guardrails to limit the risks posed by the technology.
|
Anthropic's march to profitability (3 minute read)
Anthropic is on track to do $10.9 billion in Q2 revenue, up from $4.8 billion in Q1, growing faster right now than Zoom did at the peak of the pandemic. The thing that flipped them to profit is compute getting cheaper: 71 cents of compute per revenue dollar in Q1, 56 cents in Q2. Claude Code on its own is at $2.5 billion in revenue, and the company expects to clear $559 million in profit just in time for its October IPO. The "AI labs burn money forever" story finally has a hole in it.
|
|
Gemini 3.5 Flash (Low) (1 minute read)
Gemini 3.5 Flash (Low) generates around 45% fewer tokens than Gemini 3.5 Flash (Medium) and generally outperforms Gemini 3.5 Flash (High) on SWE tasks.