Anthropic Raised $65B in Series H Funding (2 minute read)
Anthropic announced a $65 billion Series H round at a $965 billion post-money valuation, citing strong enterprise adoption, $47 billion in run-rate revenue, and plans to expand compute capacity, research, and product development.
|
Opus 4.8 (4 minute read)
Anthropic released Claude Opus 4.8 with benchmark improvements, adjustable effort controls, dynamic workflows in Claude Code, and a faster mode that is now significantly cheaper.
|
How long is Anthropic's lease with SpaceX? Opinions vary (3 minute read)
Earlier this month, SpaceX signed a major compute deal with Anthropic worth billions of dollars a month. However, Elon Musk recently downplayed the deal, saying that SpaceX had not committed to leasing its compute for years, although that could still happen. The agreement is actually a 180-day lease with a 90-day mutual cancellation thereafter. SpaceX requested the short term because it may want the compute back at some point. Musk's statement directly contradicts SpaceX's S-1 filing, which presents the deal as a three-year agreement.
|
|
Agent Judge: Solving Long-Context Evals for Production Agents (10 minute read)
Agent Judge improves evaluations for long-context production agents by focusing on Search, Verification, and Adaptation. It addresses the shortcomings of LLM judges by navigating long trajectories, verifying stateful actions against systems, and updating rubrics based on real feedback. Test results show that Agent Judge, especially with refined rubrics, surpasses traditional LLM judges in accuracy and consistency, particularly in challenging scenarios.
|
How far behind are open models? (17 minute read)
Open models are generally not as capable as the best closed models, but tests show they are only four to six months behind on public benchmarks. The gap was smallest around the time of DeepSeek R1 and has been growing since.
|
|
Introducing dynamic workflows (3 minute read)
Jarred Sumner used dynamic workflows to rewrite Bun from Zig to Rust, producing 750,000 lines of Rust and reaching a 99.8% test suite pass rate in 11 days. In a dynamic workflow, Claude breaks a task into subtasks, and agents run in parallel until the results converge.
|
The Cursor Developer Habits Report (1 minute read)
Models now use more context to understand codebases, which reduces costs because input and cache-read tokens are cheaper than output tokens. This context-driven approach improves code calibration, increasing developer productivity and diff survival rates.
|
Multi-Agent World Models (3 minute read)
NVIDIA γ-World is a generative world model that supports independently controllable, permutation-symmetric agents and delivers real-time rollouts with zero-shot generalization from two-player to four-player settings.
|
|
Data Isn't Scarce. Your Imagination Is (8 minute read)
Asuka Zheng argues that the "we're running out of training data" panic misses the actual shape of the data market. She recounts her own SRE-replacement project, which trained two world models until it stalled because end-to-end, long-horizon incident trajectories, from first anomaly to full resolution, did not exist as a dataset.
|
|
IBM's "Project Lightwell" (1 minute read)
Project Lightwell will establish a trusted enterprise clearinghouse that serves as a security coordination layer, helping enterprises integrate secure patches directly into their existing software supply chains with enterprise-grade validation and lifecycle management.
|