TLDR AI 2026-05-20

Building AI for the 80% of the world that doesn't think in English? (Sponsor)

If you're relying on machine-translated data to train your AI, you're missing the sort of cultural nuance that only a local would pick up on.

Welo Data offers AI training data and human evaluation in 155+ locales. That way, you can ensure:

✅ Your AI reflects cultural realities where your users actually are - not how English maps onto them

✅ Security guardrails translate across language groups

✅ Cultural nuance is caught by native experts, not flagged in post-production

Don't let rote translation erode the trust you've built. Get in touch

🚀

Headlines & Launches

Gemini 3.5 Flash (5 minute read)

Google has introduced Gemini 3.5 Flash, a new model focused on agentic workflows, coding, and long-horizon task execution. The release also expanded Gemini access across Search, enterprise tools, Android Studio, and Google's developer platforms.

OpenAI announces new Guaranteed Capacity offering for customers to secure compute (3 minute read)

OpenAI's new Guaranteed Capacity offering lets customers secure long-term access to compute to power AI products, agents, and workflows. Customers can choose one-, two-, or three-year commitments, with discounts that depend on the length of the commitment. The company will offer Guaranteed Capacity until its current allocation sells out, and it plans to offer it again in the future.

Karpathy joins Anthropic (1 minute read)

Andrej Karpathy announced he has joined Anthropic, citing the next few years at the LLM frontier as especially formative for his return to R&D. Karpathy noted he remains passionate about education and plans to resume that work later, signaling the move is research-focused rather than a permanent pivot away from teaching.

🧠

Deep Dives & Analysis

Google Detailed the Shift Toward Agentic Gemini Products (19 minute read)

At I/O 2026, Google outlined how Gemini models were being integrated across consumer products, creative tools, and developer platforms. The company also shared that monthly token usage across its AI systems had grown to more than 3.2 quadrillion tokens.

Model half-life (4 minute read)

Model half-life, the idea that model releases will keep getting faster and faster, doesn't really hold up under scrutiny. Model releases have definitely increased in pace, but the time between releases isn't halving every six months. This post looks at release dates for several of the most well-known models and lays out predictions for when the next models might come out.
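
To make the "halving every six months" claim concrete, here is a minimal sketch of the arithmetic it implies. The 12-month starting gap and the function name are assumptions chosen for illustration; they are not figures from the post.

```python
def implied_interval(initial_interval_months: float,
                     elapsed_months: float,
                     half_life_months: float = 6.0) -> float:
    """Gap between releases if that gap halved every `half_life_months`."""
    return initial_interval_months * 0.5 ** (elapsed_months / half_life_months)


# Hypothetical starting point: a 12-month gap between releases.
for elapsed in (0, 6, 12, 24):
    gap = implied_interval(12.0, elapsed)
    print(f"after {elapsed:>2} months: {gap:.2f} months between releases")
```

Under these assumptions the gap would shrink from 12 months to under one month within two years, which is the kind of runaway acceleration the post argues the actual release dates don't show.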

🧑‍💻

Engineering & Research

Stop stitching databases for AI agents (Sponsor)

Oracle AI Database acts as a unified memory core for agents. Vector search, relational, JSON, and graph data live together so agents can reason over live enterprise data without extra vector stores, pipelines, or synchronization jobs.
See how developers build agent memory →

Using Claude Code: The unreasonable effectiveness of HTML (10 minute read)

HTML's richness allows it to convey complex information more effectively than Markdown, including layouts, data tables, and interactive elements. It enhances readability by organizing specs into well-structured, easily navigable documents and offers better sharing and interaction capabilities. Claude Code uses HTML to efficiently ingest context from various sources, aiding in specs, design prototyping, and creating custom editing interfaces with improved engagement and clarity.

OlmoEarth v1.1: A more efficient family of models (5 minute read)

OlmoEarth v1.1, a new model family, reduces compute costs by up to 3x while maintaining performance, making planet-scale mapping more affordable. The models process remote sensing data more efficiently by optimizing token sequence lengths, which is crucial for reducing computational costs. Methodological improvements let them reach performance similar to the original version with significantly less compute, which benefits developers and scientific research in remote sensing.

Real-Time Long Video Generation (GitHub Repo)

NVIDIA's LongLive 1.0 is a framework for interactive long-form video generation that supports sequential prompting and real-time user-guided editing using streaming attention and KV-cache optimization techniques.

🎁

Miscellaneous

A single pane of glass for managing all of your cloud agents (5 minute read)

Oz is a multi-harness control plane for cloud agents, supporting Claude Code, Codex, and Warp Agent. It offers automatic multi-agent orchestration, cross-harness Agent Memory, and improved cost and usage controls. Additionally, Oz provides expanded self-hosting options and enhanced governance features, streamlining agent management and deployment.

Introducing the Ettin Reranker Family (19 minute read)

Six new state-of-the-art CrossEncoder Ettin rerankers built on Ettin ModernBERT encoders have been released, offering models from 17M to 1B parameters. These models, trained with pointwise MSE distillation from a strong 1.54B parameter teacher, provide significant accuracy improvements over legacy models while enhancing speed, especially with Flash Attention 2. The models are particularly notable for their efficiency in retrieve-then-rerank systems and outperform models like ms-marco-MiniLM-L12-v2 on MTEB and NanoBEIR benchmarks.
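
As a rough illustration of the two ideas named above, here is a minimal sketch of a pointwise MSE distillation loss and a retrieve-then-rerank pipeline. Everything in it is an assumption for illustration: the function names are hypothetical, and the two toy scorers (word overlap and Jaccard similarity) only stand in for a first-stage retriever and a cross-encoder reranker. It does not use the Ettin models or any real reranking library.

```python
from typing import Callable, List, Sequence

Scorer = Callable[[str, str], float]


def pointwise_mse(student_scores: Sequence[float],
                  teacher_scores: Sequence[float]) -> float:
    """Mean squared error between student and teacher scores, pair by pair."""
    if not student_scores or len(student_scores) != len(teacher_scores):
        raise ValueError("score lists must be non-empty and the same length")
    squared = [(s - t) ** 2 for s, t in zip(student_scores, teacher_scores)]
    return sum(squared) / len(squared)


def retrieve_then_rerank(query: str, corpus: List[str],
                         cheap_score: Scorer, rerank_score: Scorer,
                         k: int) -> List[str]:
    """Keep the top-k documents by a cheap score, then reorder them
    with a more expensive score."""
    candidates = sorted(corpus, key=lambda d: cheap_score(query, d),
                        reverse=True)[:k]
    return sorted(candidates, key=lambda d: rerank_score(query, d),
                  reverse=True)


def overlap(query: str, doc: str) -> float:
    """Toy first-stage scorer: number of shared words."""
    return float(len(set(query.split()) & set(doc.split())))


def jaccard(query: str, doc: str) -> float:
    """Toy stand-in for a reranker: shared words over all distinct words."""
    q, d = set(query.split()), set(doc.split())
    return len(q & d) / len(q | d)


corpus = [
    "cats sit on mats",
    "dogs chase cats and cats chase mice all day",
    "stock markets fell",
]
ranked = retrieve_then_rerank("cats chase", corpus, overlap, jaccard, k=2)
print(ranked)
```

In a real system the second scorer would be the expensive model, so it only runs on the k candidates that survive the first stage. That is why reranker speed matters in this setup.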

⚡

Quick Links

⚡ See how WHOOP, Stripe, and DoorDash use AI to listen to their customers (Sponsor)

Unwrap auto-categorizes customer feedback with AI, surfaces real-time sentiment alerts, and lets you query insights via MCP. TLDR subscribers get a free trial — grab time with the team to get set up →

Index (2 minute read)

Index is a platform for content owners that helps them understand how AI agents use their work and earn revenue when they do.

Cerebras is now running Kimi K2.6 (1 minute read)

Running on Cerebras, Kimi K2.6, a trillion-parameter model, has the fastest frontier model performance ever measured by Artificial Analysis, at around 1,000 tokens per second.

TLDR is hiring a Senior Software Engineer, Applied AI ($250k-$350k, Fully Remote)

TLDR's Applied AI team is tasked with making every process at TLDR legible to code, runnable by anyone, and composable into larger workflows. Join a small, fast moving team using the latest AI tools with an unlimited token budget. Learn more.

Advancing content provenance for a safer, more transparent AI ecosystem (6 minute read)

OpenAI strengthens content provenance by implementing C2PA standards and Google DeepMind's SynthID watermarking for AI-generated images.

The third wave of American philanthropy (23 minute read)

AI is about to generate hundreds of billions in new philanthropic funding.

Want to advertise in TLDR? 📰

If your company is interested in reaching an audience of AI professionals and decision makers, you may want to advertise with us.

Want to work at TLDR? 💼

Apply here, create your own role or send a friend's resume to [email protected] and get $1k if we hire them! TLDR is one of Inc.'s Best Bootstrapped businesses of 2025.

If you have any comments or feedback, just respond to this email!

Thanks for reading,
Andrew Tan, Ali Aminian, & Jacob Turner

