Skip to content
Tomasus
Go back

AI Digest W18: A Frontier Launch Week

2 min read

Three big labs all shipped on the same handful of days, and a fourth quietly dropped open weights that look uncomfortably close to the frontier. If last week was about pricing tiers and policy papers, this week was about new metal hitting the road. Here is what shifted.

AI Digest hero

OpenAI announced GPT-5.5 and GPT-5.5 Pro on April 23, calling it their smartest and most intuitive model yet, with a clear push toward long-horizon work across coding, browsing, spreadsheets and operating other software. The benchmark posts that followed leaned hard on token efficiency and computer-use behaviour rather than raw reasoning numbers, which Simon Willison tested with the usual pelican drawing via the Codex backdoor. TechCrunch framed it as a step toward an AI super app, which is the kind of phrase that sounds bigger than it is, but the direction is real.

The same week Google made Gemini 3 Flash the default model in the Gemini app and AI Mode in Search, pitching Pro-grade reasoning at Flash-tier speed and price. Quietly more interesting was Gemini Robotics-ER 1.6, an upgrade to their embodied reasoning model that lets robots interpret physical environments with much more precision. Frontier intelligence stops being only about chat when it starts moving things in rooms.

Then DeepSeek showed up on April 24 with V4-Flash and V4-Pro in preview, an open-weights MoE pair with a one-million-token context that the team claims agents can actually use. The Hugging Face writeup puts Pro at 1.6T total parameters with 49B active and prices it at $1.74 in, $3.48 out per million tokens, well under the closed frontier. Tencent also open-sourced its Hy3 preview weights the day before. The gap between closed and open is not gone, but the price-to-quality curve keeps bending in a way that should make somebody at the big labs nervous.

On the agent side, OpenAI also introduced workspace agents in ChatGPT, letting it move across documents and tools to finish a task rather than just answer a question. The model launches grab the headlines, but this is where the workflow actually changes. What to watch next: how fast V4-Pro shows up in real agent stacks once the preview wrappers settle.

T.


Share this post on:

About Tomasus

Someone who wants to understand what is coming and how it will impact us as human beings. Writing notes on AI, cybersecurity, history, and staying sane.


Series: AI Digest


Related Posts


Previous Post
Benchmarks: How Labs Measure Intelligence (and the Games They Play)
Next Post
Hallucinations and Jailbreaks: The Two Ways LLMs Fail