Posts

All the articles I've posted.

llm-concepts
6 May, 2026 7 min read

Tool Use, Function Calling, and MCP: How a Chatbot Became an Agent

Tools turn a chatbot into an agent. What function calling actually is, why MCP changed the rules, and the loop that makes a model do work.
llm-concepts
4 May, 2026 8 min read

Prompting and RAG: The Two Levers You Actually Pull

Most teams will never train a model. Most teams will spend a lot of time on prompts and retrieval. What the practical 2026 stack actually looks like.
llm-concepts
4 May, 2026 7 min read

Interpretability: What's Actually Inside

We can train a 70B model and watch it work. We mostly cannot explain why it works. Interpretability is the science trying to fix that.
llm-concepts
30 Apr, 2026 7 min read

Benchmarks: How Labs Measure Intelligence (and the Games They Play)

Every model launch comes with a chart. The numbers look big. What benchmarks actually measure, what they miss, and how labs game them.
ai
29 Apr, 2026 2 min read

AI Digest W18: A Frontier Launch Week

GPT-5.5 lands, DeepSeek V4 ships open weights at frontier-near quality, Gemini 3 Flash goes default, and ChatGPT grows arms.
llm-concepts
29 Apr, 2026 7 min read

Hallucinations and Jailbreaks: The Two Ways LLMs Fail

LLMs produce confident wrong answers and can be tricked into ignoring safety rules. What is actually happening and why both failures are hard to fix.
llm-concepts
28 Apr, 2026 8 min read

The 2026 Model Lineup: Who Ships What

A field guide to the 2026 frontier and open-weight models, and a practical way to decide which one to pick.
llm-concepts
27 Apr, 2026 7 min read

Multimodality: Teaching Models to See and Hear

A multimodal model is not many models in a trench coat. It is one transformer trained to treat pixels, audio, and text as the same kind of thing.
llm-concepts
27 Apr, 2026 7 min read

Reasoning Models: Chain-of-Thought and Test-Time Compute

Reasoning models do not have a new architecture. They have a new training recipe and permission to think for longer before answering.