Posts

All the articles I've posted.

llm-concepts
15 May, 2026 7 min read

Quantization: How a 70B Model Fits on Your Laptop

Quantization shrinks a 70B model from 140 GB to 20 GB with almost no quality loss. What it actually does, and why the trick works.

llm-concepts
10 May, 2026 8 min read

Fine-tuning with LoRA: When to Change the Model, Not the Prompt

When does fine-tuning beat prompting, and what does LoRA actually cost? The decision ladder that saves most teams from training a model they did not need.

ai
6 May, 2026 2 min read

AI Digest W19: AI Goes to Wall Street

Anthropic and OpenAI both spin up Wall Street-backed enterprise AI ventures, Anthropic shows Outcomes and Dreaming, and an AI runs a cafe.

llm-concepts
6 May, 2026 7 min read

Tool Use, Function Calling, and MCP: How a Chatbot Became an Agent

Tools turn a chatbot into an agent. What function calling actually is, why MCP changed the rules, and the loop that makes a model do work.

llm-concepts
4 May, 2026 8 min read

Prompting and RAG: The Two Levers You Actually Pull

Most teams will never train a model. Most teams will spend a lot of time on prompts and retrieval. What the practical 2026 stack actually looks like.

llm-concepts
4 May, 2026 7 min read

Interpretability: What's Actually Inside

We can train a 70B model and watch it work. We mostly cannot explain why it works. Interpretability is the science trying to fix that.

llm-concepts
30 Apr, 2026 7 min read

Benchmarks: How Labs Measure Intelligence (and the Games They Play)

Every model launch comes with a chart. The numbers look big. What benchmarks actually measure, what they miss, and how labs game them.

ai
29 Apr, 2026 2 min read

AI Digest W18: A Frontier Launch Week

GPT-5.5 lands, DeepSeek V4 ships open weights at near-frontier quality, Gemini 3 Flash goes default, and ChatGPT grows arms.

llm-concepts
29 Apr, 2026 7 min read

Hallucinations and Jailbreaks: The Two Ways LLMs Fail

LLMs produce confident wrong answers and can be tricked into ignoring safety rules. What is actually happening and why both failures are hard to fix.