Blog
Notes on local LLMs, hardware tradeoffs, and what actually moves tokens-per-second numbers.
Kimi K2.6 vs GLM-5.1: I ran both for hours so you don't have to
Two open-weight Chinese frontier models, two weeks apart, both claiming SOTA on agentic coding. Five real tasks via OpenCode, one verdict.
Kimi K2.6 vs Claude Opus 4.7
Five identical prompts, two harnesses. A frontier closed model against an open-weight underdog.
Welcome to the blog
A short hello and what to expect from the writing here.
Bandwidth, not FLOPS
Why memory bandwidth dictates local LLM throughput more than raw compute, and what that means for hardware choice.