Linear Group systems research

Inference, systems, and chips

Research notes on low-level systems and GPUs: KV cache management, quantization, PCIe, attention kernels, and disaggregated serving. Implementations, measurements, and implications.

Research notes

Background

Seven years on the Linux kernel at IBM: architected I/O virtualization for POWER (rpadlpar, still in mainline), created librtas, led the international virtualization team, and upstreamed work across 12 kernel releases. Fifteen years of technology strategy and capital allocation at AT&T. Now building inference, GPU, and low-level systems full time.