Simor

Data Infrastructure for Production AI

Practical writing on AI data engineering, feature stores, and the infrastructure choices that determine whether AI systems work in production.

HyperLogLog: Counting Crowd with Drones
14 Nov, 2025 | 03 Mins read

Counting attendees at a massive festival is hard. Counting every individual requires massive infrastructure when there are millions of attendees. Sampling small areas and extrapolating fails when the crowd is unevenly distributed. Th

Benchmarking Vector Databases: Performance, Cost & Ecosystem
14 Nov, 2025 | 05 Mins read

A RAG application that works perfectly with toy datasets grinds to a halt at production scale. The vector database that benchmarked beautifully with 10K vectors performs terribly at 10M. The one that

Tries: The Word Ladder
07 Nov, 2025 | 03 Mins read

Word ladder games start with "CAT", change one letter to get "COT", then "DOT", then "DOG". Now imagine all possible words connected in a web where shared prefixes create natural pathways. That's a tr

Semantic Layers & Metrics Stores: dbt Semantic Layer, Cube, Transform
07 Nov, 2025 | 05 Mins read

Every team has their own definition of "revenue." The CFO calculates it one way, marketing another, and product a third. Each calculation is technically correct—they just use different definitions, ti

B+ Trees: Organised Bookshelf
31 Oct, 2025 | 03 Mins read

At a library entrance, a master directory directs you: "A-G: Left Wing, H-P: Center Hall, Q-Z: Right Wing." You head to the Right Wing where another sign says "Q-S: Aisle 1-3, T-V: Aisle 4-6." Followi

SIMD: The Parallel Pizza Cutter
24 Oct, 2025 | 03 Mins read

Picture a pizza shop on Friday night. Method one: single pizza cutter, cut one line at a time, eight cuts for eight slices. Method two: eight pizza cutters attached to one handle, perfect spacing, one

Multimodal AI Systems: Combining Text, Image & Audio Data
24 Oct, 2025 | 06 Mins read

Human communication is multimodal: we gesture while speaking, draw diagrams while explaining, and understand meaning through the interplay of sensory inputs. Yet most AI systems operate in silos—compu

mmap: Library Reading Room
17 Oct, 2025 | 04 Mins read

Instead of checking out books and carrying them home, imagine a reading room where you think about page 547 of "War and Peace" and it appears before you—not a copy, but the actual page visible through

Zero-Copy: Passing The Plate
10 Oct, 2025 | 04 Mins read

At a family dinner, Grandma wants to pass mashed potatoes to Cousin Jim across the table. The inefficient approach: Grandma scoops potatoes onto her plate, passes to Uncle Bob, who scoops onto his pla