Simor

Data Infrastructure for Production AI

Practical writing on AI data engineering, feature stores, and the infrastructure choices that determine whether AI systems work in production.

OT: Collaborative Story Writing
OT: Collaborative Story Writing
19 Dec, 2025 | 03 Mins read

Friends writing a story together, each with their own copy. Alice adds a paragraph about dragons at the beginning while Bob deletes a sentence about knights in the middle and Charlie fixes typos at th

2025 Year-in-Review & 2026 Trends in Data & AI Architecture
2025 Year-in-Review & 2026 Trends in Data & AI Architecture
19 Dec, 2025 | 03 Mins read

2025 was the year AI moved from experimentation to industrialization. While 2024 saw the explosion of generative AI capabilities, 2025 was about making those capabilities production-ready, cost-effect

Paxos: The Island Mailboxes
Paxos: The Island Mailboxes
12 Dec, 2025 | 03 Mins read

Remote islands must agree on decisions—when to hold festivals, which trading routes to use, who leads the council. Messages travel by boat, boats sink, islanders leave for fishing trips. How reach agr

Performance Engineering for GenAI Inference: Batching, Caching & Quantisation
Performance Engineering for GenAI Inference: Batching, Caching & Quantisation
12 Dec, 2025 | 05 Mins read

A startup's GenAI application cost $0.42 per query at 15-second latency. At this rate, their Series A funding would last six months. The problem wasn't the model—it was unoptimized inference. Each req

Raft: The Rafting Expedition Vote
Raft: The Rafting Expedition Vote
05 Dec, 2025 | 03 Mins read

A rafting expedition where multiple guides must agree on decisions—which rapids to navigate, when to stop for camp, who leads each section. Without consensus the expedition fragments. Raft consensus w

Case Study: End-to-End RAG Platform for Customer Support
Case Study: End-to-End RAG Platform for Customer Support
05 Dec, 2025 | 05 Mins read

A SaaS company with 200 support agents and 10,000+ knowledge base articles had an 18-hour average response time and 23% first-contact resolution. Their largest enterprise client threatened to cancel a

Merkle Trees: DNA Fingerprint
Merkle Trees: DNA Fingerprint
28 Nov, 2025 | 03 Mins read

Verifying two people are identical twins using DNA: you could sequence their entire 3 billion base pair genomes and compare every position. Or use genetic fingerprinting: hash specific DNA regions int

Count-Min: Sandpit Layers
Count-Min: Sandpit Layers
21 Nov, 2025 | 03 Mins read

Thousands of children play at a beach, each leaving footprints. Tracking each child's visits individually becomes impossible at scale. Instead, imagine multiple shallow sandpits with different grid pa

AI in Regulated Industries: Compliance Patterns for Finance & Healthcare
AI in Regulated Industries: Compliance Patterns for Finance & Healthcare
21 Nov, 2025 | 04 Mins read

Deploying AI in regulated industries—banks, insurance, healthcare—requires more than technical excellence. A model that's a black box cannot satisfy regulatory requirements for explainability. Trainin