Simor

Data Infrastructure for Production AI

Practical writing on AI data engineering, feature stores, and the infrastructure choices that determine whether AI systems work in production.

Knowledge Graphs for Context-Rich AI
25 Jul, 2025 | 04 Mins read

A pharmaceutical company's language model could discuss individual molecules but failed to understand that Drug A inhibited the same enzyme Drug B required for activation—a critical interaction that m

Sharding: The Library Aisle Split
18 Jul, 2025 | 02 Mins read

Central Library started small: one room, one librarian, manageable. Now it holds millions of books. Patrons wait hours. The librarian hasn't slept in weeks. The solution: split the library. Fiction (

Serverless Machine Learning: Patterns with AWS Lambda, GCP Cloud Run & Azure Functions
18 Jul, 2025 | 05 Mins read

A social media analytics company watched their Kubernetes cluster fail to handle traffic spikes from trending topics. The cluster would scale from 50 to 500 pods in minutes, but not fast enough to pre

ACID & BASE: Chemistry Lab Showdown
11 Jul, 2025 | 02 Mins read

Two chemistry labs, different philosophies. ACID lab: Every experiment follows strict protocols. Reactions complete perfectly or not at all. Measurements are exact. Nothing proceeds until everything

Optimising Cloud AI Costs: Rightsizing Compute & Storage
11 Jul, 2025 | 06 Mins read

A fintech startup's cloud bill grew from $50,000 to $800,000 per month in six months. GPU clusters sat idle between training runs. Terabytes of experimental data accumulated in premium storage. Develo

Consistent Hashing: The Pizza Slice Wheel
04 Jul, 2025 | 03 Mins read

Imagine arranging pizza party guests on a circle, dividing it like pizza slices. Each station serves a section. When a guest leaves, only their immediate neighbors shift slightly. The rest stay where

Library Book Whisperer
27 Jun, 2025 | 03 Mins read

A library maintains an unofficial whisper network. A patron asks about a book, and a librarian remembers: "Sarah at the reference desk has it." This network bypasses the official catalog, turning hour

Federated Learning in the Enterprise: Architecture & MLOps
27 Jun, 2025 | 06 Mins read

A hospital network had data from 47 hospitals. They had top data scientists. They could not combine the data. Legal teams cited privacy regulations. Hospital administrators worried about competitive a

Embeddings: GPS for Words
20 Jun, 2025 | 05 Mins read

Embeddings assign numerical coordinates to words and concepts. "Cat" sits near "kitten" and "feline" but far from "airplane." "Paris" neighbors "France" and "Eiffel Tower" but distances itself from "T