Simor

Data Infrastructure for Production AI

Practical writing on AI data engineering, feature stores, and the infrastructure choices that determine whether AI systems work in production.

Metadata-Driven ELT: Designing Declarative Pipelines
15 Aug, 2025 | 03 Mins read

A data engineer at an e-commerce company stared at a mess of SQL scripts, Python notebooks, and configuration files. What started as a simple ETL job had mutated into a hydra of interdependent transfo

Backpressure: Traffic Lights on a Bridge
08 Aug, 2025 | 02 Mins read

A narrow bridge holds 50 cars safely. When car 51 tries to enter, the light turns red. Cars queue on the approach road, then the streets leading to it, then the highways beyond. The bridge is protect

DataOps Automation with Dagster, Prefect 2 & Airflow 2
08 Aug, 2025 | 04 Mins read

A fintech company's data platform ground to a halt when a schema change cascaded through dozens of pipelines. Their homegrown orchestration system—a maze of cron jobs and bash scripts—offered no visib

Exactly-Once: The Registered Letter
01 Aug, 2025 | 02 Mins read

You're sending a $10,000 check. Regular mail might get lost. Send two copies, and the recipient might cash both. What you need: tracked, signed for, proof of delivery. Your check arrives exactly once. Not zer

Time-Series Forecasting Pipelines: From TSDB to Model Monitoring
01 Aug, 2025 | 04 Mins read

An energy company's AI predicted electricity demand would peak at 6 PM, as typical. The first game of the World Cup had millions turning on TVs at 4 PM, creating an unprecedented spike their models co

Kafka Ordering: Single-File Parade
25 Jul, 2025 | 02 Mins read

A parade where everyone maintains exact position. The drummer at position 10 stays at position 10. The flag bearer at position 50 remains at position 50. Even if they take breaks, when they reassemble

Knowledge Graphs for Context-Rich AI
25 Jul, 2025 | 04 Mins read

A pharmaceutical company's language model could discuss individual molecules but failed to understand that Drug A inhibited the same enzyme Drug B required for activation—a critical interaction that m

Sharding: The Library Aisle Split
18 Jul, 2025 | 02 Mins read

Central Library started small: one room, one librarian, manageable. Now it holds millions of books. Patrons wait hours. The librarian hasn't slept in weeks. The solution: split the library. Fiction (

Serverless Machine Learning: Patterns with AWS Lambda, GCP Cloud Run & Azure Functions
18 Jul, 2025 | 05 Mins read

A social media analytics company watched their Kubernetes cluster fail to handle traffic spikes from trending topics. The cluster would scale from 50 to 500 pods in minutes, but not fast enough to pre