Simor

Data Infrastructure for Production AI

Practical writing on AI data engineering, feature stores, and the infrastructure choices that determine whether AI systems work in production.

How we killed our ETL pipeline (and productivity went up)
How we killed our ETL pipeline (and productivity went up)
26 May, 2026 | 05 Mins read

A B2B SaaS company running a customer success platform had a data pipeline that consumed sixty percent of the data engineering team's time. Not feature work. Not analytics. Pipeline maintenance. The p

Why your AI team needs philosophers, not just engineers
Why your AI team needs philosophers, not just engineers
25 May, 2026 | 05 Mins read

A hiring manager at a large tech company told me they had four hundred engineers working on their AI platform and zero people with training in philosophy, ethics, or the social sciences. When I asked

A cost optimization framework for LLM inference
A cost optimization framework for LLM inference
24 May, 2026 | 06 Mins read

LLM inference costs follow a pattern that catches teams off guard. The first prototype costs almost nothing -- a few hundred dollars a month during development. The pilot scales to a few thousand. Pro

Conference report: key takeaways from Data Council 2026
Conference report: key takeaways from Data Council 2026
23 May, 2026 | 04 Mins read

Data Council 2026 wrapped in Austin last week, and the signal-to-noise ratio was higher than in recent years. The conference has historically been the venue where data infrastructure practitioners — n

AI Safety: The Seatbelt
AI Safety: The Seatbelt
22 May, 2026 | 09 Mins read

You put on your seatbelt every time you get in a car. You hope never to need it. If you do need it, you want it to work. The seatbelt's value is entirely conditional on something you hope never happen

Prompt Engineering as Infrastructure: Version Control, Testing, and Deployment
Prompt Engineering as Infrastructure: Version Control, Testing, and Deployment
22 May, 2026 | 11 Mins read

Prompts are not prompts in the casual sense of suggestions or starting points. They are software. They take inputs, produce outputs, have failure modes that manifest in specific conditions, and requir

Real-time streaming: Kafka vs Redpanda vs Pulsar
Real-time streaming: Kafka vs Redpanda vs Pulsar
21 May, 2026 | 05 Mins read

Kafka has dominated event streaming for a decade. It processes trillions of messages daily across thousands of companies. Its dominance created an ecosystem so large that "streaming" became synonymous

Feature store comparison: Feast, Tecton, Hopsworks
Feature store comparison: Feast, Tecton, Hopsworks
20 May, 2026 | 05 Mins read

Feature stores solve a specific problem: the features you use to train a model must be the same features you use to serve it. When the training pipeline computes features differently than the serving

Building an AI operating system for a 10,000-person company
Building an AI operating system for a 10,000-person company
19 May, 2026 | 05 Mins read

A diversified industrial company with 10,000 employees across manufacturing, logistics, and field services had accumulated forty-seven separate AI projects over three years. Each business unit had bui