Simor
Data Infrastructure for Production AI
Practical writing on AI data engineering, feature stores, and the infrastructure choices that determine whether AI systems work in production.
A B2B SaaS company running a customer success platform had a data pipeline that consumed sixty percent of the data engineering team's time. Not feature work. Not analytics. Pipeline maintenance. The p
A hiring manager at a large tech company told me they had four hundred engineers working on their AI platform and zero people with training in philosophy, ethics, or the social sciences. When I asked
LLM inference costs follow a pattern that catches teams off guard. The first prototype costs almost nothing -- a few hundred dollars a month during development. The pilot scales to a few thousand. Pro
Data Council 2026 wrapped in Austin last week, and the signal-to-noise ratio was higher than in recent years. The conference has historically been the venue where data infrastructure practitioners — n
You put on your seatbelt every time you get in a car. You hope never to need it. If you do need it, you want it to work. The seatbelt's value is entirely conditional on something you hope never happen
Prompts are not prompts in the casual sense of suggestions or starting points. They are software. They take inputs, produce outputs, have failure modes that manifest in specific conditions, and requir
Kafka has dominated event streaming for a decade. It processes trillions of messages daily across thousands of companies. Its dominance created an ecosystem so large that "streaming" became synonymous
Feature stores solve a specific problem: the features you use to train a model must be the same features you use to serve it. When the training pipeline computes features differently than the serving
A diversified industrial company with 10,000 employees across manufacturing, logistics, and field services had accumulated forty-seven separate AI projects over three years. Each business unit had bui