Simor
Data Infrastructure for Production AI
Practical writing on AI data engineering, feature stores, and the infrastructure choices that determine whether AI systems work in production.
AI systems increasingly make decisions that profoundly affect human lives. Healthcare systems deny treatment recommendations based on zip codes. Hiring platforms filter resumes based on gender. Crimin
Russian nesting dolls (Matryoshka) are wooden dolls where each one opens to reveal a smaller doll inside, which opens to reveal another, and so on. Each doll represents an operation in your distribute
AI systems introduce attack vectors that don't exist in traditional software. Unlike conventional applications that process data according to fixed rules, AI models learn from data, making them vulner
A postal service where every postcard has a strict template. The address fields are always in the same spot. The message area has specific sections for specific types of information. Both sender and r
You walk into your favorite coffee shop and order your usual. But instead of ordering, paying, leaving, and coming back when you want another coffee (like HTTP requests), imagine you could just stay a
Data pipelines built for business intelligence often fail when supporting AI workloads. The root cause is usually architectural: BI pipelines assume bounded, relatively static datasets, while AI syste
Most ML projects fail not because of flawed algorithms but because of poor data quality. Data scientists typically spend 80% of their time on data preparation, and even small data quality issues drama
Traditional centralized data architectures worked for BI but struggle with AI workloads. Centralized teams become bottlenecks as data volumes grow. Domain experts who understand the data are separated
Most AI pilots succeed. Most AI production deployments fail. The gap between proof-of-concept and operational AI often traces to one root cause: the inability to compute and serve features in real-tim