Simor

Data Infrastructure for Production AI

Practical writing on AI data engineering, feature stores, and the infrastructure choices that determine whether AI systems work in production.

Responsible AI by Design: Embedding Ethics into Data Architecture
26 Mar, 2025 | 09 Mins read

AI systems increasingly make decisions that profoundly affect human lives. Healthcare systems deny treatment recommendations based on zip codes. Hiring platforms filter resumes based on gender. Crimin

Tracing Spans as Russian Nesting Dolls
21 Mar, 2025 | 03 Mins read

Russian nesting dolls (Matryoshka) are wooden dolls where each one opens to reveal a smaller doll inside, which opens to reveal another, and so on. Each doll represents an operation in your distribute

Securing the AI Supply Chain: From Data Ingestion to Model Deployment
15 Mar, 2025 | 09 Mins read

AI systems introduce attack vectors that don't exist in traditional software. Unlike conventional applications that process data according to fixed rules, AI models learn from data, making them vulner

gRPC Postcards: Typed Messages at Light-Speed
14 Mar, 2025 | 03 Mins read

Imagine a postal service where every postcard has a strict template. The address fields are always in the same spot. The message area has specific sections for specific types of information. Both sender and r

WebSockets: The Persistent Coffee Line
07 Mar, 2025 | 06 Mins read

You walk into your favorite coffee shop and order your usual. But instead of ordering, paying, leaving, and coming back when you want another coffee (like HTTP requests), imagine you could just stay a

Building AI-Ready Data Pipelines: Key Architecture Considerations
04 Mar, 2025 | 02 Mins read

Data pipelines built for business intelligence often fail when supporting AI workloads. The root cause is usually architectural: BI pipelines assume bounded, relatively static datasets, while AI syste

Designing for Data Quality: How to Build Reliable AI Systems
26 Feb, 2025 | 02 Mins read

Most ML projects fail not because of flawed algorithms but because of poor data quality. Data scientists are often estimated to spend around 80% of their time on data preparation, and even small data quality issues drama

From Data Silos to Data Mesh: The Evolution of Enterprise Data Architecture
15 Feb, 2025 | 03 Mins read

Traditional centralized data architectures worked for BI but struggle with AI workloads. Centralized teams become bottlenecks as data volumes grow. Domain experts who understand the data are separated

Real-Time Feature Engineering: The Key to Operational AI Systems
05 Feb, 2025 | 02 Mins read

Most AI pilots succeed. Most AI production deployments fail. The gap between proof-of-concept and operational AI often traces to one root cause: the inability to compute and serve features in real-tim