Simor Consulting

Category: MLOps

Model serving: vLLM, TGI, Triton — which fits your stack?
Model serving: vLLM, TGI, Triton — which fits your stack?
18 Jun, 2026 | 05 Mins read

Serving a language model in production is an infrastructure problem, not a model problem. The model weights are the same regardless of how you serve them. What differs is throughput (how many requests

The $2M model that never made it to production
The $2M model that never made it to production
09 Jun, 2026 | 05 Mins read

A retail chain with 400 stores spent two years and $2.1 million building an inventory optimization model. The model was technically excellent. It reduced predicted stockouts by thirty-two percent and

Feature store comparison: Feast, Tecton, Hopsworks
Feature store comparison: Feast, Tecton, Hopsworks
20 May, 2026 | 05 Mins read

Feature stores solve a specific problem: the features you use to train a model must be the same features you use to serve it. When the training pipeline computes features differently than the serving

AI Observability: Monitoring Drift, Data Quality & Model Performance
AI Observability: Monitoring Drift, Data Quality & Model Performance
12 Sep, 2025 | 02 Mins read

An insurance company's premium pricing model had been quietly going haywire for two weeks. Young drivers in high-risk areas were getting bargain prices while safe drivers faced astronomical quotes. By

Serverless Machine Learning: Patterns with AWS Lambda, GCP Cloud Run & Azure Functions
Serverless Machine Learning: Patterns with AWS Lambda, GCP Cloud Run & Azure Functions
18 Jul, 2025 | 05 Mins read

A social media analytics company watched their Kubernetes cluster fail to handle traffic spikes from trending topics. The cluster would scale from 50 to 500 pods in minutes, but not fast enough to pre

Incremental ML: Continuous Learning Systems
Incremental ML: Continuous Learning Systems
12 Jul, 2024 | 11 Mins read

Traditional ML trains on historical data, deploys, and waits until performance degrades. This fails in dynamic environments where data patterns evolve. Incremental ML continuously updates models as ne

Scaling Machine Learning Infrastructure: From POC to Production
Scaling Machine Learning Infrastructure: From POC to Production
10 May, 2024 | 04 Mins read

# Scaling Machine Learning Infrastructure: From POC to Production Moving a machine learning model from notebook to production exposes gaps that notebooks hide. Data scientists produce working models

Deploying ML Models on Kubernetes: Best Practices
Deploying ML Models on Kubernetes: Best Practices
06 May, 2024 | 03 Mins read

# Deploying ML Models on Kubernetes: Best Practices ML models in production need orchestration, scaling, and monitoring infrastructure. Kubernetes provides these capabilities, though the learning cur

MLOps vs DataOps: Understanding the Differences and Overlaps
MLOps vs DataOps: Understanding the Differences and Overlaps
08 Feb, 2024 | 03 Mins read

DataOps and MLOps both aim to improve reliability and efficiency in data-centric workflows, but they address different parts of the data science lifecycle. Understanding their boundaries helps organiz