Data mesh in practice: year 2 retrospective

Simor Consulting | 16 Jun, 2026 | 05 Mins read

An insurance company with $400 million in premium volume adopted data mesh two years ago. The central data team had become a bottleneck. Every business unit — claims, underwriting, actuarial, and distribution — submitted data requests to a single queue. The queue was six months long. Business units were building shadow data pipelines in spreadsheets and Access databases because the central team could not serve them fast enough.

The CDO read the data mesh paper, consulted with two firms, and committed to the four principles: domain ownership of data, data as a product, self-serve data platform, and federated computational governance. The central data team was restructured. Each business unit was assigned a data product owner. The platform team built self-service tooling for data publishing, discovery, and access control.

Two years in, the organization has clear wins, clear failures, and a set of trade-offs that were not in the original vision.

What worked

Domain ownership eliminated the queue. Each business unit now owns its data products and publishes them on its own timeline. Claims publishes settlement data within one business day of the transaction. Underwriting publishes risk score distributions weekly. Actuarial publishes loss ratio tables monthly. Before data mesh, all of these went through the central team, which processed them in priority order based on who had the most urgent request.

Data quality improved in an unexpected way. Once the claims team owned its data products, it started caring about data quality in its source systems. Under the old model, the central team cleaned claims data as part of the ETL pipeline. The claims team had no incentive to fix data quality issues at the source, because the central team would catch and correct them. Under data mesh, the claims team publishes its own data products. If the data is wrong, downstream consumers notice, and the claims team gets the support tickets. The feedback loop from consumer to producer created accountability that the centralized model never achieved.

Data discovery improved. The self-serve platform included a data catalog where each team registered their data products with schemas, descriptions, and quality metrics. Business analysts who previously spent days tracking down the right dataset could now search the catalog, read the data product documentation, and request access in minutes. The catalog became the most used tool in the data platform.
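To make the catalog idea concrete, here is a minimal sketch of what a registered data product and a catalog search could look like. It is not the insurer's actual tooling: the `DataProduct` and `Catalog` classes, their field names, the product names, and the metric value are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DataProduct:
    """One catalog entry. Field names are illustrative assumptions."""
    name: str
    owner_domain: str
    description: str
    schema: dict[str, str]  # column name -> type
    quality_metrics: dict[str, float] = field(default_factory=dict)


class Catalog:
    """In-memory stand-in for a data catalog (hypothetical)."""

    def __init__(self) -> None:
        self._products: dict[str, DataProduct] = {}

    def register(self, product: DataProduct) -> None:
        if product.name in self._products:
            raise ValueError(f"{product.name!r} is already registered")
        self._products[product.name] = product

    def search(self, term: str) -> list[DataProduct]:
        """Case-insensitive match on name, description, or column names."""
        needle = term.lower()
        return [
            p for p in self._products.values()
            if needle in p.name.lower()
            or needle in p.description.lower()
            or any(needle in col.lower() for col in p.schema)
        ]


catalog = Catalog()
catalog.register(DataProduct(
    name="claims.settlements",  # hypothetical product name
    owner_domain="claims",
    description="Settled claims, published within one business day",
    schema={"claim_id": "string", "settled_amount": "decimal", "settled_on": "date"},
    quality_metrics={"completeness": 0.99},  # made-up value
))
catalog.register(DataProduct(
    name="actuarial.loss_ratios",  # hypothetical product name
    owner_domain="actuarial",
    description="Monthly loss ratio tables",
    schema={"line_of_business": "string", "loss_ratio": "decimal"},
))

print([p.name for p in catalog.search("settle")])
```

The point of the sketch is that schema, description, and quality metrics travel with the product, so an analyst can judge a dataset from the search result without asking its owner.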

What did not work

Federated governance was the principle that came closest to failing. The original vision was that governance policies — data quality standards, access control rules, retention policies — would be defined centrally and enforced computationally at the platform level. In practice, the computational governance tooling was immature. Access control worked. Quality enforcement did not. Teams published data products that did not meet the centrally defined quality standards, and the platform had no mechanism to block publication.

The result was a two-tier data catalog. Seventy percent of data products met the quality standards and were trusted by consumers. Thirty percent did not, and consumers learned to check the quality metrics before using a data product. This created a manual quality assessment step that the self-serve platform was supposed to eliminate.
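The missing mechanism, a gate that refuses publication when a product falls short of the central standards, could be sketched roughly as follows. This is an assumption about what such a check might look like, not a description of the platform in this case study. The metric names, thresholds, and product names are invented.

```python
from dataclasses import dataclass
from typing import Sequence


class PublicationBlocked(Exception):
    """Raised when a data product fails the central quality standards."""


@dataclass(frozen=True)
class QualityStandard:
    metric: str
    minimum: float


# Hypothetical central standards; metric names and thresholds are made up.
CENTRAL_STANDARDS = (
    QualityStandard("completeness", 0.98),
    QualityStandard("freshness_sla_met", 0.95),
)


def failed_standards(metrics: dict[str, float],
                     standards: Sequence[QualityStandard]) -> list[str]:
    """Return one message per standard that is unreported or below threshold."""
    failures = []
    for std in standards:
        value = metrics.get(std.metric)
        if value is None:
            failures.append(f"{std.metric}: not reported")
        elif value < std.minimum:
            failures.append(f"{std.metric}: {value} < {std.minimum}")
    return failures


def publish(name: str, metrics: dict[str, float],
            standards: Sequence[QualityStandard] = CENTRAL_STANDARDS) -> str:
    """Publish only if every standard passes; otherwise block with reasons."""
    failures = failed_standards(metrics, standards)
    if failures:
        raise PublicationBlocked(f"{name} blocked: " + "; ".join(failures))
    return f"{name} published"


print(publish("claims.settlements",
              {"completeness": 0.99, "freshness_sla_met": 0.97}))
try:
    publish("distribution.agent_sales", {"completeness": 0.91})
except PublicationBlocked as exc:
    print(exc)
```

Treating an unreported metric as a failure is a deliberate choice in this sketch: it keeps a team from passing the gate by simply not measuring.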

The second failure was skill distribution. The data mesh model assumes that each domain team has the skills to build and maintain data products. The claims team did. The underwriting team partially did. The distribution team did not. They had business analysts who could use data tools but not build data pipelines. The central data team ended up building the distribution team’s data products for them, which recreated the bottleneck that data mesh was supposed to eliminate.

The third failure was cross-domain data products. Some of the most valuable analytics spanned multiple domains — a combined view of claims, underwriting, and actuarial data for loss ratio analysis. Under the centralized model, the central team built these cross-domain views. Under data mesh, no single domain owned the cross-domain product. The original design called for “virtual domain teams” that would assemble cross-domain data products. In practice, these teams were never staffed because no business unit wanted to assign their people to a team they did not control.

What we learned

Data mesh works when the domains are genuinely independent. Claims data, underwriting data, and actuarial data have natural boundaries. Each domain has its own source systems, its own consumers, and its own quality standards. Domain ownership for these products is natural and sustainable.

Data mesh struggles when the domains are interdependent. Cross-domain analytics, shared entity resolution, and enterprise-wide reporting do not fit neatly into domain ownership. These capabilities require either a central team or a collaborative model that the federated structure does not naturally support.

The practical compromise: a thin central team that focuses exclusively on cross-domain data products and governance tooling. This team is small — four people, compared to the eighteen-person central team before data mesh. Their mandate is narrow: build and maintain the cross-domain data products that no single domain can own, and invest in governance tooling that makes the federated model self-enforcing rather than self-policed.

Results at year two

Average fulfillment time for domain-specific data requests dropped from six months to two weeks. Cross-domain requests still take four to eight weeks because they require coordination between domains, but even that is faster than the old centralized queue.

Data quality incidents increased in the first year as domains took ownership and discovered issues that the central team had been silently correcting. By year two, incident rates dropped below pre-mesh levels because the source systems improved. The first-year spike was painful but necessary — it surfaced quality debt that had been hidden by the centralized cleaning pipeline.

Annual data platform cost decreased by fifteen percent, primarily because the central team was smaller and the domains absorbed data engineering costs into their own budgets. Total organizational spend on data increased by eight percent, reflecting the investment in domain data teams. Net effect: the organization spent slightly more on data but delivered significantly more data products.

The decision heuristic

Adopt data mesh when your bottleneck is the central team’s capacity to serve domain-specific requests, and when your domains have genuinely independent data sources and consumers. Do not adopt data mesh when your primary analytics workload is cross-domain, when your domains lack data engineering skills, or when your governance tooling cannot enforce quality standards computationally. Data mesh trades central coordination costs for distributed coordination costs. If your organization cannot absorb the distributed coordination overhead, the central team may actually be the more efficient model, despite the queue.
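The heuristic above can be written down as a simple checklist. The sketch below is one possible encoding, with every condition reduced to a yes/no answer. The `OrgProfile` fields and the `mesh_blockers` function are hypothetical names, and a real assessment would be less binary than this.

```python
from dataclasses import dataclass


@dataclass
class OrgProfile:
    """Yes/no answers to the heuristic's conditions (hypothetical fields)."""
    central_team_is_bottleneck: bool
    domains_have_independent_sources: bool
    workload_mostly_cross_domain: bool
    domains_have_data_engineering_skills: bool
    governance_enforced_computationally: bool


def mesh_blockers(org: OrgProfile) -> list[str]:
    """Return the reasons the heuristic argues against data mesh.

    An empty list means none of the stated conditions rules it out.
    """
    blockers = []
    if not org.central_team_is_bottleneck:
        blockers.append("central team capacity is not the bottleneck")
    if not org.domains_have_independent_sources:
        blockers.append("domains lack independent sources and consumers")
    if org.workload_mostly_cross_domain:
        blockers.append("primary analytics workload is cross-domain")
    if not org.domains_have_data_engineering_skills:
        blockers.append("domains lack data engineering skills")
    if not org.governance_enforced_computationally:
        blockers.append("governance tooling cannot enforce quality standards")
    return blockers


# Example: a made-up organization whose domains cannot yet build pipelines.
example = OrgProfile(
    central_team_is_bottleneck=True,
    domains_have_independent_sources=True,
    workload_mostly_cross_domain=False,
    domains_have_data_engineering_skills=False,
    governance_enforced_computationally=True,
)
for reason in mesh_blockers(example):
    print("Blocker:", reason)
```

Returning the list of blockers rather than a single yes or no keeps the output actionable: each entry names a gap to close before adoption.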
