Data mesh in practice: year 2 retrospective

Simor Consulting | 16 Jun, 2026 | 05 Mins read

An insurance company with $400 million in premium volume adopted data mesh two years ago. The central data team had become a bottleneck. Every business unit — claims, underwriting, actuarial, and distribution — submitted data requests to a single queue. The queue was six months long. Business units were building shadow data pipelines in spreadsheets and Access databases because the central team could not serve them fast enough.

The CDO read the data mesh paper, consulted with two firms, and committed to the four principles: domain ownership of data, data as a product, self-serve data platform, and federated computational governance. The central data team was restructured. Each business unit was assigned a data product owner. The platform team built self-service tooling for data publishing, discovery, and access control.

Two years in, the organization has clear wins, clear failures, and a set of trade-offs that were not in the original vision.

What worked

Domain ownership eliminated the queue. Each business unit now owns its data products and publishes them on its own timeline. Claims publishes settlement data within one business day of transaction. Underwriting publishes risk score distributions weekly. Actuarial publishes loss ratio tables monthly. Before data mesh, all of these went through the central team, which processed them in priority order based on who had the most urgent request.

Data quality improved in an unexpected way. Once the claims team owned its data products, it started caring about data quality in its source systems. Under the old model, the central team cleaned claims data as part of the ETL pipeline, so the claims team had no incentive to fix quality issues at the source: the central team would catch and correct them. Under data mesh, the claims team publishes its own data products. If the data is wrong, downstream consumers notice, and the claims team gets the support tickets. This feedback loop from consumer to producer created accountability that the centralized model never achieved.

Data discovery improved. The self-serve platform included a data catalog where each team registered their data products with schemas, descriptions, and quality metrics. Business analysts who previously spent days tracking down the right dataset could now search the catalog, read the data product documentation, and request access in minutes. The catalog became the most used tool in the data platform.
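To make the catalog workflow concrete, here is a minimal sketch of what registering and searching a data product could look like. The article does not describe the actual catalog's API, so the class names, field names, and the quality value below are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DataProduct:
    """Hypothetical catalog entry; field names are illustrative only."""
    name: str
    owner_domain: str
    description: str
    schema: dict[str, str]  # column name -> type
    quality_metrics: dict[str, float] = field(default_factory=dict)


class Catalog:
    """In-memory stand-in for the platform's data catalog."""

    def __init__(self) -> None:
        self._products: dict[str, DataProduct] = {}

    def register(self, product: DataProduct) -> None:
        """Add or overwrite the entry keyed by product name."""
        self._products[product.name] = product

    def search(self, term: str) -> list[DataProduct]:
        """Case-insensitive match on name, description, or column names."""
        needle = term.lower()
        return [
            p for p in self._products.values()
            if needle in p.name.lower()
            or needle in p.description.lower()
            or any(needle in col.lower() for col in p.schema)
        ]


catalog = Catalog()
catalog.register(DataProduct(
    name="claims.settlements",
    owner_domain="claims",
    description="Settled claims, published within one business day.",
    schema={"claim_id": "string", "settled_amount": "decimal"},
    quality_metrics={"completeness": 0.99},  # illustrative value, not from the case
))
hits = catalog.search("settle")
```

The point of the sketch is that the schema, description, and quality metrics travel with the product, so an analyst can judge a dataset from the search result alone.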

What did not work

Federated governance was the principle that came closest to failing. The original vision was that governance policies — data quality standards, access control rules, retention policies — would be defined centrally and enforced computationally at the platform level. In practice, the computational governance tooling was immature. Access control worked. Quality enforcement did not. Teams published data products that did not meet the centrally defined quality standards, and the platform had no mechanism to block publication.
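The missing piece was a publish-time gate. The sketch below shows one way such a gate could work, assuming quality standards are expressed as minimum values per metric. The metric names, thresholds, and function names are hypothetical and are not taken from the platform described here.

```python
from dataclasses import dataclass

# Hypothetical central standards: metric name -> minimum acceptable value.
QUALITY_STANDARDS = {"completeness": 0.95, "freshness": 0.90}


class PublicationBlocked(Exception):
    """Raised when a data product fails the central quality standards."""


@dataclass
class QualityReport:
    product: str
    metrics: dict[str, float]


def violations(report: QualityReport, standards=QUALITY_STANDARDS) -> list[str]:
    """List every standard the report misses, including unreported metrics."""
    found = []
    for metric, minimum in standards.items():
        value = report.metrics.get(metric)
        if value is None:
            found.append(f"{metric}: not reported")
        elif value < minimum:
            found.append(f"{metric}: {value:.2f} < {minimum:.2f}")
    return found


def publish(report: QualityReport, standards=QUALITY_STANDARDS) -> str:
    """Publish only if all standards are met; otherwise block with reasons."""
    problems = violations(report, standards)
    if problems:
        raise PublicationBlocked(f"{report.product}: " + "; ".join(problems))
    return f"published {report.product}"


# Usage: a passing product is published, a failing one is blocked.
ok = QualityReport("claims.settlements", {"completeness": 0.99, "freshness": 0.97})
print(publish(ok))

bad = QualityReport("distribution.leads", {"completeness": 0.80})
try:
    publish(bad)
except PublicationBlocked as exc:
    print(f"blocked: {exc}")
```

Treating an unreported metric as a violation is a deliberate choice in this sketch: otherwise a team could pass the gate simply by not measuring.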

The result was a two-tier data catalog. Seventy percent of data products met the quality standards and were trusted by consumers. Thirty percent did not, and consumers learned to check the quality metrics before using a data product. This created a manual quality assessment step that the self-serve platform was supposed to eliminate.

The second failure was skill distribution. The data mesh model assumes that each domain team has the skills to build and maintain data products. The claims team did. The underwriting team partially did. The distribution team did not. They had business analysts who could use data tools but not build data pipelines. The central data team ended up building the distribution team’s data products for them, which recreated the bottleneck that data mesh was supposed to eliminate.

The third failure was cross-domain data products. Some of the most valuable analytics spanned multiple domains — a combined view of claims, underwriting, and actuarial data for loss ratio analysis. Under the centralized model, the central team built these cross-domain views. Under data mesh, no single domain owned the cross-domain product. The original design called for “virtual domain teams” that would assemble cross-domain data products. In practice, these teams were never staffed because no business unit wanted to assign their people to a team they did not control.

What we learned

Data mesh works when the domains are genuinely independent. Claims data, underwriting data, and actuarial data have natural boundaries. Each domain has its own source systems, its own consumers, and its own quality standards. Domain ownership for these products is natural and sustainable.

Data mesh struggles when the domains are interdependent. Cross-domain analytics, shared entity resolution, and enterprise-wide reporting do not fit neatly into domain ownership. These capabilities require either a central team or a collaborative model that the federated structure does not naturally support.

The practical compromise: a thin central team that focuses exclusively on cross-domain data products and governance tooling. This team is small — four people, compared to the eighteen-person central team before data mesh. Their mandate is narrow: build and maintain the cross-domain data products that no single domain can own, and invest in governance tooling that makes the federated model self-enforcing rather than self-policed.

Results at year two

Average fulfillment time for domain-specific data requests dropped from six months to two weeks. Cross-domain requests still take four to eight weeks because they require coordination between domains, but that is still faster than the old centralized queue.

Data quality incidents increased in the first year as domains took ownership and discovered issues that the central team had been silently correcting. By year two, incident rates dropped below pre-mesh levels because the source systems improved. The first-year spike was painful but necessary — it surfaced quality debt that had been hidden by the centralized cleaning pipeline.

Annual data platform cost decreased by fifteen percent, primarily because the central team was smaller and the domains absorbed data engineering costs into their own budgets. Total organizational spend on data increased by eight percent, reflecting the investment in domain data teams. Net effect: the organization spent slightly more on data but delivered significantly more data products.
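The two percentages can be combined to show how much spend moved into the domains. This is a rough derivation under one assumption the article does not state: that the platform cost is counted inside the total organizational spend. Let P be the pre-mesh platform cost, T the pre-mesh total spend, and D = T - P everything else (the symbols are introduced here for illustration).

```latex
\begin{align*}
P' &= 0.85\,P, \qquad T' = 1.08\,T \\
D' &= T' - P' = 1.08\,T - 0.85\,P \\
D' - D &= (1.08\,T - 0.85\,P) - (T - P) = 0.08\,T + 0.15\,P
\end{align*}
```

Under that assumption, spend outside the platform budget grew by the full eight percent of total spend plus the fifteen percent of platform cost that the platform shed, so the domains' budgets grew by more than the headline eight percent suggests.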

The decision heuristic

Adopt data mesh when your bottleneck is the central team’s capacity to serve domain-specific requests, and when your domains have genuinely independent data sources and consumers. Do not adopt data mesh when your primary analytics workload is cross-domain, when your domains lack data engineering skills, or when your governance tooling cannot enforce quality standards computationally. Data mesh trades central coordination costs for distributed coordination costs. If your organization cannot absorb the distributed coordination overhead, the central team may actually be the more efficient model, despite the queue.
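The heuristic can be written down as a simple checklist. The sketch below is one possible encoding. The field names are hypothetical, and reducing each condition to a yes/no answer is a simplification of what will be a judgment call in practice.

```python
from dataclasses import dataclass


@dataclass
class OrgProfile:
    """Yes/no answers to the questions the heuristic asks (names are illustrative)."""
    central_team_is_bottleneck: bool
    domains_are_independent: bool
    workload_mostly_cross_domain: bool
    domains_have_engineering_skills: bool
    governance_enforced_computationally: bool


def assess(profile: OrgProfile) -> tuple[bool, list[str]]:
    """Return (recommend_mesh, blockers) following the heuristic in the text."""
    blockers = []
    if profile.workload_mostly_cross_domain:
        blockers.append("primary analytics workload is cross-domain")
    if not profile.domains_have_engineering_skills:
        blockers.append("domains lack data engineering skills")
    if not profile.governance_enforced_computationally:
        blockers.append("governance tooling cannot enforce quality standards")

    has_motive = (
        profile.central_team_is_bottleneck and profile.domains_are_independent
    )
    return has_motive and not blockers, blockers


# Usage: independent domains, but one team cannot build its own pipelines.
recommend, blockers = assess(OrgProfile(
    central_team_is_bottleneck=True,
    domains_are_independent=True,
    workload_mostly_cross_domain=False,
    domains_have_engineering_skills=False,
    governance_enforced_computationally=True,
))
print(recommend, blockers)
```

Returning the blockers alongside the verdict matters more than the verdict itself: each blocker names a gap that would have to be closed, or covered by a central team, before adoption.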
