Conference report: key takeaways from Data Council 2026

Simor Consulting | 23 May, 2026 | 04 Mins read

Data Council 2026 wrapped in Austin last week, and the signal-to-noise ratio was higher than in recent years. The conference has historically been the venue where data infrastructure practitioners — not vendors, not analysts — discuss what actually works in production. This year, three themes dominated the talks, hallway conversations, and unconference sessions.

Theme One: The Metadata Platform Is Dead

The most provocative claim came from a joint talk by two platform leads from mid-stage startups: the metadata platform as a product category is finished. Not because metadata does not matter, but because metadata has been absorbed into the data platforms themselves.

Three years ago, the data catalog was a separate product. You adopted Alation, Atlan, or DataHub to manage your metadata. The argument at Data Council 2026 was that this category is collapsing because Snowflake, Databricks, and BigQuery now provide built-in metadata management, data lineage, and data discovery. The standalone metadata platform made sense only when the data warehouse did not track its own metadata.

The counterargument, voiced in the unconference, is that cross-platform metadata still requires a dedicated tool. If your data is spread across Snowflake, S3, a PostgreSQL instance, and a Kafka cluster, no single platform’s built-in metadata covers your full data landscape. The reality is probably a middle ground: built-in metadata for single-platform shops, cross-platform tools for heterogeneous environments, and the standalone catalog product as a shrinking market.

The practitioner takeaway: if you are evaluating a metadata platform, ask whether the problem is truly cross-platform or whether your primary data warehouse has added the capabilities you need since you last evaluated.

Theme Two: Data Contracts Are Finally Getting Teeth

Data contracts have been discussed at every data conference for three years. The consistent complaint has been that contracts are a good idea with no enforcement mechanism. At Data Council 2026, three teams presented production implementations where data contracts are enforced at the schema level, with automated breaking-change detection and producer-side CI gates.

The pattern that works: define the contract as a schema with explicit quality guarantees (freshness, completeness, uniqueness). Store the contract in version control alongside the producing service. Add a CI check that compares the contract against the actual output schema. When a producer changes its output in a way that violates the contract, the CI check fails and the change cannot be merged.
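As a rough illustration of the CI check described above, the sketch below compares a contract stored in version control against a producer's proposed output schema. It is a minimal example under stated assumptions, not any presenting team's implementation: the `Column` type, the `breaking_changes` function, and the `orders` contract are all hypothetical names invented for this sketch, and it treats added columns as non-breaking.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    name: str
    dtype: str
    nullable: bool = False


def breaking_changes(contract: list[Column], actual: list[Column]) -> list[str]:
    """Return readable contract violations; an empty list means compatible."""
    actual_by_name = {col.name: col for col in actual}
    violations = []
    for expected in contract:
        found = actual_by_name.get(expected.name)
        if found is None:
            violations.append(f"missing column: {expected.name}")
            continue
        if found.dtype != expected.dtype:
            violations.append(
                f"type changed: {expected.name} {expected.dtype} -> {found.dtype}"
            )
        if found.nullable and not expected.nullable:
            violations.append(f"became nullable: {expected.name}")
    # Columns that exist in the output but not in the contract are treated
    # as additive, non-breaking changes in this sketch (an assumption).
    return violations


# Hypothetical contract for an "orders" data product.
ORDERS_CONTRACT = [
    Column("order_id", "string"),
    Column("amount_cents", "int64"),
    Column("created_at", "timestamp"),
]

# Hypothetical output schema after a producer change.
proposed_output = [
    Column("order_id", "string"),
    Column("amount_cents", "float64"),  # type change: breaking
    Column("currency", "string"),       # new column: additive
]

violations = breaking_changes(ORDERS_CONTRACT, proposed_output)
for violation in violations:
    print(violation)
```

In a CI job, a non-empty `violations` list would translate into a non-zero exit code so that the change cannot be merged. How the actual output schema is obtained (from a schema registry, a warehouse information schema, or a sample file) is left open here.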

This is not a new idea. What is new is that the tooling has matured to the point where implementation is practical. Schema registries, data observability platforms, and pipeline frameworks now provide the hooks needed for contract enforcement. The teams that presented had each built their enforcement in under a month.

The practitioner takeaway: if you have been waiting for data contracts to mature, the tooling is ready. Start with your highest-value data products, define contracts as schemas with quality SLAs, and enforce them in CI.
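The quality SLAs mentioned above (freshness, completeness, uniqueness) can also be expressed as small, testable checks. The following is a hedged sketch rather than a reference implementation: `check_quality`, its parameters, and the sample rows are assumptions made for illustration, and a real check would typically run as a query against the warehouse rather than over rows in memory.

```python
from datetime import datetime, timedelta, timezone


def check_quality(rows, *, key, required, max_age, now):
    """Return the names of the quality guarantees that a batch of rows fails."""
    failures = []
    # Freshness: the newest row must be no older than max_age.
    newest = max((row["updated_at"] for row in rows), default=None)
    if newest is None or now - newest > max_age:
        failures.append("freshness")
    # Completeness: every required field must be non-null in every row.
    if any(row.get(field) is None for row in rows for field in required):
        failures.append("completeness")
    # Uniqueness: the key column must not contain repeated values.
    keys = [row[key] for row in rows]
    if len(keys) != len(set(keys)):
        failures.append("uniqueness")
    return failures


# Hypothetical sample batch; "now" is fixed so the example is deterministic.
now = datetime(2026, 5, 23, 12, 0, tzinfo=timezone.utc)
rows = [
    {"order_id": "a1", "amount_cents": 1200, "updated_at": now - timedelta(minutes=10)},
    {"order_id": "a2", "amount_cents": None, "updated_at": now - timedelta(minutes=5)},
    {"order_id": "a2", "amount_cents": 900, "updated_at": now - timedelta(minutes=5)},
]

failures = check_quality(
    rows,
    key="order_id",
    required=["order_id", "amount_cents"],
    max_age=timedelta(hours=1),
    now=now,
)
print(failures)  # ['completeness', 'uniqueness']
```

The batch is fresh (the newest row is five minutes old), but it has a null `amount_cents` and a repeated `order_id`, so two of the three guarantees fail.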

Theme Three: The AI Data Pipeline Is Not Your ETL Pipeline

The third dominant theme was the distinction between traditional ETL pipelines and AI data pipelines. The argument, made most clearly by a principal engineer from a large fintech, is that teams are making a category error when they try to run AI workloads on their existing data infrastructure.

Traditional ETL pipelines move structured data from source to destination on a schedule. AI data pipelines manage unstructured data, embedding generation, vector index updates, model training data curation, and evaluation dataset maintenance. The throughput, latency, and quality requirements are different. The failure modes are different. The monitoring requirements are different.

Several speakers argued that the “AI data pipeline” should be treated as a separate infrastructure concern with its own tooling, monitoring, and ownership. Trying to force AI data flows through Airflow DAGs designed for batch SQL transformations creates fragility that does not appear until the system is under load.

The practitioner takeaway: if you are building RAG pipelines, fine-tuning workflows, or agent data flows, evaluate whether your existing pipeline infrastructure is the right tool. In many cases, purpose-built AI data pipeline tools — vector database change feeds, embedding pipelines, evaluation harnesses — are more appropriate than extending your ETL platform.
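To make the contrast with scheduled batch ETL concrete, here is a minimal sketch of an embedding pipeline driven by change events instead of a nightly run. Everything in it is an assumption for illustration: `fake_embed` stands in for a real embedding model call, `EmbeddingIndex` is a toy in-memory index rather than a vector database, and the event shape (`op`, `id`, `text`) is invented for this example.

```python
import hashlib


def fake_embed(text: str) -> list[float]:
    """Stand-in for a real embedding model call (an assumption of this sketch)."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [byte / 255 for byte in digest[:4]]


class EmbeddingIndex:
    """Toy in-memory vector index keyed by document id."""

    def __init__(self):
        self.vectors = {}
        self.hashes = {}
        self.embed_calls = 0

    def apply(self, event: dict) -> str:
        """Apply one change event and report what was done."""
        doc_id = event["id"]
        if event["op"] == "delete":
            self.vectors.pop(doc_id, None)
            self.hashes.pop(doc_id, None)
            return "deleted"
        content_hash = hashlib.sha256(event["text"].encode("utf-8")).hexdigest()
        if self.hashes.get(doc_id) == content_hash:
            return "skipped"  # unchanged content: no need to re-embed
        self.vectors[doc_id] = fake_embed(event["text"])
        self.hashes[doc_id] = content_hash
        self.embed_calls += 1
        return "embedded"


# Hypothetical change feed for a small document collection.
events = [
    {"op": "upsert", "id": "doc-1", "text": "refund policy v1"},
    {"op": "upsert", "id": "doc-2", "text": "shipping policy"},
    {"op": "upsert", "id": "doc-1", "text": "refund policy v1"},  # redelivery
    {"op": "upsert", "id": "doc-1", "text": "refund policy v2"},
    {"op": "delete", "id": "doc-2"},
]

index = EmbeddingIndex()
results = [index.apply(event) for event in events]
print(results)            # ['embedded', 'embedded', 'skipped', 'embedded', 'deleted']
print(index.embed_calls)  # 3
```

The design choice worth noting is that work is keyed on content changes, not on a schedule: a redelivered event costs nothing, and a delete removes the vector immediately instead of leaving stale results in the index until the next batch.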

The Unconference Signal

The unconference sessions surfaced two concerns that did not make it into formal talks but dominated informal discussion.

First, the cost of running AI workloads is creating budget pressure that is forcing data teams to make difficult prioritization decisions. Several teams reported that their AI infrastructure costs exceeded their traditional data warehouse costs for the first time in Q1 2026, and that leadership was questioning the ROI of AI investments that had been approved with optimistic projections.

Second, the talent market for data engineers with AI infrastructure skills is extremely tight. Teams are struggling to hire engineers who understand both traditional data systems and the new AI stack. The gap is not in ML engineering (building models) but in data engineering for AI (building the infrastructure that models run on).

Bounded Recommendation

The most actionable signal from Data Council 2026 is that data contracts are ready for production adoption. If you have been considering contracts, stop considering and start implementing. The second actionable signal is that AI data pipelines need dedicated infrastructure and ownership. If your AI data flows are running on your ETL platform as a temporary measure, the temporary measure has expired.

