DATA ARCHITECTURE

Data Contracts, Schema Evolution & Governance

Chapter 21 · Specialized & Applied · Governance

Orientation

What You'll Master Here

Every model, mart, feature, and data product in this course depends on upstream data behaving as expected. The moment an upstream team renames a column or changes what a field means, downstream consumers can break, often silently. This chapter is about making those dependencies explicit and safe to change: data contracts, schema evolution, and governance.

You will learn what a data contract really is (far more than a schema), the producer-consumer relationship it formalizes, how to evolve schemas without breaking consumers (backward/forward/full compatibility), the expand-and-contract pattern for safe migrations, and how contracts are enforced with lineage and quality governance.

This is shown with concrete contract specs and schema diffs, because the difference between a breaking change and a safe one is precise, and getting it wrong is how a "small" upstream edit takes down a dozen dashboards at once.
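To make the backward/forward/full distinction concrete, here is a minimal sketch of a compatibility check. It assumes a deliberately simplified rule in the spirit of Avro-style schema resolution: a reader can decode data if every field it expects is either present with the same type or has a default. The names Field, can_read, and compatibility are hypothetical and exist only for this illustration; real schema registries also handle type promotion, nesting, and aliases.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    type: str
    has_default: bool = False  # reader can fill the field in when it is absent


Schema = dict[str, Field]


def can_read(reader: Schema, writer: Schema) -> bool:
    """True if a reader using `reader` can decode data written with `writer`."""
    for name, expected in reader.items():
        written = writer.get(name)
        if written is None:
            if not expected.has_default:
                return False  # required field is missing from the data
        elif written.type != expected.type:
            return False  # simplified: any type change counts as breaking
    return True


def compatibility(old: Schema, new: Schema) -> str:
    backward = can_read(reader=new, writer=old)  # new readers, old data
    forward = can_read(reader=old, writer=new)   # old readers, new data
    if backward and forward:
        return "full"
    if backward:
        return "backward"
    if forward:
        return "forward"
    return "none"


old = {"id": Field("int"), "full_name": Field("string")}
renamed = {"id": Field("int"), "name": Field("string")}
added_optional = {**old, "email": Field("string", has_default=True)}

print(compatibility(old, renamed))         # none
print(compatibility(old, added_optional))  # full
```

Under these assumed rules, a bare rename is incompatible in both directions, while adding a field with a default is safe for old and new readers alike.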

Why it matters

Undeclared data dependencies are the silent killer of data platforms: an upstream change breaks downstream with no warning. Contracts and disciplined evolution turn fragile implicit dependencies into explicit, safe ones.

Core mental model

A data contract is an API for data: a producer’s explicit promise to consumers. Schema evolution is how you change that API without breaking callers.

Key terms

data contract: A producer’s explicit promise about schema, semantics, quality, and ownership.
schema evolution: Changing a schema over time without breaking existing consumers.
compatibility: Whether a schema change still works for old/new readers (backward/forward/full).
lineage: The traced flow of data from sources through transformations to outputs.
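A contract that covers schema, semantics, quality, and ownership might be written down as plain data. The sketch below is one possible shape, not a standard format: the class names, the customers dataset, the owner address, and the freshness threshold are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    type: str               # schema: the declared type
    meaning: str            # semantics: what the value represents
    nullable: bool = False  # quality: whether missing values are allowed


@dataclass(frozen=True)
class DataContract:
    dataset: str
    version: str
    owner: str                    # ownership: who answers for this data
    columns: dict[str, Column]    # schema and semantics per column
    max_staleness_hours: int      # quality: freshness promise
    unique_key: tuple[str, ...] = ()  # quality: uniqueness promise


# Hypothetical contract for a customers table.
customers_v1 = DataContract(
    dataset="customers",
    version="1.0.0",
    owner="crm-team@example.com",
    columns={
        "id": Column("int", "Stable customer identifier"),
        "full_name": Column("string", "Customer's display name"),
    },
    max_staleness_hours=24,
    unique_key=("id",),
)

print(sorted(customers_v1.columns))  # ['full_name', 'id']
```

Keeping the meaning next to the type is the point: a column can keep its name and type while its meaning changes, and only the contract records that promise.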

Common mistake

Treating upstream tables as stable just because they exist.

A silent upstream rename or type change breaks downstream models with no error at the boundary.
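The failure mode can be sketched in a few lines. Without a check, a consumer that reads a renamed column just gets missing values; with a check at the boundary, the same change fails loudly. This is a minimal illustration on in-memory rows: ContractViolation, EXPECTED, and check_rows are hypothetical names, and the check ignores nullability and other details a real validator would need.

```python
class ContractViolation(Exception):
    """Raised when incoming rows do not match the expected columns."""


# What the consumer assumes about the upstream table.
EXPECTED = {"id": int, "full_name": str}


def check_rows(rows, expected):
    """Validate rows at the boundary; return them unchanged if they pass."""
    for i, row in enumerate(rows):
        missing = expected.keys() - row.keys()
        if missing:
            raise ContractViolation(f"row {i}: missing columns {sorted(missing)}")
        for col, typ in expected.items():
            if not isinstance(row[col], typ):
                raise ContractViolation(
                    f"row {i}: column {col!r} is not {typ.__name__}"
                )
    return rows


# Upstream silently renamed full_name to name.
incoming = [{"id": 1, "name": "Ada Lovelace"}]

# Without a check, the consumer quietly reads None.
print([row.get("full_name") for row in incoming])  # [None]

# With a check, the break is reported where it happens.
try:
    check_rows(incoming, EXPECTED)
except ContractViolation as err:
    print(err)  # row 0: missing columns ['full_name']
```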

Better habit

  • Make data dependencies explicit with contracts.
  • Evolve schemas in compatible ways, or publish a new version.
  • Enforce contracts and track lineage automatically.

The big idea

Treat data like an API. A contract makes the producer accountable and the consumer safe; schema evolution is API versioning for data.

How to study this chapter

Watch one change, renaming full_name to name, and see how a contract, compatibility rules, and expand-and-contract turn a breaking change into a safe one.
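As a preview of that change, here is a minimal sketch of expand-and-contract applied to the full_name to name rename. It works on in-memory rows rather than a real database, and the function names are hypothetical. The assumed sequence is: add the new column alongside the old one and backfill it, move consumers over, then drop the old column once nothing reads it.

```python
def expand(rows):
    """Expand: add `name` next to `full_name`, backfilled from the old column."""
    return [{**row, "name": row["full_name"]} for row in rows]


def contract(rows):
    """Contract: drop `full_name` once no consumer reads it any more."""
    return [{k: v for k, v in row.items() if k != "full_name"} for row in rows]


def old_consumer(rows):
    return [row["full_name"] for row in rows]  # not yet migrated


def new_consumer(rows):
    return [row["name"] for row in rows]  # migrated to the new column


rows = [
    {"id": 1, "full_name": "Ada Lovelace"},
    {"id": 2, "full_name": "Alan Turing"},
]

# During the expand phase, both old and new consumers keep working.
expanded = expand(rows)
assert old_consumer(expanded) == new_consumer(expanded)

# After every consumer has migrated, the old column can be removed.
contracted = contract(expanded)
print(new_consumer(contracted))  # ['Ada Lovelace', 'Alan Turing']
print(sorted(contracted[0]))     # ['id', 'name']
```

At no point does a consumer see a schema it cannot read, which is what separates this from a direct rename.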

Practice prompts

  • Name a downstream consumer that depends on a table you produce.
  • Explain why "the table exists" is not the same as "the table is stable".

Remember this

Data contracts make producer-consumer dependencies explicit, and schema evolution lets data change without breaking consumers. Treat data like a versioned API.