Orientation
What You'll Master Here
Every model, mart, feature, and data product in this course depends on upstream data behaving as expected. When an upstream team renames a column or changes what a field means, downstream consumers can break, often silently. This chapter is about making those dependencies explicit and safe to change: data contracts, schema evolution, and governance.
You will learn what a data contract really is (far more than a schema), the producer-consumer relationship it formalizes, how to evolve schemas without breaking consumers (backward/forward/full compatibility), the expand-and-contract pattern for safe migrations, and how contracts are enforced with lineage and quality governance.
This is shown with concrete contract specs and schema diffs, because the difference between a breaking change and a safe one is precise, and getting it wrong is how a "small" upstream edit takes down a dozen dashboards at once.
Why it matters
Undeclared data dependencies are the silent killer of data platforms: an upstream change breaks downstream with no warning. Contracts and disciplined evolution turn fragile implicit dependencies into explicit, safe ones.
Core mental model
A data contract is an API for data: a producer’s explicit promise to consumers. Schema evolution is how you change that API without breaking callers.
- data contract: A producer’s explicit promise about schema, semantics, quality, and ownership.
- schema evolution: Changing a schema over time without breaking existing consumers.
- compatibility: Whether a schema change still works for old/new readers (backward/forward/full).
- lineage: The traced flow of data from sources through transformations to outputs.
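To make the compatibility idea concrete, here is a minimal sketch of a schema diff. The exact meaning of backward and forward varies between schema systems, so this sketch asks only the question an existing consumer cares about: can a reader written against the old schema still read the new one? The `Column` class, the `breaking_changes` function, and the `customer_id` column are hypothetical names invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    name: str
    dtype: str
    nullable: bool = False


def breaking_changes(old: list[Column], new: list[Column]) -> list[str]:
    """List the changes that would break a reader written against `old`."""
    new_by_name = {c.name: c for c in new}
    problems = []
    for col in old:
        current = new_by_name.get(col.name)
        if current is None:
            # A rename looks exactly like a removal to an existing reader.
            problems.append(f"column removed or renamed: {col.name}")
        elif current.dtype != col.dtype:
            problems.append(f"type changed: {col.name} {col.dtype} -> {current.dtype}")
        elif current.nullable and not col.nullable:
            problems.append(f"column became nullable: {col.name}")
    return problems


v1 = [Column("customer_id", "int"), Column("full_name", "string")]
v2_rename = [Column("customer_id", "int"), Column("name", "string")]
v2_add = v1 + [Column("name", "string", nullable=True)]

print(breaking_changes(v1, v2_rename))  # ['column removed or renamed: full_name']
print(breaking_changes(v1, v2_add))     # []
```

Adding a new nullable column leaves every old column in place, so existing readers are unaffected. The in-place rename is reported as breaking.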
Common mistake
Treating upstream tables as stable just because they exist.
A silent upstream rename or type change breaks downstream models with no error at the boundary.
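The sketch below illustrates what "no error at the boundary" can look like in practice. The rows, the `customer_id` column, and both functions are made up for illustration. A lenient consumer keeps running on renamed data and produces wrong output, while a consumer that checks its required columns fails loudly at the point where the data arrives.

```python
# Hypothetical upstream rows before and after the producer renames full_name to name.
rows_before = [{"customer_id": 1, "full_name": "Ada Lovelace"}]
rows_after = [{"customer_id": 1, "name": "Ada Lovelace"}]


def greeting_lenient(row):
    # .get() hides the missing column: the job succeeds with wrong output.
    return f"Hello, {row.get('full_name')}"


def greeting_strict(row, required=("customer_id", "full_name")):
    # Check the expected columns at the boundary and fail loudly if any is absent.
    missing = [c for c in required if c not in row]
    if missing:
        raise KeyError(f"upstream contract violated, missing columns: {missing}")
    return f"Hello, {row['full_name']}"


print(greeting_lenient(rows_before[0]))  # Hello, Ada Lovelace
print(greeting_lenient(rows_after[0]))   # Hello, None
```

Calling `greeting_strict(rows_after[0])` raises a `KeyError` instead of quietly emitting `Hello, None`.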
Better habit
- Make data dependencies explicit with contracts.
- Evolve schemas in compatible ways, or publish a new version.
- Enforce contracts and track lineage automatically.
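As a rough sketch of the first and third habits, a contract can be written down as data and checked automatically against each batch. Everything here is an assumption made for illustration: the `CONTRACT` layout, the `team-crm` owner, the column meanings, and the `validate` function do not follow any particular contract standard or tool.

```python
# Hypothetical contract covering schema, semantics, quality, and ownership.
CONTRACT = {
    "dataset": "customers",
    "owner": "team-crm",
    "columns": {
        "customer_id": {"type": int, "nullable": False, "meaning": "stable customer key"},
        "full_name": {"type": str, "nullable": False, "meaning": "display name as entered"},
    },
    "quality": {"unique": ["customer_id"]},
}


def validate(rows, contract):
    """Return a list of contract violations found in a batch of rows."""
    errors = []
    for i, row in enumerate(rows):
        for name, spec in contract["columns"].items():
            if name not in row:
                errors.append(f"row {i}: missing column {name}")
                continue
            value = row[name]
            if value is None:
                if not spec["nullable"]:
                    errors.append(f"row {i}: {name} is null")
            elif not isinstance(value, spec["type"]):
                errors.append(f"row {i}: {name} has type {type(value).__name__}")
    # Batch-level quality rule: listed columns must not contain duplicates.
    for key in contract["quality"]["unique"]:
        values = [row.get(key) for row in rows]
        if len(values) != len(set(values)):
            errors.append(f"duplicate values in {key}")
    return errors


good = [{"customer_id": 1, "full_name": "Ada Lovelace"}]
bad = [{"customer_id": 1, "name": "Ada Lovelace"}, {"customer_id": 1, "full_name": None}]

print(validate(good, CONTRACT))  # []
for problem in validate(bad, CONTRACT):
    print(problem)
```

A producer could run a check like this before publishing a batch, so that violations surface on the producing side instead of in a consumer's dashboard.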
Treat data like an API. A contract makes the producer accountable and the consumer safe; schema evolution is API versioning for data.
Watch one change, renaming full_name to name, and see how a contract, compatibility rules, and expand-and-contract turn a breaking change into a safe one.
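The phases of that rename can be sketched as follows. This is a toy model, not a migration tool: `produce`, `old_consumer`, `new_consumer`, the phase names, and the `customer_id` column are hypothetical. The point is that during the expand phase both columns are served, so consumers can move from `full_name` to `name` at their own pace before the old column is dropped.

```python
def produce(record, phase):
    """Shape one output row for the given migration phase."""
    row = {"customer_id": record["customer_id"]}
    if phase in ("before", "expand"):
        row["full_name"] = record["name"]  # old column, still served
    if phase in ("expand", "contract"):
        row["name"] = record["name"]       # new column
    return row


def old_consumer(row):
    return row["full_name"]


def new_consumer(row):
    return row["name"]


record = {"customer_id": 1, "name": "Ada Lovelace"}

before = produce(record, "before")      # only full_name
expanded = produce(record, "expand")    # both columns: every consumer works
contracted = produce(record, "contract")  # only name: safe once consumers have migrated

print(sorted(before))      # ['customer_id', 'full_name']
print(sorted(expanded))    # ['customer_id', 'full_name', 'name']
print(sorted(contracted))  # ['customer_id', 'name']
```

Skipping straight from `before` to `contract` is the breaking version of the same change: `old_consumer(contracted)` raises a `KeyError`.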
Practice prompts
- Name a downstream consumer that depends on a table you produce.
- Explain why "the table exists" is not the same as "the table is stable".
Remember this
Data contracts make producer-consumer dependencies explicit, and schema evolution lets data change without breaking consumers. Treat data like a versioned API.
