Orientation
What You'll Master Here
Normalization is not an exam ritual. It puts each independently changing fact in one authoritative place so writes cannot quietly contradict each other.
We use one running commerce dataset throughout: customers, orders, order items, products, categories, and captured sale prices.
You will see why a wide order-line table breaks, how dependencies reveal the right tables, and when an owned read model earns deliberate duplication.
Why it matters
Duplicated facts eventually drift. A schema should protect truth instead of asking every writer to remember every copy.
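A minimal sketch of how that drift happens, using Python's built-in sqlite3 module. The table name order_lines_wide, its columns, and the sample rows are assumptions made for illustration. They are not part of the lesson's dataset definition.

```python
import sqlite3


def build_wide_table():
    """Create a hypothetical wide order-line table that repeats product_name."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE order_lines_wide (
            order_id INTEGER,
            product_id INTEGER,
            product_name TEXT,
            quantity INTEGER
        )
        """
    )
    conn.executemany(
        "INSERT INTO order_lines_wide VALUES (?, ?, ?, ?)",
        [
            (1, 10, "Blue Mug", 2),
            (2, 10, "Blue Mug", 1),
            (3, 11, "Tea Tin", 4),
        ],
    )
    return conn


def conflicting_names(conn):
    """Return product_ids that carry more than one product_name."""
    rows = conn.execute(
        """
        SELECT product_id
        FROM order_lines_wide
        GROUP BY product_id
        HAVING COUNT(DISTINCT product_name) > 1
        ORDER BY product_id
        """
    ).fetchall()
    return [row[0] for row in rows]


conn = build_wide_table()
# A writer renames the product but only touches one of its copies.
conn.execute(
    "UPDATE order_lines_wide SET product_name = 'Navy Mug' WHERE order_id = 1"
)
drifted = conflicting_names(conn)  # product 10 now has two names
```

Nothing in the wide table rejects the partial update. The contradiction only surfaces if someone thinks to query for it.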
Core mental model
One independently changeable fact, one source of truth. Denormalize only an owned read path with refresh and staleness rules.
| form | protects against | commerce example |
|---|---|---|
| 1NF | repeating groups | one order item per row, not item_1/item_2 columns |
| 2NF | partial dependency | in order_items keyed by (order_id, product_id), product_name depends on product_id alone, so it belongs in products |
| 3NF | transitive dependency | category_name depends on product_id only through category_id, so it belongs in categories |
| BCNF | non-key determinant | every determinant is a superkey, so it identifies a whole row |
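
One possible normalized layout for the commerce dataset, sketched with Python's sqlite3 module. All table names, column names, and sample values here are illustrative assumptions. In particular, sale_price_cents is a guess at how a captured sale price might be stored on each order item.

```python
import sqlite3

# Hypothetical schema: each independently changing fact lives in one table.
SCHEMA = """
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    customer_name TEXT NOT NULL
);
CREATE TABLE categories (
    category_id INTEGER PRIMARY KEY,
    category_name TEXT NOT NULL
);
CREATE TABLE products (
    product_id INTEGER PRIMARY KEY,
    product_name TEXT NOT NULL,
    list_price_cents INTEGER NOT NULL,
    category_id INTEGER NOT NULL REFERENCES categories(category_id)
);
CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(customer_id)
);
CREATE TABLE order_items (
    order_id INTEGER NOT NULL REFERENCES orders(order_id),
    product_id INTEGER NOT NULL REFERENCES products(product_id),
    quantity INTEGER NOT NULL,
    sale_price_cents INTEGER NOT NULL,  -- price captured at sale time
    PRIMARY KEY (order_id, product_id)
);
"""


def make_db():
    """Build the schema and load a few sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO customers VALUES (1, 'Ada')")
    conn.execute("INSERT INTO categories VALUES (1, 'Kitchen')")
    conn.execute("INSERT INTO products VALUES (10, 'Blue Mug', 900, 1)")
    conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 1), (2, 1)])
    conn.executemany(
        "INSERT INTO order_items VALUES (?, ?, ?, ?)",
        [(1, 10, 2, 900), (2, 10, 1, 900)],
    )
    return conn


def order_lines(conn):
    """Join the tables back into the wide shape a reader expects."""
    return conn.execute(
        """
        SELECT oi.order_id, p.product_name, c.category_name,
               oi.quantity, oi.sale_price_cents
        FROM order_items AS oi
        JOIN products AS p ON p.product_id = oi.product_id
        JOIN categories AS c ON c.category_id = p.category_id
        ORDER BY oi.order_id
        """
    ).fetchall()


conn = make_db()
# One UPDATE on one row renames the product and changes its list price.
conn.execute(
    "UPDATE products SET product_name = 'Navy Mug', list_price_cents = 1100 "
    "WHERE product_id = 10"
)
lines = order_lines(conn)
```

After the update, every order line shows the new name because there is only one copy of it. The captured sale price stays at 900 because it is a separate fact about the sale, not about the product.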
- functional dependency: X -> Y means each value of X determines exactly one value of Y in the modeled business domain.
- determinant: The left side of a dependency, such as product_id in product_id -> product_name.
- anomaly: An update, insert, or delete problem caused by storing independent facts together.
Common mistake
Splitting tables mechanically without checking the workload.
The result is a pile of unnecessary joins rather than a clear source model.
Better habit
- Write dependency arrows before moving columns.
- Test update, insert, and delete scenarios.
- Name the owner of every duplicated field.
I normalize independently changing facts, then denormalize only a measured read path with an explicit owner and refresh rule.
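A sketch of what an explicit owner and refresh rule could look like for one duplicated field, again using sqlite3. The order_report table, the full-rebuild refresh, and the staleness query are assumptions chosen for brevity. A real read model might refresh incrementally or on a schedule instead.

```python
import sqlite3


def make_db():
    """Source tables plus a hypothetical read model with a copied field."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (
            customer_id INTEGER PRIMARY KEY,
            customer_name TEXT NOT NULL
        );
        CREATE TABLE orders (
            order_id INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customers(customer_id)
        );
        -- Read model: customer_name is a copy owned by customers.customer_name.
        CREATE TABLE order_report (
            order_id INTEGER PRIMARY KEY,
            customer_name TEXT NOT NULL
        );
        """
    )
    conn.execute("INSERT INTO customers VALUES (1, 'Ada')")
    conn.execute("INSERT INTO orders VALUES (1, 1)")
    return conn


def refresh_order_report(conn):
    """Refresh rule: rebuild the read model from the source tables."""
    with conn:  # commit on success, roll back on error
        conn.execute("DELETE FROM order_report")
        conn.execute(
            """
            INSERT INTO order_report (order_id, customer_name)
            SELECT o.order_id, c.customer_name
            FROM orders AS o
            JOIN customers AS c ON c.customer_id = o.customer_id
            """
        )


def stale_rows(conn):
    """Staleness check: count report rows that disagree with the owner."""
    return conn.execute(
        """
        SELECT COUNT(*)
        FROM order_report AS r
        JOIN orders AS o ON o.order_id = r.order_id
        JOIN customers AS c ON c.customer_id = o.customer_id
        WHERE r.customer_name <> c.customer_name
        """
    ).fetchone()[0]


conn = make_db()
refresh_order_report(conn)
# The owner changes; the copy is now stale until the next refresh.
conn.execute("UPDATE customers SET customer_name = 'Ada L.' WHERE customer_id = 1")
stale_before = stale_rows(conn)
refresh_order_report(conn)
stale_after = stale_rows(conn)
```

Writers only ever update customers. The report table is rebuilt from the source, so its copy can be stale between refreshes but cannot become a second source of truth.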
Practice prompts
- List three independent facts in an order system.
- Name the owner of customer_name in a reporting table.
Remember this
Normalization protects truth; denormalization serves proven reads.
