Database normalization vs denormalization

Normalization structures data into tables with no (or minimal) redundancy; each fact stored once. Denormalization deliberately duplicates or embeds data to speed up reads and simplify queries at the cost of redundancy and update complexity.

Normalized: one place per fact

erDiagram USERS ||--o{ ORDERS : places USERS { int id string name string email } ORDERS { int id int user_id date created } ORDERS ||--o{ ORDER_ITEMS : contains ORDER_ITEMS { int id int order_id int product_id int qty } PRODUCTS { int id string name decimal price } ORDER_ITEMS }o--|| PRODUCTS : references

Denormalized: duplicate for read speed

flowchart LR subgraph Norm["Normalized"] U[users] O[orders] P[products] U --> O --> P end subgraph Denorm["Denormalized (e.g. read model)"] R[orders_with_user_name_and_product_name] end

Normalization	Denormalization
No redundant data; single source of truth	Redundant data; faster reads, fewer joins
Updates in one place	Updates must propagate to all copies
More joins for complex reads	One table or document per read pattern
Good for consistency and write path	Good for read-heavy, analytics, caches

Use normalized schema for the canonical write path. Use denormalization in read replicas, materialized views, or document/NoSQL read models when you need to optimize for specific queries and can keep duplicates in sync (e.g. via events or ETL).