A primary (also called master) accepts all writes and replicates them to one or more read replicas. Serving reads from replicas scales read capacity, but every write still goes to the primary. Scaling writes usually means sharding the write path: running multiple primaries, each owning a shard of the data, because a single primary can sustain only a limited write throughput.
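The read/write split described above can be sketched as a small router. This is a minimal illustration, not a production proxy: `ReadWriteRouter` and the node names are hypothetical, nodes are represented by plain strings, and the check for "is this a read?" is deliberately naive.

```python
import itertools


class ReadWriteRouter:
    """Send writes to the primary and spread reads across replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self.replicas = list(replicas)
        # Round-robin over replicas; fall back to the primary if there are none.
        self._next_replica = itertools.cycle(self.replicas or [primary])

    def route(self, sql):
        # Naive classification (assumption): a statement whose first word
        # is SELECT is a read; everything else is treated as a write.
        words = sql.split()
        is_read = bool(words) and words[0].upper() == "SELECT"
        return next(self._next_replica) if is_read else self.primary


router = ReadWriteRouter("primary", ["replica-1", "replica-2"])
print(router.route("SELECT * FROM users"))          # a replica
print(router.route("INSERT INTO users VALUES (1)")) # the primary
```

A real router would need more care than a first-word check. For example, `SELECT ... FOR UPDATE` takes locks and belongs on the primary.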
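With asynchronous replication, replicas are eventually consistent: a write becomes visible on a replica only after a delay (replication lag), which is usually short but can grow under load. Read-after-write consistency therefore requires either reading from the primary or waiting until the replica has caught up to that write.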
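One way to get read-after-write consistency is to remember the log position of a session's last write and read from a replica only once it has applied at least that position. The sketch below simulates this in memory; `Node`, `Session`, `replicate`, and the integer `applied_position` counter are all hypothetical stand-ins for whatever position or timestamp a real database exposes.

```python
class Node:
    """Hypothetical stand-in for a database node."""

    def __init__(self, name):
        self.name = name
        self.applied_position = 0  # last log position applied on this node
        self.data = {}


class Session:
    """Tracks the position of this client's most recent write."""

    def __init__(self, primary, replica):
        self.primary = primary
        self.replica = replica
        self.last_write_position = 0

    def write(self, key, value):
        # Writes always go to the primary and advance its log position.
        self.primary.applied_position += 1
        self.primary.data[key] = value
        self.last_write_position = self.primary.applied_position

    def read(self, key):
        # Use the replica only if it has caught up to our last write;
        # otherwise fall back to the primary.
        caught_up = self.replica.applied_position >= self.last_write_position
        node = self.replica if caught_up else self.primary
        return node.name, node.data.get(key)


def replicate(primary, replica):
    """Simulate the replica catching up with the primary."""
    replica.data = dict(primary.data)
    replica.applied_position = primary.applied_position


primary, replica = Node("primary"), Node("replica")
session = Session(primary, replica)
session.write("name", "Ada")
print(session.read("name"))  # served by the primary while the replica lags
replicate(primary, replica)
print(session.read("name"))  # served by the replica once it has caught up
```

Falling back to the primary keeps reads fast but shifts load onto it. The alternative mentioned above, waiting for the replica, trades latency for keeping that load off the primary.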
| Goal | Approach |
|---|---|
| Scale reads | Add read replicas; route SELECT to replicas |
| Scale writes | Shard (multiple primaries); each shard takes a subset of writes |
| High availability | Failover: promote a replica to primary if the primary fails |
Use read replicas when read load is high and slight staleness is acceptable. Use sharding (multiple primaries) when write load exceeds one primary’s capacity.
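Sharding and failover can be sketched together: each shard has its own primary and replicas, keys are mapped to shards with a stable hash, and a failed primary is replaced by promoting a replica. The `Shard` class, `shard_for` function, and node names below are hypothetical, and hash-modulo placement is just one simple scheme chosen for illustration.

```python
import hashlib


class Shard:
    """One shard: a primary plus replicas that can be promoted."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self.replicas = list(replicas)

    def fail_over(self):
        # Promote the first replica (assumption for simplicity); a real
        # system would prefer the replica with the least replication lag.
        if not self.replicas:
            raise RuntimeError("no replica available to promote")
        self.primary = self.replicas.pop(0)
        return self.primary


def shard_for(key, shards):
    # Use a stable hash so every process maps a key to the same shard
    # (Python's built-in hash() is salted per process for strings).
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return shards[int.from_bytes(digest[:8], "big") % len(shards)]


shards = [Shard("p0", ["r0a", "r0b"]), Shard("p1", ["r1a", "r1b"])]
shard = shard_for("user:42", shards)
print(shard.primary)      # the primary that owns writes for this key
print(shard.fail_over())  # the replica promoted after the primary fails
```

With modulo placement, changing the number of shards remaps most keys, so this scheme suits a fixed shard count.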