Read replicas & write scaling

A primary (master) accepts writes and replicates them to read replicas. Reads can be served from replicas to scale read capacity; writes go to the primary. Write scaling usually means sharding the write path (multiple primaries, each owning a shard) since a single primary has a write limit.

Primary + read replicas

flowchart TB W[Writes] --> P[(Primary)] P --> |Replication| R1[(Replica 1)] P --> |Replication| R2[(Replica 2)] R1[Replica 1] --> Read1[Reads] R2[Replica 2] --> Read2[Reads]

Replication lag

Replicas are eventually consistent: there is a short delay (lag) between write on primary and visibility on replica. Read-after-write consistency requires reading from primary or waiting for replica to catch up.

sequenceDiagram participant C as Client participant P as Primary participant R as Replica C->>P: Write P-->>C: OK C->>R: Read (may not see write yet - lag) Note over R: Replication lag: ms to seconds

Goal	Approach
Scale reads	Add read replicas; route SELECT to replicas
Scale writes	Shard (multiple primaries); each shard takes a subset of writes
High availability	Failover: promote replica to primary if primary fails

Use read replicas when read load is high and slight staleness is acceptable. Use sharding (multiple primaries) when write load exceeds one primary’s capacity.