A load balancer distributes incoming requests across multiple servers (replicas) so that no single node is overloaded. The balancing algorithm decides which backend receives each request. Common choices are round-robin, least connections, weighted distribution, and hash-based routing (e.g. by client IP or session ID) for stickiness.

| Algorithm | How it works | Use case |
|---|---|---|
| Round-robin | Rotate: S1, S2, S3, S1, S2, … | Stateless; even distribution |
| Least connections | Send to the server with the fewest active connections | Long-lived or variable-cost requests |
| Hash (e.g. IP or cookie) | Same key → same server | Sticky sessions; per-server local cache |
| Weighted | Assign more traffic to stronger nodes | Mixed hardware or capacity |
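
The first, second, and fourth rows of the table can be sketched in a few lines of Python. This is a minimal illustration, not a production balancer: the class and function names (`RoundRobin`, `LeastConnections`, `weighted_pick`) are hypothetical, servers are plain strings, and connection tracking is assumed to be done by the caller through `acquire`/`release`.

```python
import itertools
import random


class RoundRobin:
    """Rotate through the servers in a fixed order: S1, S2, S3, S1, ..."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)


class LeastConnections:
    """Pick the server with the fewest active connections (hypothetical API)."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def acquire(self):
        # Ties are broken by insertion order of the servers.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # The caller is assumed to call this when the connection closes.
        self.active[server] -= 1


def weighted_pick(weights, rng=random):
    """Randomly pick a server with probability proportional to its weight."""
    servers = list(weights)
    return rng.choices(servers, weights=[weights[s] for s in servers], k=1)[0]
```

The weighted variant here is randomized, so traffic only matches the weights on average; a deterministic weighted round-robin is another common option.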
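
Use round-robin when backends are stateless and have equal capacity. Use least connections when request cost or duration varies. Use weighted distribution when backends differ in capacity. Use hashing when the same client or session must reach the same server; with a plain modulo hash, adding or removing a node remaps most keys, so consider consistent hashing if the backend set changes.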
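
To make the consistent-hashing suggestion concrete, here is a rough sketch of a hash ring with virtual nodes. Everything in it is an assumption made for illustration: the `HashRing` class, the `vnodes` parameter, and the choice of SHA-256 truncated to 64 bits are not taken from any particular load balancer.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    # Stable 64-bit hash; Python's built-in hash() is randomized per process.
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    """Consistent-hash ring: a key maps to the next server point clockwise."""

    def __init__(self, servers, vnodes=100):
        self._vnodes = vnodes  # virtual nodes per server, to smooth the spread
        self._points = []      # sorted ring positions
        self._owner = {}       # ring position -> server
        for server in servers:
            self.add(server)

    def add(self, server):
        for i in range(self._vnodes):
            point = _hash(f"{server}#{i}")
            self._owner[point] = server
            bisect.insort(self._points, point)

    def remove(self, server):
        self._points = [p for p in self._points if self._owner[p] != server]
        self._owner = {p: s for p, s in self._owner.items() if s != server}

    def pick(self, key):
        if not self._points:
            raise LookupError("ring is empty")
        # First ring position after the key's hash, wrapping around at the end.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]
```

With this layout, removing a server only reassigns the keys that server owned; keys on the remaining servers keep their mapping, which is the property that makes it attractive for sticky sessions and local caches.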