What is a load balancer?
One server can only handle so much. When traffic grows past what a single machine can serve — or when you simply cannot afford for the site to go down if that machine fails — you run several copies and put a load balancer in front of them. This guide explains what a load balancer is, how it decides where each request goes, and how it relates to the reverse proxy you may already know.
The core idea: share the work
A load balancer is a component that receives incoming requests and distributes them across a pool of backend servers — usually identical copies of the same app. To the visitor there is one address; behind it, the load balancer quietly spreads the traffic so that no single server is overwhelmed. It solves two problems at once: scale (handle more traffic than one machine could) and availability (if one server dies, the others keep serving).
That second point is the one people underrate. A load balancer probes each backend at regular intervals (a health check) and stops sending traffic to any that fail to respond; once a backend passes its checks again, it is put back into rotation. A single bad server no longer takes the whole site down: its share of requests is simply routed to the healthy ones.
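To make the health-check idea concrete, here is a minimal sketch in Python. It is an illustration, not how any particular product does it: the class name `HealthCheckedPool`, the `probe` callback, the backend addresses and the `fail_threshold` of two consecutive failures are all assumptions made up for this example. A real balancer would run the probe on a timer and usually make an HTTP or TCP request.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class HealthCheckedPool:
    """Tracks which backends are currently considered healthy."""
    backends: List[str]
    probe: Callable[[str], bool]   # hypothetical: True if the backend answered
    fail_threshold: int = 2        # assumed: consecutive failures before removal
    _failures: Dict[str, int] = field(default_factory=dict)

    def run_checks(self) -> None:
        """Probe every backend once; a real balancer repeats this on a timer."""
        for backend in self.backends:
            if self.probe(backend):
                self._failures[backend] = 0  # a success resets the counter
            else:
                self._failures[backend] = self._failures.get(backend, 0) + 1

    def healthy(self) -> List[str]:
        """Backends that have not reached the failure threshold."""
        return [b for b in self.backends
                if self._failures.get(b, 0) < self.fail_threshold]


# Simulate one backend going down, using a fake probe instead of the network.
down = {"app-2:8080"}
pool = HealthCheckedPool(
    backends=["app-1:8080", "app-2:8080", "app-3:8080"],
    probe=lambda backend: backend not in down,
)
pool.run_checks()
pool.run_checks()
print(pool.healthy())  # ['app-1:8080', 'app-3:8080']
```

Requiring more than one failed probe before removing a backend is a common design choice, since a single dropped packet should not be enough to take a server out of rotation.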
How it decides: balancing algorithms
The load balancer needs a rule for picking which server handles the next request. The common ones are:
- Round-robin — hand each new request to the next server in turn. Simple and even when servers are similar.
- Least connections — send the request to whichever server currently has the fewest active connections. Better when requests vary in cost.
- IP hash — pick the server based on the client's IP, so the same visitor keeps landing on the same backend (useful for session stickiness).
- Weighted — give bigger servers a larger share of the traffic.
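The four rules above are small enough to sketch in a few lines of Python each. Treat this as a teaching sketch under simplifying assumptions: the function names and server names are invented for the example, connection counts are passed in as a plain dictionary, and the weighted version simply repeats each server according to its weight, whereas production balancers typically interleave the picks more smoothly.

```python
import hashlib
import itertools
from typing import Dict, Iterator, List


def round_robin(servers: List[str]) -> Iterator[str]:
    """Yield servers in order, wrapping around forever."""
    return itertools.cycle(servers)


def least_connections(active: Dict[str, int]) -> str:
    """Pick the server with the fewest active connections (first one wins ties)."""
    return min(active, key=active.get)


def ip_hash(client_ip: str, servers: List[str]) -> str:
    """Map a client IP to a server; the same IP and pool give the same server."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]


def weighted_round_robin(weights: Dict[str, int]) -> Iterator[str]:
    """Naive weighting: each server appears in the cycle `weight` times."""
    expanded = [server for server, w in weights.items() for _ in range(w)]
    return itertools.cycle(expanded)


servers = ["a", "b", "c"]

rr = round_robin(servers)
first_four = [next(rr) for _ in range(4)]
print(first_four)  # ['a', 'b', 'c', 'a']

print(least_connections({"a": 5, "b": 2, "c": 7}))  # b

# Stickiness: repeated lookups for one client land on the same backend.
print(ip_hash("203.0.113.7", servers) == ip_hash("203.0.113.7", servers))  # True

wrr = weighted_round_robin({"big": 3, "small": 1})
picks = [next(wrr) for _ in range(8)]
print(picks.count("big"), picks.count("small"))  # 6 2
```

One caveat worth knowing about the IP hash sketch: because the index depends on `len(servers)`, adding or removing a backend reshuffles most clients onto different servers, so stickiness only holds while the pool is stable.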
Layer 4 vs Layer 7
Load balancers work at one of two levels, named after layers of the OSI networking model. A Layer 4 (transport) balancer routes by IP address and port without looking inside the traffic, which makes it fast and protocol-agnostic. A Layer 7 (application) balancer understands HTTP, so it can route by URL path, hostname or headers, terminate TLS, and make smarter decisions. Layer 7 is the common choice for web apps; Layer 4 wins when raw speed matters more than content-aware routing, or when the traffic is not HTTP at all.
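The difference comes down to what information the balancer is allowed to look at when it decides. The Python sketch below contrasts the two; the types `Connection` and `HttpRequest`, the functions `pick_l4` and `pick_l7_pool`, and the pool and host names are all hypothetical, chosen only to show the shape of each decision.

```python
import zlib
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Connection:
    """What a Layer 4 balancer sees: addresses and ports, nothing more."""
    client_ip: str
    client_port: int
    dest_port: int


@dataclass(frozen=True)
class HttpRequest:
    """What a Layer 7 balancer can additionally see after parsing HTTP."""
    host: str
    path: str


def pick_l4(conn: Connection, pools: Dict[int, List[str]]) -> str:
    """Choose a backend from the pool for this port, hashing the client address."""
    pool = pools[conn.dest_port]
    key = f"{conn.client_ip}:{conn.client_port}".encode()
    return pool[zlib.crc32(key) % len(pool)]


def pick_l7_pool(req: HttpRequest,
                 routes: List[Tuple[str, str, str]],
                 default: str) -> str:
    """Return the name of the first pool whose host and path prefix match."""
    for host, prefix, pool_name in routes:
        if req.host == host and req.path.startswith(prefix):
            return pool_name
    return default


routes = [
    ("example.com", "/api/", "api-pool"),
    ("example.com", "/static/", "static-pool"),
]
print(pick_l7_pool(HttpRequest("example.com", "/api/users"), routes, "web-pool"))  # api-pool
print(pick_l7_pool(HttpRequest("example.com", "/about"), routes, "web-pool"))      # web-pool

l4_pools = {443: ["app-1", "app-2"]}
backend = pick_l4(Connection("203.0.113.7", 51000, 443), l4_pools)
print(backend in l4_pools[443])  # True
```

Note that the Layer 7 routes are checked in order and the first match wins, so more specific prefixes belong earlier in the list.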
Load balancer vs reverse proxy
This is the question that trips people up. The honest answer: load balancing is one of the jobs a reverse proxy can do. A reverse proxy is the broader idea — a server that fronts your backends and can route, cache, terminate TLS and distribute load. A "load balancer" is that same front door focused on the distribution job. In practice the tools overlap: Nginx, HAProxy and Caddy all act as load balancers, and a cloud load balancer is a managed version of the same role.
Where load balancers run
You meet them in a few forms. Software load balancers — Nginx, HAProxy, Traefik — run on a server you control. Cloud load balancers (AWS, Google Cloud, Azure) are managed services you configure rather than host. And in container platforms like Kubernetes, load balancing is built in, spreading traffic across pods as they come and go. Whichever you use, you will want more than one backend — typically a few servers running identical copies of the app.
The bottom line
A load balancer spreads incoming requests across several servers so the system can handle more traffic and survive a server failure. It picks a backend using an algorithm like round-robin or least-connections, checks each server's health, and can work at the transport (L4) or application (L7) layer. Think of it as the distribution job of a reverse proxy — the piece that turns one app on one machine into a service that scales and stays up.