L4 vs L7 Routing: Layer 4 and Layer 7 Load Balancing Explained
Understanding the difference between Layer 4 (transport) and Layer 7 (application) routing is fundamental to designing efficient cloud architectures. This guide explains when to use each, their performance characteristics, and implementation details.
OSI Model Quick Refresher
The OSI (Open Systems Interconnection) model divides network communication into seven layers. For load balancing discussions, two layers are critical:
- Layer 4 (Transport): TCP and UDP. Load balancing decisions are based on IP addresses and port numbers.
- Layer 7 (Application): HTTP, HTTPS, gRPC, WebSocket. Load balancing can inspect application-level data like URLs, headers, and cookies.
Each layer offers different capabilities for traffic management, with important trade-offs in performance, flexibility, and complexity.
Layer 4 Load Balancing
How L4 Load Balancing Works
Layer 4 load balancers operate at the TCP/UDP level. They make routing decisions based on:
- Source IP address
- Destination IP address
- Source port
- Destination port
- Protocol (TCP/UDP)
The load balancer doesn't inspect packet contents beyond Layer 4 headers. This makes it extremely efficient: the backend is chosen on the first packet of a connection, and every subsequent packet of that connection follows the same path.
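As a rough illustration of that behavior, the sketch below hashes the 5-tuple on the first packet and remembers the choice in a connection table. The class name, the backend addresses, and the use of SHA-256 are assumptions made for the example; production balancers generally use faster hashes and kernel or hardware data paths.

```python
import hashlib
from typing import Dict, List, Tuple

# (src_ip, src_port, dst_ip, dst_port, protocol)
FiveTuple = Tuple[str, int, str, int, str]


class L4Balancer:
    """Toy L4 balancer (hypothetical): picks a backend from the 5-tuple only."""

    def __init__(self, backends: List[str]) -> None:
        self.backends = backends
        # Connection table: flow -> backend chosen on the first packet.
        self.flows: Dict[FiveTuple, str] = {}

    def route(self, flow: FiveTuple) -> str:
        # Later packets of a known flow reuse the earlier decision.
        if flow in self.flows:
            return self.flows[flow]
        # First packet: hash the 5-tuple and map it onto a backend.
        key = "|".join(str(part) for part in flow).encode()
        digest = hashlib.sha256(key).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.backends)
        self.flows[flow] = self.backends[index]
        return self.flows[flow]


lb = L4Balancer(["10.0.1.10", "10.0.1.11", "10.0.1.12"])
flow = ("203.0.113.7", 51514, "198.51.100.1", 443, "tcp")
first = lb.route(flow)   # decision made on the first packet
again = lb.route(flow)   # same flow, same backend
```

Nothing above the transport layer is read at any point, which is why the per-packet cost stays so low.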
L4 Routing Modes
NAT Mode (Network Address Translation)
The load balancer rewrites packet headers to direct traffic to backends:
- Client connects to VIP (Virtual IP) on the load balancer
- Load balancer changes destination IP to backend server
- Response returns through load balancer, which rewrites source back to VIP
Pro: Works with any backend configuration. Con: Load balancer is in the data path for all traffic.
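The three steps above can be sketched as two header rewrites. This is a simplified model, assuming a single pre-selected backend and hypothetical addresses; real NAT also tracks connection state and may rewrite ports.

```python
from dataclasses import dataclass, replace

VIP = "198.51.100.1"    # hypothetical virtual IP on the load balancer
BACKEND = "10.0.1.10"   # hypothetical backend already chosen for this flow


@dataclass(frozen=True)
class Packet:
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


def nat_inbound(pkt: Packet) -> Packet:
    """Client -> VIP becomes client -> backend (destination rewritten)."""
    return replace(pkt, dst_ip=BACKEND)


def nat_outbound(pkt: Packet) -> Packet:
    """Backend -> client becomes VIP -> client (source rewritten back)."""
    return replace(pkt, src_ip=VIP)


request = Packet("203.0.113.7", 51514, VIP, 443)
to_backend = nat_inbound(request)

# The backend answers the client address it saw; the reply passes back
# through the load balancer, which restores the VIP as the source.
reply = Packet(to_backend.dst_ip, to_backend.dst_port,
               to_backend.src_ip, to_backend.src_port)
to_client = nat_outbound(reply)
```

Because both directions go through the rewrite functions, the load balancer sees every byte of the response as well as the request.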
Direct Server Return (DSR)
The load balancer only handles inbound traffic; responses go directly to clients:
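- Load balancer forwards packets to backends without NAT, typically by rewriting only the destination MAC address or by tunneling, so the destination IP remains the VIP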
- Backends must be configured to accept traffic for the VIP
- Response packets go directly to client, bypassing load balancer
Pro: Massive throughput (load balancer doesn't handle response traffic). Con: Complex backend configuration.
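For contrast with NAT mode, here is a minimal sketch of the DSR packet path. It assumes the layer-2 variant (only the destination MAC is changed), which is one common way to implement DSR; the MAC addresses and function names are hypothetical.

```python
from dataclasses import dataclass, replace

VIP = "198.51.100.1"                # hypothetical virtual IP
BACKEND_MAC = "02:00:00:00:00:10"   # hypothetical backend NIC
ROUTER_MAC = "02:00:00:00:00:fe"    # hypothetical next hop toward the client


@dataclass(frozen=True)
class Frame:
    dst_mac: str
    src_ip: str
    dst_ip: str


def lb_forward(frame: Frame) -> Frame:
    """Load balancer: change only the L2 destination; IP header untouched."""
    return replace(frame, dst_mac=BACKEND_MAC)


def backend_reply(frame: Frame) -> Frame:
    """Backend: accept traffic addressed to the VIP and answer directly."""
    if frame.dst_ip != VIP:
        raise ValueError("backend is not configured for this address")
    # The reply is sourced from the VIP and never revisits the load balancer.
    return Frame(dst_mac=ROUTER_MAC, src_ip=VIP, dst_ip=frame.src_ip)


inbound = Frame(dst_mac="02:00:00:00:00:01", src_ip="203.0.113.7", dst_ip=VIP)
at_backend = lb_forward(inbound)
reply = backend_reply(at_backend)
```

The check in `backend_reply` stands in for the backend-side configuration the list mentions: the server has to treat the VIP as one of its own addresses.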
L4 Cloud Implementations
- AWS: Network Load Balancer (NLB)
- GCP: Network Load Balancing (TCP/UDP)
- Azure: Azure Load Balancer
- On-premise: HAProxy (TCP mode), LVS, F5 BIG-IP
Performance Characteristics
L4 load balancers typically offer:
- Millions of connections per second: Simple packet processing enables massive scale
- Sub-millisecond latency: No application parsing overhead
- Line-rate throughput: Often limited only by network bandwidth
Layer 7 Load Balancing
How L7 Load Balancing Works
Layer 7 load balancers terminate the TCP connection from clients and create new connections to backends. They can inspect application-level data:
- HTTP path: Route /api/* to API servers, /static/* to CDN
- HTTP headers: Route based on Host header, Accept-Language, User-Agent
- Cookies: Ensure session affinity (sticky sessions)
- HTTP method: Route GETs to read replicas, POSTs to primary
- Request body: Advanced routing based on payload content
L7 Routing Capabilities
Path-Based Routing
# Route configuration example
/api/v1/* -> api-service-v1
/api/v2/* -> api-service-v2
/static/* -> cdn-origin
/health -> health-service
/* -> frontend-service
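One way to read the table above is as an ordered rule list where the first match wins. The sketch below assumes that semantics and a simple trailing-wildcard syntax; actual load balancers differ in how they order and match rules (some use longest-prefix or regex matching), so treat the matcher as illustrative.

```python
from typing import List, Tuple

# Ordered rules mirroring the table above; first match wins (an assumption).
ROUTES: List[Tuple[str, str]] = [
    ("/api/v1/*", "api-service-v1"),
    ("/api/v2/*", "api-service-v2"),
    ("/static/*", "cdn-origin"),
    ("/health", "health-service"),
    ("/*", "frontend-service"),
]


def matches(pattern: str, path: str) -> bool:
    if pattern.endswith("/*"):
        # "/api/v1/*" matches any path starting with "/api/v1/".
        return path.startswith(pattern[:-1])
    return path == pattern  # exact match, e.g. "/health"


def route_path(path: str) -> str:
    for pattern, backend in ROUTES:
        if matches(pattern, path):
            return backend
    raise LookupError(f"no route for {path!r}")


backend = route_path("/api/v2/orders")
```

The catch-all `/*` rule has to come last; placed first, it would shadow every other entry.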
Header-Based Routing
# Route by Host header (virtual hosting)
api.example.com -> api-backends
www.example.com -> web-backends
admin.example.com -> admin-backends
# Route by custom header (canary deployments)
X-Canary: true -> canary-backends (requests carrying the header, e.g. 10% of traffic)
* -> stable-backends (everything else, e.g. 90% of traffic)
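The two rule sets above could be expressed as small lookup functions. This is a sketch only: the fallback pool for unknown hosts is an assumption, and IPv6 host literals are not handled.

```python
from typing import Dict

HOST_ROUTES: Dict[str, str] = {
    "api.example.com": "api-backends",
    "www.example.com": "web-backends",
    "admin.example.com": "admin-backends",
}


def normalise(headers: Dict[str, str]) -> Dict[str, str]:
    # HTTP header names are case-insensitive, so compare in lowercase.
    return {name.lower(): value for name, value in headers.items()}


def route_by_host(headers: Dict[str, str], default: str = "web-backends") -> str:
    # `default` is a hypothetical fallback for hosts not in the table.
    host = normalise(headers).get("host", "")
    # Drop an optional ":port" suffix and compare case-insensitively.
    hostname = host.split(":", 1)[0].strip().lower()
    return HOST_ROUTES.get(hostname, default)


def route_canary(headers: Dict[str, str]) -> str:
    flag = normalise(headers).get("x-canary", "").strip().lower()
    return "canary-backends" if flag == "true" else "stable-backends"


pool = route_by_host({"Host": "api.example.com"})
```

Note that header matching selects exactly the requests that carry the header; the share of traffic reaching the canary depends on how many clients send it.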
Content Transformation
L7 load balancers can modify requests and responses:
- Add/remove headers (X-Request-ID, security headers)
- URL rewriting
- Response compression
- Rate limiting per endpoint
L7 Cloud Implementations
- AWS: Application Load Balancer (ALB), API Gateway
- GCP: Application Load Balancer (formerly HTTP(S) Load Balancing)
- Azure: Application Gateway, Front Door
- On-premise/Kubernetes: NGINX, Envoy, HAProxy, Traefik
Performance Characteristics
L7 load balancers trade raw performance for routing flexibility:
- Lower connection capacity: TCP termination consumes memory per connection
- Higher latency: often around 1-5 ms of added processing time (TLS termination, header parsing)
- CPU-intensive: TLS handshakes and HTTP parsing require significant CPU
Detailed Comparison
| Aspect | Layer 4 | Layer 7 |
|---|---|---|
| Decision Data | IP, port, and protocol | Full HTTP request |
| Protocol Support | Any TCP/UDP | HTTP, HTTPS, gRPC, WebSocket |
| TLS Termination | Pass-through or terminate | Always terminates (inspects content) |
| Connection Handling | Forwards packets | Proxy (two connections) |
| Latency Added | Microseconds | Milliseconds |
| Throughput | Very high (line rate) | Limited by CPU |
| Session Stickiness | IP-based | Cookie-based (more reliable) |
| Health Checks | TCP connect, port check | HTTP status, response content |
| WebSocket Support | Native (just TCP) | Requires specific support |
| Cost (Cloud) | Lower | Higher (more processing) |
When to Use Layer 4
L4 load balancing is the right choice for:
Non-HTTP Protocols
- Database connections: MySQL, PostgreSQL, Redis
- Message queues: Kafka, RabbitMQ
- Custom TCP protocols: Game servers, IoT
- SMTP, IMAP: Email servers
Maximum Performance
- High-frequency trading: Every microsecond matters
- Massive connection counts: Millions of concurrent connections
- Bandwidth-intensive: Video streaming, large file transfers
TLS Passthrough
When backends must terminate TLS themselves (for mTLS, certificate pinning, or compliance), L4 passthrough preserves end-to-end encryption.
When to Use Layer 7
L7 load balancing is essential for:
Microservices Routing
Route requests to different services based on URL path:
- /users/* to user-service
- /orders/* to order-service
- /search/* to search-service
Canary Deployments
Gradually shift traffic to new versions based on headers or percentage:
# 5% of traffic to new version
backends:
  - name: v1
    weight: 95
  - name: v2
    weight: 5
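A weighted split like the one above amounts to a weighted random choice per request. The sketch below uses a seeded generator so the run is repeatable; the function name and the 10,000-request simulation are illustrative assumptions.

```python
import random
from collections import Counter
from typing import List, Tuple

# Same weights as the configuration above.
BACKENDS: List[Tuple[str, int]] = [("v1", 95), ("v2", 5)]


def pick_backend(rng: random.Random) -> str:
    names = [name for name, _ in BACKENDS]
    weights = [weight for _, weight in BACKENDS]
    # random.choices draws proportionally to the given weights.
    return rng.choices(names, weights=weights, k=1)[0]


rng = random.Random(42)  # fixed seed only to make the example repeatable
counts = Counter(pick_backend(rng) for _ in range(10_000))
share_v2 = counts["v2"] / 10_000  # should land close to 0.05
```

Raising the `v2` weight in steps (5, 25, 50, 100) is the usual way to shift traffic gradually while watching error rates.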
Authentication Offload
Validate JWTs, API keys, or OAuth tokens before traffic reaches backends, reducing backend complexity.
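To make the JWT case concrete, the following standard-library sketch checks an HS256 signature and the `exp` claim at the proxy. It is a teaching aid under stated assumptions (a shared secret named `SECRET`, HS256 only, no audience or issuer checks); a real gateway would rely on a vetted JWT library and managed keys.

```python
import base64
import hashlib
import hmac
import json
from typing import Optional

SECRET = b"demo-secret"  # hypothetical shared key; real keys live in a secret store


def b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


def b64url_decode(text: str) -> bytes:
    # JWT segments omit padding, so restore it before decoding.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign(claims: dict) -> str:
    """Issue a token (normally the identity provider's job, not the proxy's)."""
    header = b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = b64url_encode(json.dumps(claims).encode())
    mac = hmac.new(SECRET, f"{header}.{body}".encode(), hashlib.sha256).digest()
    return f"{header}.{body}.{b64url_encode(mac)}"


def verify(token: str, now: float) -> Optional[dict]:
    """Return the claims if the token is valid and unexpired, else None."""
    try:
        header_b64, body_b64, sig_b64 = token.split(".")
        header = json.loads(b64url_decode(header_b64))
        claims = json.loads(b64url_decode(body_b64))
        signature = b64url_decode(sig_b64)
    except ValueError:
        return None  # wrong segment count, bad base64, or bad JSON
    if not isinstance(header, dict) or header.get("alg") != "HS256":
        return None
    signed_part = f"{header_b64}.{body_b64}".encode()
    expected = hmac.new(SECRET, signed_part, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, signature):
        return None  # signature mismatch: reject before reaching backends
    exp = claims.get("exp") if isinstance(claims, dict) else None
    if not isinstance(exp, (int, float)) or exp <= now:
        return None  # missing or expired
    return claims


token = sign({"sub": "alice", "exp": 2000})
claims = verify(token, now=1000)
```

Requests for which `verify` returns `None` can be rejected with a 401 at the edge, so backends only ever see authenticated traffic.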
Rate Limiting
Apply rate limits per endpoint, user, or API key at the edge.
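A common way to implement such limits is a token bucket per (API key, endpoint) pair. The sketch below assumes that algorithm and takes the current time as an argument so the behavior is easy to follow; the class names and limits are hypothetical.

```python
from typing import Dict, Tuple


class TokenBucket:
    def __init__(self, rate: float, capacity: float, now: float) -> None:
        self.rate = rate          # tokens refilled per second
        self.capacity = capacity  # maximum burst size
        self.tokens = capacity    # start with a full bucket
        self.updated = now

    def allow(self, now: float) -> bool:
        # Refill for the time elapsed since the last request, up to capacity.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class RateLimiter:
    """One bucket per (api_key, endpoint) pair."""

    def __init__(self, rate: float, capacity: float) -> None:
        self.rate = rate
        self.capacity = capacity
        self.buckets: Dict[Tuple[str, str], TokenBucket] = {}

    def allow(self, api_key: str, endpoint: str, now: float) -> bool:
        key = (api_key, endpoint)
        if key not in self.buckets:
            self.buckets[key] = TokenBucket(self.rate, self.capacity, now)
        return self.buckets[key].allow(now)


limiter = RateLimiter(rate=1.0, capacity=2.0)
results = [limiter.allow("key-1", "/api/orders", now=0.0) for _ in range(3)]
```

With a burst capacity of 2, the third request at the same instant is refused; one second later a single token has been refilled.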
Header Manipulation
Add security headers (HSTS, CSP), request IDs, or client identification headers.
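As an illustration, the functions below add a request ID and client address on the way in and security headers on the way out. The specific header values are example assumptions, not recommendations, and header names are assumed to arrive in canonical case.

```python
import uuid
from typing import Dict

# Example values only; real policies depend on the application.
SECURITY_HEADERS: Dict[str, str] = {
    "Strict-Transport-Security": "max-age=31536000; includeSubDomains",
    "Content-Security-Policy": "default-src 'self'",
}


def decorate_request(headers: Dict[str, str], client_ip: str) -> Dict[str, str]:
    out = dict(headers)  # never mutate the caller's mapping
    # Keep an upstream request ID if present, otherwise mint one.
    out.setdefault("X-Request-ID", str(uuid.uuid4()))
    # Append the client address to any existing forwarding chain.
    prior = out.get("X-Forwarded-For")
    out["X-Forwarded-For"] = f"{prior}, {client_ip}" if prior else client_ip
    return out


def decorate_response(headers: Dict[str, str]) -> Dict[str, str]:
    out = dict(headers)
    out.update(SECURITY_HEADERS)
    return out


req = decorate_request({"Host": "www.example.com"}, client_ip="203.0.113.7")
resp = decorate_response({"Content-Type": "text/html"})
```

Doing this once at the load balancer keeps the headers consistent across every backend service.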
Combining L4 and L7
Production architectures often combine both layers:
     Internet
         │
         ▼
┌─────────────────┐
│    L4 (NLB)     │ ← Global entry point, DDoS protection
│     TCP/443     │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│    L7 (ALB)     │ ← HTTP routing, TLS termination
│      HTTPS      │
└────────┬────────┘
         │
    ┌────┴────┐
    ▼         ▼
┌───────┐ ┌───────┐
│ Svc A │ │ Svc B │
└───────┘ └───────┘
This pattern uses:
- L4 for ingress: Handles DDoS at network layer, passes TLS through
- L7 for routing: Path-based routing, authentication, rate limiting
For more on global architectures, see our guide on global load balancing.
Real-World Example: E-Commerce Platform
Consider an e-commerce platform with these requirements:
- Web frontend (HTTP)
- API backend (HTTPS/gRPC)
- Redis cache (TCP 6379)
- PostgreSQL database (TCP 5432)
Architecture
- ALB (L7): Routes /api/* to API service, /* to web frontend
- NLB (L4): Frontend for Redis cluster (client-side routing handled separately)
- NLB (L4): PostgreSQL read replica load balancing
Key Takeaways
- L4: Fast, simple, works with any TCP/UDP protocol
- L7: Flexible routing based on HTTP content, higher overhead
- Use L4 for databases, message queues, and high-performance scenarios
- Use L7 for HTTP routing, authentication, and microservices
- Combine both layers for defense in depth and optimal performance