L4 vs L7 Routing: Layer 4 and Layer 7 Load Balancing Explained

Understanding the difference between Layer 4 (transport) and Layer 7 (application) routing is fundamental to designing efficient cloud architectures. This guide explains when to use each, their performance characteristics, and implementation details.

OSI Model Quick Refresher

The OSI (Open Systems Interconnection) model divides network communication into seven layers. For load balancing discussions, two layers are critical:

Layer 4 (Transport): TCP and UDP. Load balancing decisions are based on IP addresses and port numbers.
Layer 7 (Application): HTTP, HTTPS, gRPC, WebSocket. Load balancing can inspect application-level data like URLs, headers, and cookies.

Each layer offers different capabilities for traffic management, with important trade-offs in performance, flexibility, and complexity.

Layer 4 Load Balancing

How L4 Load Balancing Works

Layer 4 load balancers operate at the TCP/UDP level. They make routing decisions based on:

Source IP address
Destination IP address
Source port
Destination port
Protocol (TCP/UDP)

The load balancer doesn't inspect packet contents beyond the Layer 4 headers. This makes it extremely efficient: the routing decision is made on the first packet of a connection, and subsequent packets of that connection follow the same path.
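
As a rough illustration of that first-packet decision, the sketch below hashes the five fields listed above to pick a backend and remembers the choice in a flow table. The class names, the SHA-256 hash, and the addresses are assumptions made for this example, not how any particular load balancer is implemented.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class FiveTuple:
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    protocol: str  # "tcp" or "udp"


class L4Balancer:
    """Hypothetical L4 balancer: one decision per flow, cached in a flow table."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.flows = {}  # FiveTuple -> chosen backend

    def pick(self, flow: FiveTuple) -> str:
        # Later packets of a known flow reuse the earlier decision.
        if flow in self.flows:
            return self.flows[flow]
        key = f"{flow.src_ip}:{flow.src_port}:{flow.dst_ip}:{flow.dst_port}:{flow.protocol}"
        digest = hashlib.sha256(key.encode()).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.backends)
        self.flows[flow] = self.backends[index]
        return self.flows[flow]
```

Nothing above the transport header is read: the payload never enters the decision.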

L4 Routing Modes

NAT Mode (Network Address Translation)

The load balancer rewrites packet headers to direct traffic to backends:

Client connects to VIP (Virtual IP) on the load balancer
Load balancer changes destination IP to backend server
Response returns through load balancer, which rewrites source back to VIP

Pro: Works with any backend configuration. Con: Load balancer is in the data path for all traffic.
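
The three NAT steps can be sketched with a small connection-tracking table. This is a simplified model, assuming round-robin backend selection and ignoring ports on the backend side; `Packet` and `NatBalancer` are hypothetical names.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


class NatBalancer:
    def __init__(self, vip, backends):
        self.vip = vip
        self.backends = list(backends)
        self.next_index = 0
        self.conntrack = {}  # (client_ip, client_port) -> backend_ip

    def inbound(self, pkt: Packet) -> Packet:
        """Client -> VIP: rewrite the destination IP to the chosen backend."""
        client = (pkt.src_ip, pkt.src_port)
        if client not in self.conntrack:
            self.conntrack[client] = self.backends[self.next_index % len(self.backends)]
            self.next_index += 1
        return replace(pkt, dst_ip=self.conntrack[client])

    def outbound(self, pkt: Packet) -> Packet:
        """Backend -> client: rewrite the source IP back to the VIP."""
        client = (pkt.dst_ip, pkt.dst_port)
        if self.conntrack.get(client) != pkt.src_ip:
            raise ValueError("no tracked connection for this reply")
        return replace(pkt, src_ip=self.vip)
```

Because both directions pass through `inbound` and `outbound`, the balancer sees every byte of the connection, which is the cost noted above.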

Direct Server Return (DSR)

The load balancer only handles inbound traffic; responses go directly to clients:

Load balancer forwards packets to backends without NAT
Backends must be configured to accept traffic for the VIP
Response packets go directly to client, bypassing load balancer

Pro: Massive throughput (load balancer doesn't handle response traffic). Con: Complex backend configuration.
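
A minimal model of DSR, assuming the common layer-2 variant in which the balancer rewrites only the destination MAC address and each backend holds the VIP on a non-advertised interface. The `Frame` type, the MAC addresses, and the function names are illustrative.

```python
from dataclasses import dataclass, replace

VIP = "198.51.100.10"  # example address


@dataclass(frozen=True)
class Frame:
    dst_mac: str
    src_ip: str
    dst_ip: str


def dsr_forward(frame: Frame, backend_mac: str) -> Frame:
    # Only the layer-2 destination changes; the IP header still targets the VIP.
    return replace(frame, dst_mac=backend_mac)


def backend_reply(request: Frame, next_hop_mac: str) -> Frame:
    # The backend accepts traffic for the VIP, so it answers *from* the VIP
    # and sends the reply toward the client without touching the balancer.
    return Frame(dst_mac=next_hop_mac, src_ip=request.dst_ip, dst_ip=request.src_ip)
```

The client sees a reply from the VIP it contacted, even though the balancer never handled the response.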

L4 Cloud Implementations

AWS: Network Load Balancer (NLB)
GCP: Network Load Balancing (TCP/UDP)
Azure: Azure Load Balancer
On-premise: HAProxy (TCP mode), LVS, F5 BIG-IP

Performance Characteristics

L4 load balancers typically offer:

Millions of connections per second: Simple packet processing enables massive scale
Sub-millisecond latency: No application parsing overhead
Line-rate throughput: Often limited only by network bandwidth

Layer 7 Load Balancing

How L7 Load Balancing Works

Layer 7 load balancers terminate the TCP connection from clients and create new connections to backends. They can inspect application-level data:

HTTP path: Route /api/* to API servers, /static/* to CDN
HTTP headers: Route based on Host header, Accept-Language, User-Agent
Cookies: Ensure session affinity (sticky sessions)
HTTP method: Route GETs to read replicas, POSTs to primary
Request body: Advanced routing based on payload content
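
Cookie-based session affinity from the list above might look like the following sketch. The cookie name `lb_backend` and the round-robin fallback are assumptions; real load balancers usually store an opaque or signed value rather than the backend name itself.

```python
import itertools


class StickyBalancer:
    COOKIE = "lb_backend"  # hypothetical cookie name

    def __init__(self, backends):
        self.backends = list(backends)
        self._round_robin = itertools.cycle(self.backends)

    def route(self, cookies: dict):
        """Return (backend, cookies_to_set) for one request."""
        pinned = cookies.get(self.COOKIE)
        if pinned in self.backends:
            return pinned, {}  # returning client: keep the same backend
        backend = next(self._round_robin)
        return backend, {self.COOKIE: backend}  # new or stale client: pin it
```

A cookie pointing at a backend that no longer exists is ignored and the client is pinned again.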

L7 Routing Capabilities

Path-Based Routing

# Route configuration example
/api/v1/*       -> api-service-v1
/api/v2/*       -> api-service-v2  
/static/*       -> cdn-origin
/health         -> health-service
/*              -> frontend-service
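
One way to evaluate a route table like this is longest-prefix matching, sketched below. Treating /health as an exact match and the wildcard routes as prefixes is an assumption about the intended semantics; products differ on how they order and match rules.

```python
EXACT_ROUTES = {"/health": "health-service"}

PREFIX_ROUTES = {
    "/api/v1/": "api-service-v1",
    "/api/v2/": "api-service-v2",
    "/static/": "cdn-origin",
    "/": "frontend-service",  # catch-all
}


def route(path: str) -> str:
    """Pick a backend: exact matches first, then the longest matching prefix."""
    if path in EXACT_ROUTES:
        return EXACT_ROUTES[path]
    matches = [prefix for prefix in PREFIX_ROUTES if path.startswith(prefix)]
    if not matches:
        raise ValueError(f"no route for {path!r}")
    return PREFIX_ROUTES[max(matches, key=len)]
```

With these rules, `route("/api/v2/orders/7")` selects `api-service-v2`, while `route("/checkout")` falls through to `frontend-service`.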

Header-Based Routing

# Route by Host header (virtual hosting)
api.example.com    -> api-backends
www.example.com    -> web-backends
admin.example.com  -> admin-backends

# Route by custom header (canary deployments)
X-Canary: true     -> canary-backends (10% traffic)
*                  -> stable-backends (90% traffic)
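
Both header rules can be expressed as small lookup functions. This is a sketch only: the function names are hypothetical, header names are compared case-insensitively, and the Host parsing assumes a hostname with an optional port (no IPv6 literals).

```python
HOST_ROUTES = {
    "api.example.com": "api-backends",
    "www.example.com": "web-backends",
    "admin.example.com": "admin-backends",
}


def _lower_keys(headers: dict) -> dict:
    # HTTP header names are case-insensitive.
    return {name.lower(): value for name, value in headers.items()}


def route_by_host(headers: dict):
    """Virtual hosting: return the backend pool for the Host header, or None."""
    host = _lower_keys(headers).get("host", "")
    hostname = host.split(":")[0].lower()  # drop an optional ":port"
    return HOST_ROUTES.get(hostname)


def route_canary(headers: dict) -> str:
    """Send requests that carry X-Canary: true to the canary pool."""
    flag = _lower_keys(headers).get("x-canary", "")
    return "canary-backends" if flag.strip().lower() == "true" else "stable-backends"
```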

Content Transformation

L7 load balancers can modify requests and responses:

Add/remove headers (X-Request-ID, security headers)
URL rewriting
Response compression
Rate limiting per endpoint
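
Two of these transformations, header injection and response compression, are sketched below using only the standard library. The function names and the HSTS max-age value are assumptions, and header names are expected in their canonical capitalization to keep the example short.

```python
import gzip
import uuid


def transform_request(headers: dict) -> dict:
    """Add an X-Request-ID unless the client already supplied one."""
    out = dict(headers)
    out.setdefault("X-Request-ID", str(uuid.uuid4()))
    return out


def transform_response(request_headers: dict, response_headers: dict, body: bytes):
    """Add a security header and gzip the body when the client accepts it."""
    headers = dict(response_headers)
    headers["Strict-Transport-Security"] = "max-age=31536000"  # example value
    if "gzip" in request_headers.get("Accept-Encoding", ""):
        body = gzip.compress(body)
        headers["Content-Encoding"] = "gzip"
    headers["Content-Length"] = str(len(body))
    return headers, body
```

Neither step is possible at Layer 4, because both require the balancer to parse and re-emit the HTTP message.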

L7 Cloud Implementations

AWS: Application Load Balancer (ALB), API Gateway
GCP: HTTP(S) Load Balancing (now branded Application Load Balancer)
Azure: Application Gateway, Front Door
On-premise/Kubernetes: NGINX, Envoy, HAProxy, Traefik

Performance Characteristics

L7 load balancers have different performance profiles:

Lower connection capacity: TCP termination consumes memory per connection
Higher latency: 1-5ms typical for processing (TLS termination, header parsing)
CPU-intensive: TLS handshakes and HTTP parsing require significant CPU

Detailed Comparison

Aspect	Layer 4	Layer 7
Decision Data	IP addresses, ports, and protocol	Full HTTP request
Protocol Support	Any TCP/UDP	HTTP, HTTPS, gRPC, WebSocket
TLS Termination	Pass-through or terminate	Terminates TLS (required to inspect HTTPS content)
Connection Handling	Forwards packets	Proxy (two connections)
Latency Added	Microseconds	Milliseconds
Throughput	Very high (line rate)	Limited by CPU
Session Stickiness	IP-based	Cookie-based (more reliable)
Health Checks	TCP connect, port check	HTTP status, response content
WebSocket Support	Native (just TCP)	Requires specific support
Cost (Cloud)	Lower	Higher (more processing)
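
The health-check row of the table can be made concrete with two small probes. This is a sketch under assumed defaults: the `/health` path, the expected `ok` body, and the one-second timeout are illustrative choices.

```python
import http.client
import socket


def tcp_check(host: str, port: int, timeout: float = 1.0) -> bool:
    """L4-style check: healthy if a TCP connection can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def http_check(host: str, port: int, path: str = "/health",
               expect: bytes = b"ok", timeout: float = 1.0) -> bool:
    """L7-style check: healthy only on HTTP 200 with the expected body."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        return response.status == 200 and expect in response.read()
    except (OSError, http.client.HTTPException):
        return False
    finally:
        conn.close()
```

A process that accepts connections but returns errors passes `tcp_check` and fails `http_check`, which is why application-level checks catch more failure modes.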

When to Use Layer 4

L4 load balancing is the right choice for:

Non-HTTP Protocols

Database connections: MySQL, PostgreSQL, Redis
Message queues: Kafka, RabbitMQ
Custom TCP protocols: Game servers, IoT
SMTP, IMAP: Email servers

Maximum Performance

High-frequency trading: Every microsecond matters
Massive connection counts: Millions of concurrent connections
Bandwidth-intensive: Video streaming, large file transfers

TLS Passthrough

When backends must terminate TLS themselves (for mTLS, certificate pinning, or compliance), L4 passthrough preserves end-to-end encryption.

When to Use Layer 7

L7 load balancing is essential for:

Microservices Routing

Route requests to different services based on URL path:

/users/* to user-service
/orders/* to order-service
/search/* to search-service

Canary Deployments

Gradually shift traffic to new versions based on headers or percentage:

# 5% of traffic to new version
backends:
  - name: v1
    weight: 95
  - name: v2
    weight: 5
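
A weighted split like this can be implemented by hashing a stable request key (for example a user ID) into 100 buckets, so the same user stays on the same version. The key choice and the SHA-256 hash are assumptions for this sketch; many proxies pick a weighted backend at random per request instead.

```python
import hashlib

BACKENDS = [("v1", 95), ("v2", 5)]  # (name, weight), as in the config above


def pick_version(request_key: str) -> str:
    """Map a stable key onto the weighted backends deterministically."""
    total = sum(weight for _, weight in BACKENDS)
    digest = hashlib.sha256(request_key.encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") % total
    for name, weight in BACKENDS:
        if bucket < weight:
            return name
        bucket -= weight
    raise AssertionError("unreachable: weights cover every bucket")
```

Shifting more traffic to v2 only requires changing the weights; keys already assigned to v2 remain there as its share grows.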

Authentication Offload

Validate JWTs, API keys, or OAuth tokens before traffic reaches backends, reducing backend complexity.
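
As a sketch of what JWT validation at the edge involves, the code below verifies an HS256-signed token using only the standard library. It assumes a shared secret and checks just the signature and the `exp` claim; `sign_hs256` exists only to build a token for the example. A production gateway would rely on a vetted JWT library and usually on asymmetric keys.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url_decode(part: str) -> bytes:
    return base64.urlsafe_b64decode(part + "=" * (-len(part) % 4))


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


def sign_hs256(claims: dict, secret: bytes) -> str:
    """Helper for the example: build a signed token."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url_encode(json.dumps(claims).encode())
    sig = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url_encode(sig)}"


def verify_hs256(token: str, secret: bytes, now=None):
    """Return the claims if the token is valid and unexpired, else None."""
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        if not isinstance(header, dict) or header.get("alg") != "HS256":
            return None
        expected = hmac.new(secret, f"{header_b64}.{payload_b64}".encode(),
                            hashlib.sha256).digest()
        # Constant-time comparison avoids leaking how many bytes matched.
        if not hmac.compare_digest(expected, _b64url_decode(sig_b64)):
            return None
        claims = json.loads(_b64url_decode(payload_b64))
    except (ValueError, TypeError):
        return None
    if not isinstance(claims, dict):
        return None
    exp = claims.get("exp")
    current = time.time() if now is None else now
    if exp is not None and current >= exp:
        return None
    return claims
```

Requests whose token fails this check can be rejected with a 401 before any backend capacity is spent on them.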

Rate Limiting

Apply rate limits per endpoint, user, or API key at the edge.
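
A token bucket is one common way to implement such limits. The sketch below keeps one bucket per (API key, endpoint) pair; the class names and the in-memory dictionary are assumptions, and a multi-node deployment would need shared state.

```python
import time


class TokenBucket:
    def __init__(self, rate: float, capacity: int, now=None):
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum burst size
        self.tokens = float(capacity)
        self.updated = time.monotonic() if now is None else now

    def allow(self, now=None) -> bool:
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class RateLimiter:
    """One bucket per (api_key, endpoint) pair."""

    def __init__(self, rate: float, capacity: int):
        self.rate, self.capacity = rate, capacity
        self.buckets = {}

    def allow(self, api_key: str, endpoint: str, now=None) -> bool:
        key = (api_key, endpoint)
        if key not in self.buckets:
            self.buckets[key] = TokenBucket(self.rate, self.capacity, now)
        return self.buckets[key].allow(now)
```

The optional `now` argument exists so the limiter can be exercised with a fake clock; in normal use it is omitted.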

Header Manipulation

Add security headers (HSTS, CSP), request IDs, or client identification headers.

Combining L4 and L7

Production architectures often combine both layers:

Internet
    │
    ▼
┌─────────────────┐
│   L4 (NLB)      │  ← Entry point, DDoS protection
│   TCP/443       │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│   L7 (ALB)      │  ← HTTP routing, TLS termination
│   HTTPS         │
└────────┬────────┘
         │
    ┌────┴────┐
    ▼         ▼
┌───────┐ ┌───────┐
│ Svc A │ │ Svc B │
└───────┘ └───────┘

This pattern uses:

L4 for ingress: Handles DDoS at network layer, passes TLS through
L7 for routing: Path-based routing, authentication, rate limiting

For more on global architectures, see our guide on global load balancing.

Real-World Example: E-Commerce Platform

Consider an e-commerce platform with these requirements:

Web frontend (HTTP)
API backend (HTTPS/gRPC)
Redis cache (TCP 6379)
PostgreSQL database (TCP 5432)

Architecture

ALB (L7): Routes /api/* to API service, /* to web frontend
NLB (L4): Frontend for Redis cluster (client-side routing handled separately)
NLB (L4): PostgreSQL read replica load balancing

Key Takeaways

L4: Fast, simple, works with any TCP/UDP protocol
L7: Flexible routing based on HTTP content, higher overhead
Use L4 for databases, message queues, and high-performance scenarios
Use L7 for HTTP routing, authentication, and microservices
Combine both layers for defense in depth and optimal performance
