
Introduction
HAProxy is one of the most widely used load balancers and reverse proxies in modern infrastructure. It often sits directly in front of your most critical services, routing traffic to application servers, terminating TLS, enforcing session persistence, and protecting uptime during traffic spikes or backend failures. Because of that, load testing HAProxy is not just about measuring raw requests per second. It is about validating how your traffic management layer behaves under realistic production conditions.
A proper HAProxy load testing strategy helps you answer questions like:
- Can HAProxy sustain expected peak throughput?
- Does it distribute traffic evenly across backend pools?
- How does it behave during backend degradation or failover?
- Do sticky sessions work correctly under concurrency?
- Are health checks, connection reuse, and timeouts tuned properly?
In this HAProxy load testing guide, you will learn how to use LoadForge and Locust to simulate real traffic patterns against HAProxy frontends and backends. We will cover baseline performance testing, authenticated API traffic, session persistence validation, and failover-focused stress testing. Along the way, we will use realistic endpoint paths, headers, and request payloads that mirror actual HAProxy deployments in front of web applications and APIs.
LoadForge makes this easier by providing cloud-based infrastructure, distributed testing, real-time reporting, CI/CD integration, and global test locations so you can test HAProxy under conditions that resemble real user traffic.
Prerequisites
Before you begin load testing HAProxy, make sure you have the following:
- A running HAProxy instance exposed through a frontend URL such as:
https://lb.example.com or https://api.example.com
- At least one backend application pool behind HAProxy
- Access to relevant HAProxy configuration details, including:
- frontend routes
- backend pools
- health check behavior
- session persistence strategy
- authentication requirements
- A staging or pre-production environment that mirrors production as closely as possible
- LoadForge account access
- Basic familiarity with Python and Locust
It is also helpful to know whether your HAProxy setup uses:
- round-robin balancing
- least-connections balancing
- sticky sessions via cookies
- TLS termination
- rate limiting or ACLs
- active health checks
- retries and timeout settings
If you have access to the HAProxy stats endpoint, that can provide valuable context during performance testing.
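If the stats endpoint is not yet enabled, a minimal listener looks something like the following sketch. The bind address, port, and credentials here are placeholders; adjust them to your environment and keep the endpoint off the public internet:

```haproxy
listen stats
    bind 127.0.0.1:8404
    mode http
    stats enable
    stats uri /stats
    stats refresh 10s
    stats auth admin:change-me
```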
Here is a simple example of what an HAProxy frontend and backend might look like in practice:
```haproxy
frontend https_front
    bind *:443 ssl crt /etc/haproxy/certs/site.pem
    mode http
    option forwardfor
    http-request set-header X-Forwarded-Proto https
    default_backend app_servers

backend app_servers
    mode http
    balance roundrobin
    cookie SERVERID insert indirect nocache
    option httpchk GET /healthz
    server app1 10.0.1.10:8080 check cookie app1
    server app2 10.0.1.11:8080 check cookie app2
    server app3 10.0.1.12:8080 check cookie app3
```

This kind of setup is exactly what you want to validate with load testing: routing, persistence, health checks, and throughput under concurrent traffic.
Understanding HAProxy Under Load
HAProxy is designed for high performance, but its behavior under load depends on more than just CPU and memory. Since it acts as an intermediary between clients and backend services, bottlenecks can emerge at several layers.
How HAProxy handles concurrent traffic
Under load, HAProxy accepts incoming client connections, applies frontend rules, selects a backend server based on the balancing algorithm, and forwards traffic while managing connection state. This means performance is influenced by:
- connection setup rate
- keep-alive behavior
- TLS handshake overhead
- backend response times
- queueing when backends are saturated
- health check frequency
- session stickiness logic
- retry and timeout settings
If your backend is slow, HAProxy may still appear to be the bottleneck because request queues build at the proxy layer. That is why HAProxy performance testing should always consider both proxy metrics and backend application metrics.
Common HAProxy bottlenecks
When load testing HAProxy, watch for these common issues:
TLS termination overhead
If HAProxy handles HTTPS, CPU can rise sharply during handshake-heavy traffic, especially when clients do not reuse connections.
Backend queue buildup
If backend servers become saturated, HAProxy may start queueing requests. This increases latency and can trigger timeouts.
Misconfigured timeouts
Overly aggressive or overly relaxed timeout values can cause connection churn, retries, or resource exhaustion.
Uneven load distribution
Improper balancing configuration or sticky session behavior can cause one backend node to receive more traffic than others.
Health check instability
Health checks that are too frequent or too expensive can affect both HAProxy and backend performance.
Session persistence problems
Applications that rely on sticky sessions may fail unpredictably if HAProxy cookie insertion or persistence rules are not working correctly under concurrency.
What to validate during testing
A strong HAProxy stress testing plan should validate:
- sustained throughput
- latency percentiles
- backend distribution fairness
- failover behavior when one backend is unhealthy
- session persistence consistency
- API authentication flows through the proxy
- performance from multiple geographic regions
This is where LoadForge is especially useful, since you can launch distributed load tests from global test locations and compare proxy behavior across regions.
Writing Your First Load Test
Your first HAProxy load test should establish a baseline. Start with a simple mix of health checks, homepage traffic, and API reads that pass through HAProxy to the backend pool.
Basic HAProxy baseline test
```python
from locust import HttpUser, task, between

class HAProxyBaselineUser(HttpUser):
    wait_time = between(1, 3)

    def on_start(self):
        self.client.headers.update({
            "User-Agent": "LoadForge-HAProxy-Baseline/1.0",
            "Accept": "application/json, text/html;q=0.9"
        })

    @task(3)
    def homepage(self):
        self.client.get("/", name="GET /")

    @task(2)
    def health_check(self):
        self.client.get("/healthz", name="GET /healthz")

    @task(5)
    def products_api(self):
        self.client.get(
            "/api/v1/products?category=networking&limit=20",
            name="GET /api/v1/products"
        )
```

What this test does
This script simulates a lightweight user browsing a site behind HAProxy:
- GET / tests general frontend routing
- GET /healthz validates low-cost backend health endpoint access
- GET /api/v1/products simulates realistic read-heavy API traffic
This is useful for baseline load testing because it helps establish:
- average response time through HAProxy
- request throughput
- error rate under moderate concurrency
- whether HAProxy introduces measurable overhead before backend saturation
How to run this effectively in LoadForge
In LoadForge, start with a modest distributed load pattern such as:
- 50 users over 2 minutes
- hold for 5 minutes
- ramp to 200 users over 5 more minutes
This gives you a clean baseline before moving into more aggressive stress testing. In real-time reporting, monitor:
- median and p95 response times
- requests per second
- non-2xx/3xx responses
- signs of latency spikes during ramp-up
If response times stay flat and errors remain low, you have a stable starting point.
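The staged pattern above can be expressed as a simple function of elapsed time. LoadForge manages the actual ramp for you, so this is only a sketch for reasoning about when each phase begins and how many users should be active at a given moment:

```python
def target_users(elapsed_s: float) -> int:
    """Target concurrent users for the staged baseline ramp:
    0 -> 50 users over 2 minutes, hold 5 minutes, then 50 -> 200 over 5 minutes."""
    ramp1, hold, ramp2 = 120, 300, 300  # phase durations in seconds
    if elapsed_s < ramp1:
        return int(50 * elapsed_s / ramp1)
    if elapsed_s < ramp1 + hold:
        return 50
    if elapsed_s < ramp1 + hold + ramp2:
        return int(50 + 150 * (elapsed_s - ramp1 - hold) / ramp2)
    return 200

print(target_users(60))   # 25 (halfway through the first ramp)
print(target_users(300))  # 50 (in the hold phase)
print(target_users(570))  # 125 (halfway through the second ramp)
```

Mapping latency spikes back to the phase they occurred in makes it much easier to tell ramp-induced churn apart from steady-state saturation.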
Advanced Load Testing Scenarios
Once the baseline is established, move on to more realistic HAProxy performance testing scenarios. These should reflect actual traffic behavior, including authentication, sticky sessions, and failover-sensitive operations.
Scenario 1: Authenticated API traffic through HAProxy
Many HAProxy deployments sit in front of internal or customer-facing APIs that require login and token-based authentication. This test simulates a user logging in, storing a bearer token, and making authenticated requests.
```python
from locust import HttpUser, task, between
import random

class HAProxyAuthenticatedApiUser(HttpUser):
    wait_time = between(1, 2)

    def on_start(self):
        response = self.client.post(
            "/api/v1/auth/login",
            json={
                "email": f"loadtest.user{random.randint(1, 500)}@example.com",
                "password": "Str0ngP@ssword!"
            },
            headers={
                "Content-Type": "application/json",
                "Accept": "application/json"
            },
            name="POST /api/v1/auth/login"
        )
        if response.status_code == 200:
            token = response.json().get("access_token")
            self.client.headers.update({
                "Authorization": f"Bearer {token}",
                "Accept": "application/json",
                "User-Agent": "LoadForge-HAProxy-Auth/1.0"
            })
        # If login fails, subsequent tasks run unauthenticated; their error
        # rates will surface the problem in the test report.

    @task(4)
    def get_account_profile(self):
        self.client.get("/api/v1/account/profile", name="GET /api/v1/account/profile")

    @task(3)
    def list_orders(self):
        self.client.get(
            "/api/v1/orders?status=active&limit=10",
            name="GET /api/v1/orders"
        )

    @task(2)
    def create_cart_item(self):
        self.client.post(
            "/api/v1/cart/items",
            json={
                "sku": random.choice(["SW-1001", "RTR-2004", "FW-3010"]),
                "quantity": random.randint(1, 3)
            },
            headers={"Content-Type": "application/json"},
            name="POST /api/v1/cart/items"
        )

    @task(1)
    def checkout_preview(self):
        self.client.post(
            "/api/v1/cart/checkout-preview",
            json={
                "shipping_method": "express",
                "region": "us-east-1"
            },
            headers={"Content-Type": "application/json"},
            name="POST /api/v1/cart/checkout-preview"
        )
```

Why this scenario matters
This test is valuable because HAProxy often fronts authenticated REST APIs where:
- login endpoints create bursts of short-lived sessions
- authorization headers must pass cleanly to backends
- a mix of reads and writes creates realistic application pressure
- backend performance issues can surface as HAProxy queueing or timeout symptoms
This scenario is ideal for performance testing and stress testing because it exercises more of the application stack while still measuring HAProxy behavior as the gateway layer.
Scenario 2: Validating sticky sessions and persistence cookies
If your application depends on session affinity, you need to confirm that HAProxy consistently routes a user to the same backend. This is especially important for legacy apps, shopping carts, or stateful admin portals.
```python
from locust import HttpUser, task, between

class HAProxyStickySessionUser(HttpUser):
    wait_time = between(2, 4)

    def on_start(self):
        self.client.headers.update({
            "User-Agent": "LoadForge-HAProxy-StickySession/1.0",
            "Accept": "text/html,application/json"
        })
        self.client.get("/login", name="GET /login")
        self.client.post(
            "/session",
            data={
                "username": "qa_admin",
                "password": "AdminPass123!"
            },
            headers={"Content-Type": "application/x-www-form-urlencoded"},
            name="POST /session"
        )

    @task(4)
    def dashboard(self):
        self.client.get("/dashboard", name="GET /dashboard")

    @task(3)
    def user_settings(self):
        self.client.get("/account/settings", name="GET /account/settings")

    @task(2)
    def update_preference(self):
        self.client.post(
            "/account/preferences",
            json={
                "theme": "dark",
                "notifications": True,
                "timezone": "UTC"
            },
            headers={"Content-Type": "application/json"},
            name="POST /account/preferences"
        )

    @task(1)
    def recent_activity(self):
        with self.client.get("/api/v1/activity/recent", catch_response=True, name="GET /api/v1/activity/recent") as response:
            server_cookie = self.client.cookies.get("SERVERID")
            if not server_cookie:
                response.failure("Missing HAProxy persistence cookie SERVERID")
            else:
                response.success()
```

What to look for
This scenario helps validate:
- whether HAProxy inserts the expected persistence cookie
- whether authenticated session flows remain stable under concurrency
- whether requests unexpectedly bounce between backends
- whether backend-specific session storage assumptions cause failures
If your HAProxy configuration uses cookie SERVERID insert, this is a practical way to verify persistence behavior at scale.
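To turn that check into a harder assertion, you can record the persistence cookie seen on each response and flag the session the moment it changes. This helper is a plain-Python sketch (the cookie name matches the `cookie SERVERID insert` directive; `StickinessTracker` is a name introduced here for illustration):

```python
class StickinessTracker:
    """Tracks the backend persistence cookie across a simulated session
    and reports when HAProxy bounces the user to a different backend."""

    def __init__(self, cookie_name: str = "SERVERID"):
        self.cookie_name = cookie_name
        self.pinned = None   # backend value from the first response
        self.bounces = 0     # times the cookie changed mid-session

    def observe(self, cookies: dict) -> bool:
        """Record the cookie from one response; return True if still sticky."""
        value = cookies.get(self.cookie_name)
        if value is None:
            return False  # no persistence cookie at all
        if self.pinned is None:
            self.pinned = value  # pin to the first backend we were routed to
            return True
        if value != self.pinned:
            self.bounces += 1
            return False
        return True

# Example: the third response comes back pinned to a different backend.
tracker = StickinessTracker()
results = [tracker.observe({"SERVERID": s}) for s in ["app1", "app1", "app2", "app1"]]
print(results, "bounces:", tracker.bounces)
```

Inside a Locust task you would call something like `tracker.observe(self.client.cookies.get_dict())` and mark the response failed whenever it returns False, so that routing flaps show up directly in the error rate.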
Scenario 3: Failover and backend degradation simulation
One of the most important HAProxy load testing use cases is validating failover. You want to know what happens when one backend becomes slow or unhealthy while traffic is still flowing.
This Locust script simulates a more aggressive traffic mix against endpoints likely to expose backend stress. During the test, you can intentionally disable one backend node or make it fail health checks.
```python
from locust import HttpUser, task, between
import random

class HAProxyFailoverUser(HttpUser):
    wait_time = between(0.5, 1.5)

    def on_start(self):
        self.client.headers.update({
            "User-Agent": "LoadForge-HAProxy-Failover/1.0",
            "Accept": "application/json"
        })

    @task(5)
    def search_catalog(self):
        query = random.choice(["firewall", "switch", "router", "load balancer", "vpn appliance"])
        self.client.get(
            f"/api/v1/search?q={query}&sort=relevance&limit=25",
            name="GET /api/v1/search"
        )

    @task(3)
    def product_details(self):
        product_id = random.choice([1001, 1002, 1044, 1088, 1205])
        self.client.get(
            f"/api/v1/products/{product_id}",
            name="GET /api/v1/products/:id"
        )

    @task(2)
    def inventory_check(self):
        sku = random.choice(["SW-1001", "RTR-2004", "FW-3010", "AP-4400"])
        self.client.post(
            "/api/v1/inventory/check",
            json={
                "sku": sku,
                "warehouse": random.choice(["us-east", "us-west", "eu-central"])
            },
            headers={"Content-Type": "application/json"},
            name="POST /api/v1/inventory/check"
        )
```

How to use this for failover testing
Run this test while you deliberately trigger one of the following in your staging environment:
- stop one backend server
- make one backend return 500 errors
- slow one backend enough to fail health checks
- remove one backend from service discovery or target group membership
Then observe:
- whether HAProxy removes the unhealthy backend quickly
- whether traffic is redistributed smoothly
- whether response times spike temporarily
- whether error rates increase during failover
- whether retries amplify load on remaining backends
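The retry-amplification point is worth quantifying before you run the test. A rough back-of-the-envelope sketch, assuming equal backends and a policy that replays each failed request once (the function and its parameters are illustrative, not an HAProxy API):

```python
def per_node_load(total_rps: float, healthy_nodes: int,
                  failed_fraction: float = 0.0, retries: int = 1) -> float:
    """Approximate requests per second landing on each healthy backend.

    failed_fraction is the share of requests that fail and get retried;
    each retry adds another request to the pool."""
    effective_rps = total_rps * (1 + failed_fraction * retries)
    return effective_rps / healthy_nodes

# 900 RPS across three healthy nodes:
print(per_node_load(900, 3))  # 300.0 per node
# One node dies; the two survivors absorb its share:
print(per_node_load(900, 2))  # 450.0 per node
# During the failover window, 10% of requests fail and are retried once,
# pushing per-node load above the simple 450 split:
print(per_node_load(900, 2, failed_fraction=0.10))
```

The takeaway: failover does not just shift load, it can briefly multiply it, which is why surviving nodes need headroom beyond the naive redistribution math.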
LoadForge’s real-time reporting is especially useful here because you can correlate failure injection timing with latency and error rate changes.
Analyzing Your Results
After running an HAProxy load test, it is important to interpret the results in the context of both proxy behavior and backend health.
Key metrics to review
Requests per second
This tells you how much traffic HAProxy can process under your test conditions. Compare throughput before and after backend saturation begins.
Response time percentiles
Average latency is not enough. Focus on:
- p50 for typical experience
- p95 for degraded user experience
- p99 for tail latency under stress
HAProxy issues often show up first in p95 and p99.
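If you ever post-process raw latency samples yourself, the percentile math is straightforward. Here is a minimal nearest-rank sketch (LoadForge computes these for you in its reports, so this is purely illustrative):

```python
def percentile(samples: list, p: float) -> float:
    """Nearest-rank percentile: smallest sample covering p percent of the data."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, round(p / 100 * len(ordered)))  # 1-based nearest rank
    return ordered[rank - 1]

# Latencies in milliseconds; one slow outlier dominates the tail.
latencies = [42, 45, 44, 43, 48, 51, 47, 46, 44, 950]
print("p50:", percentile(latencies, 50))  # 45
print("p95:", percentile(latencies, 95))  # 950
```

Notice how the median looks perfectly healthy while p95 exposes the outlier, which is exactly why averages alone hide HAProxy queueing problems.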
Error rate
Look for:
- 502 Bad Gateway
- 503 Service Unavailable
- 504 Gateway Timeout
- connection resets
- authentication failures
- session-related errors
These can indicate backend instability, bad timeout settings, or overloaded frontend workers.
Latency during ramp-up
If latency rises sharply as users increase, that may suggest:
- connection pool exhaustion
- backend queue buildup
- TLS overhead
- insufficient HAProxy process or thread tuning
Behavior during failover
During backend removal or degradation, check whether HAProxy:
- reroutes traffic quickly
- avoids prolonged spikes in 5xx errors
- maintains acceptable response times on surviving nodes
Correlate with HAProxy stats
If possible, compare LoadForge test results with HAProxy stats such as:
- current sessions
- session rate
- bytes in and out
- queue lengths
- backend status
- retries
- denied requests
- response code distribution
This helps distinguish between HAProxy bottlenecks and backend application bottlenecks.
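HAProxy exposes these metrics in machine-readable form by appending `;csv` to the stats URI. A short parsing sketch, assuming the standard CSV export (the field names `pxname`, `svname`, `qcur`, and `status` come from HAProxy's own header line; the sample below is trimmed to just those columns):

```python
import csv
import io

def parse_backend_stats(stats_csv: str):
    """Extract queue depth and status per backend server from HAProxy's ;csv export."""
    # The header line starts with "# "; strip it so DictReader sees clean names.
    cleaned = stats_csv.lstrip("# ")
    rows = csv.DictReader(io.StringIO(cleaned))
    return {
        (r["pxname"], r["svname"]): {"qcur": int(r["qcur"] or 0), "status": r["status"]}
        for r in rows
        if r["svname"] not in ("FRONTEND", "BACKEND")  # keep per-server rows only
    }

# A trimmed example export with only the fields this sketch reads:
sample = """# pxname,svname,qcur,status
app_servers,app1,0,UP
app_servers,app2,12,UP
app_servers,app3,0,DOWN
"""
stats = parse_backend_stats(sample)
print(stats[("app_servers", "app2")])
```

Snapshotting this during a test run lets you line up queue growth (`qcur`) and server state flips against the latency curves in your LoadForge report.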
Use distributed tests for realistic insights
HAProxy may behave differently depending on client geography and network distance. With LoadForge, you can run distributed load testing from multiple regions to see how latency, TLS negotiation, and edge routing affect performance. This is especially useful if HAProxy fronts global APIs or multi-region services.
Performance Optimization Tips
If your HAProxy performance testing reveals bottlenecks, these are common optimization areas to investigate.
Tune connection reuse
Encourage keep-alive and connection reuse where possible. Excessive connection churn increases HAProxy overhead, especially for HTTPS traffic.
Review timeout settings
Check values for:
- timeout connect
- timeout client
- timeout server
- timeout http-request
Bad timeout values can either drop healthy traffic too early or waste resources on stalled sessions.
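As a reference point, these directives live in the `defaults` section. The values below are illustrative starting points, not recommendations; tune them against your own test results:

```haproxy
defaults
    mode http
    timeout connect 5s        # TCP connect to a backend
    timeout client 30s        # inactivity on the client side
    timeout server 30s        # inactivity waiting for the backend
    timeout http-request 10s  # time allowed to receive a complete request
```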
Optimize TLS
If HAProxy terminates TLS:
- use modern ciphers
- enable session reuse where appropriate
- benchmark with and without TLS termination to understand overhead
- monitor CPU closely during handshake-heavy tests
Validate balancing algorithm choice
Round-robin is common, but some workloads benefit from least-connections or other strategies. Load testing helps verify whether your algorithm matches actual traffic patterns.
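To build intuition for why the algorithm matters, here is a small self-contained sketch (plain Python, not HAProxy itself) comparing round-robin and least-connections selection when one backend holds its connections open. It deliberately simplifies completion behavior: fast servers finish each request immediately, the slow one never does.

```python
import itertools

servers = ["app1", "app2", "app3"]

def round_robin():
    """Cycle through servers regardless of how busy they are."""
    pool = itertools.cycle(servers)
    return lambda active: next(pool)

def least_connections():
    """Pick the server with the fewest in-flight requests."""
    return lambda active: min(servers, key=lambda s: active[s])

def simulate(pick, slow="app3", requests=9):
    # Track in-flight requests; the slow server "holds" its connections.
    active = {s: 0 for s in servers}
    chosen = []
    for _ in range(requests):
        s = pick(active)
        chosen.append(s)
        active[s] += 1
        if s != slow:
            active[s] -= 1  # fast servers complete immediately in this toy model
    return chosen

rr = simulate(round_robin())
lc = simulate(least_connections())
# Round-robin keeps feeding the slow server; least-connections routes around it.
print("round-robin:", rr.count("app3"), "requests to slow server")
print("least-conn :", lc.count("app3"), "requests to slow server")
```

In the toy model, round-robin sends a full third of traffic to the stalled backend while least-connections sends it none; real traffic is messier, which is exactly what a load test against your actual pool reveals.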
Scale backend capacity
Sometimes HAProxy is healthy, but backends are too slow. If queue times rise while HAProxy remains responsive, the real fix may be backend scaling, caching, or query optimization.
Keep health checks lightweight
Health endpoints should be fast and cheap. Avoid expensive database checks in high-frequency health probes unless absolutely necessary.
Test with realistic user journeys
Synthetic tests that only hit /healthz or / are not enough. Include authenticated requests, writes, and session-dependent flows to get meaningful HAProxy load testing results.
Common Pitfalls to Avoid
HAProxy load testing is easy to get wrong if the test does not reflect how the proxy is actually used.
Testing only the backend, not the proxy
Make sure your LoadForge target URL points to the HAProxy frontend, not directly to backend application servers.
Ignoring TLS costs
If production uses HTTPS, test HTTPS. Otherwise, your performance testing will underestimate HAProxy CPU usage and latency.
Using unrealistic traffic mixes
Real systems have a mix of reads, writes, login flows, static assets, and API calls. A single repeated endpoint rarely tells the full story.
Forgetting session persistence validation
If your application depends on sticky sessions, you must explicitly test cookie persistence and session continuity.
Not testing failover
HAProxy is often deployed for resilience. If you never simulate backend failure, you are missing one of the most important validation cases.
Overlooking warm-up effects
Caches, TLS session reuse, and backend connection pools may behave differently at the beginning of a test. Include a warm-up period before drawing conclusions.
Running tests from one location only
Single-region testing may hide latency-sensitive issues. Distributed load testing from LoadForge’s global test locations gives a more realistic picture.
Blaming HAProxy for backend issues
High latency at the proxy layer often originates from slow backends. Always correlate proxy metrics with application and infrastructure metrics.
Conclusion
HAProxy is a critical part of modern cloud and infrastructure stacks, and load testing it properly is essential for validating throughput, balancing behavior, session persistence, and failover resilience. With the right Locust scripts, you can move beyond simple request flooding and simulate realistic traffic patterns that expose real operational risks.
Using LoadForge, you can run cloud-based, distributed HAProxy load testing with real-time reporting, CI/CD integration, and scalable infrastructure that mirrors production conditions far better than local tools alone. Whether you are validating a new HAProxy configuration, preparing for peak traffic, or stress testing backend failover, LoadForge gives you the visibility you need to test with confidence.
Try LoadForge to build and run your HAProxy performance testing strategy today.
LoadForge Team
LoadForge is a load and performance testing platform built on Locust. Our team has been shipping load tests against production systems since 2018, and we write these guides from real customer engagements.