
Introduction
Running your application on Heroku makes deployment and operations simpler, but simplicity does not remove the need for serious load testing. Heroku apps still need to handle traffic spikes, dyno restarts, router timeouts, database connection limits, and the realities of horizontal scaling. A Heroku app that looks healthy in development or under light usage can still struggle in production when many users hit the same endpoints at once.
This Heroku load testing guide shows you how to use LoadForge to validate production readiness through realistic performance testing and stress testing. You will learn how to simulate traffic against common Heroku app patterns, including web dynos serving API requests, login-protected workflows, and database-heavy operations. Because LoadForge uses Locust under the hood, every example is a practical Python script you can adapt directly to your own app.
By the end of this guide, you will know how to:
- Load test Heroku web applications and APIs
- Measure response times and throughput under concurrent traffic
- Validate dyno scaling behavior
- Detect bottlenecks in routing, authentication, and database access
- Use LoadForge’s distributed testing, real-time reporting, and global test locations to test from realistic user regions
Prerequisites
Before you start load testing a Heroku app, make sure you have the following:
- A deployed Heroku application, such as https://your-app-name.herokuapp.com or a custom domain like https://api.example.com
- A clear understanding of your key user flows:
- homepage or landing page
- login and authenticated requests
- search or listing endpoints
- checkout, submission, or account actions
- Test credentials for non-production or staging environments
- API keys, bearer tokens, or session-based auth details if your app uses authentication
- A LoadForge account to run cloud-based load testing at scale
- Permission to test the target environment, especially if autoscaling, add-ons, or rate limits may be affected
It is also a good idea to prepare a staging app on Heroku that mirrors production as closely as possible. Heroku performance testing on a weak staging environment often produces misleading results if the dyno type, Postgres plan, Redis plan, or worker setup differs too much from production.
Understanding Heroku Under Load
Heroku applications handle traffic through a routing layer that distributes incoming requests to web dynos. Under load, several Heroku-specific constraints and behaviors become important.
Dyno concurrency and request handling
Each web dyno runs your application process. Depending on your stack, framework, and server configuration, a dyno may only handle a limited number of concurrent requests efficiently. For example:
- A Python app using Gunicorn with too few workers can queue requests
- A Node.js app may slow down if CPU-heavy work blocks the event loop
- A Ruby app can hit thread or worker limits
- Any app can degrade if memory pressure causes swapping or restarts
Load testing helps reveal when a single dyno saturates and whether adding dynos improves throughput as expected.
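As a back-of-the-envelope check before testing, Little's law relates concurrency, throughput, and latency. This is a rough steady-state sketch, and the per-dyno concurrency and latency numbers below are illustrative, not Heroku defaults:

```python
def max_throughput_rps(concurrent_requests_per_dyno: int,
                       dynos: int,
                       avg_response_seconds: float) -> float:
    """Little's law: throughput = concurrency / latency.

    Rough estimate of the request rate a dyno formation can
    sustain at steady state before requests start queueing.
    """
    total_concurrency = concurrent_requests_per_dyno * dynos
    return total_concurrency / avg_response_seconds

# Example: 2 dynos, 4 concurrent requests each, 200 ms average response time
print(max_throughput_rps(4, 2, 0.2))  # 40.0 requests/second
```

If your load test plateaus well below this estimate, requests are queueing somewhere before they reach your application code.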
Heroku router timeouts
Heroku’s router enforces a 30-second request timeout: if your application does not respond in time, the router terminates the request and logs an H12 error. A load test can expose endpoints that become dangerously slow under concurrency, especially those involving:
- large database queries
- external API calls
- synchronous file processing
- report generation
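When reviewing raw response times from a test run, it helps to bucket them against the router's 30-second limit. A minimal helper along these lines (the 10-second warning threshold is an arbitrary choice, not a Heroku limit):

```python
HEROKU_ROUTER_TIMEOUT = 30.0  # seconds; requests beyond this get an H12 error

def classify_durations(durations, warn_threshold=10.0):
    """Bucket measured response times into ok / at-risk / would-time-out."""
    report = {"ok": 0, "at_risk": 0, "timed_out": 0}
    for seconds in durations:
        if seconds >= HEROKU_ROUTER_TIMEOUT:
            report["timed_out"] += 1
        elif seconds >= warn_threshold:
            report["at_risk"] += 1
        else:
            report["ok"] += 1
    return report

print(classify_durations([0.2, 1.5, 12.0, 31.0]))
# {'ok': 2, 'at_risk': 1, 'timed_out': 1}
```

Anything in the at-risk bucket under light load is a candidate to fail outright once concurrency rises.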
Database connection limits
Heroku Postgres plans have connection limits, and many apps hit this bottleneck before CPU becomes the issue. If your app scales web dynos without proper connection pooling, performance can collapse under load. Symptoms include:
- rising latency
- intermittent 500 errors
- connection timeout errors
- throughput flattening even as dynos scale
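It is worth doing the connection arithmetic before scaling out. A sketch, assuming each dyno runs a fixed number of worker processes with a per-process connection pool (the plan limit below is hypothetical; check your actual Heroku Postgres plan):

```python
def peak_db_connections(dynos: int, processes_per_dyno: int, pool_size: int) -> int:
    """Worst-case direct Postgres connections opened by the web tier."""
    return dynos * processes_per_dyno * pool_size

# Example: scaling to 10 dynos, 4 Gunicorn workers each, pool of 5 per worker
peak = peak_db_connections(10, 4, 5)
print(peak)  # 200 connections

PLAN_CONNECTION_LIMIT = 120  # hypothetical plan limit; check your actual plan
print(peak > PLAN_CONNECTION_LIMIT)  # True: a pooler such as PgBouncer is needed
```

If the worst case exceeds the plan limit, adding dynos will make things worse, not better, until pooling is in place.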
Autoscaling and cold behavior
If you use Heroku autoscaling or scheduled dyno changes, performance testing can help validate whether scaling reacts fast enough to traffic growth. You may also want to observe behavior after dyno restarts or deploys, when caches are cold and application startup paths matter more.
Common Heroku bottlenecks
When load testing Heroku apps, the most common issues are:
- too few Gunicorn workers or Puma workers
- missing database indexes
- unbounded N+1 queries
- synchronous background work in web requests
- no caching for expensive endpoints
- session/auth bottlenecks
- hitting Redis or Postgres plan limits
- poor horizontal scaling due to shared resource contention
Writing Your First Load Test
The best first step is a simple baseline test against a few public endpoints. This confirms your Heroku app is reachable, stable, and capable of serving anonymous traffic with acceptable response times.
Below is a realistic Locust script for a Heroku-hosted SaaS app with a homepage, pricing page, health endpoint, and public API status endpoint.
from locust import HttpUser, task, between

class HerokuPublicUser(HttpUser):
    wait_time = between(1, 3)

    @task(4)
    def homepage(self):
        self.client.get("/", name="GET /")

    @task(2)
    def pricing(self):
        self.client.get("/pricing", name="GET /pricing")

    @task(2)
    def healthcheck(self):
        self.client.get("/health", name="GET /health")

    @task(1)
    def api_status(self):
        self.client.get("/api/v1/status", name="GET /api/v1/status")

    @task(1)
    def docs(self):
        self.client.get("/docs", name="GET /docs")

What this test does
This script simulates anonymous users browsing a Heroku app. The weighted tasks reflect a realistic traffic mix:
- the homepage gets the most traffic
- pricing and health endpoints are accessed regularly
- status and docs endpoints receive lower traffic
Why this matters for Heroku performance testing
A basic test like this helps you answer foundational questions:
- How many requests per second can one or more dynos handle?
- Are static and lightweight dynamic pages responding consistently?
- Does latency increase sharply as concurrency rises?
- Are there any immediate 5xx errors from overloaded dynos?
In LoadForge, you can run this test from multiple geographic regions to see whether response times vary by user location. This is especially useful if your Heroku app is hosted in one region but serves a global audience.
Advanced Load Testing Scenarios
Once the baseline is established, move on to realistic user flows. Heroku load testing is most valuable when it targets the endpoints that drive business value and infrastructure stress.
Scenario 1: Session-based login and authenticated dashboard usage
Many Heroku apps use cookie-based session authentication. The following example simulates a user logging in, loading a dashboard, viewing projects, and checking account settings.
from locust import HttpUser, task, between

class HerokuSessionUser(HttpUser):
    wait_time = between(2, 5)

    def on_start(self):
        login_page = self.client.get("/login", name="GET /login")

        # Extract the CSRF token from the login page's meta tag, if present
        csrf_token = None
        token_pos = login_page.text.find('csrf-token')
        if token_pos != -1:
            marker = 'content="'
            start = login_page.text.find(marker, token_pos)
            if start != -1:
                start += len(marker)
                end = login_page.text.find('"', start)
                csrf_token = login_page.text[start:end]

        payload = {
            "email": "loadtest.user@example.com",
            "password": "SuperSecret123!",
        }
        headers = {}
        if csrf_token:
            payload["_csrf"] = csrf_token
            headers["X-CSRF-Token"] = csrf_token

        with self.client.post(
            "/session",
            data=payload,
            headers=headers,
            name="POST /session",
            catch_response=True
        ) as response:
            if response.status_code not in (200, 302):
                response.failure(f"Login failed: {response.status_code}")

    @task(4)
    def dashboard(self):
        self.client.get("/dashboard", name="GET /dashboard")

    @task(3)
    def projects(self):
        self.client.get("/projects", name="GET /projects")

    @task(2)
    def project_detail(self):
        self.client.get("/projects/42", name="GET /projects/:id")

    @task(1)
    def account_settings(self):
        self.client.get("/account/settings", name="GET /account/settings")

What this scenario reveals
This test is useful for measuring:
- login performance under concurrent session creation
- cookie and session store overhead
- dashboard rendering cost
- authenticated database reads
- whether dynos handle mixed authenticated traffic efficiently
For Heroku apps using Redis-backed sessions, this can also help identify whether Redis becomes a bottleneck under login-heavy traffic.
Scenario 2: Token-authenticated API load test for a Heroku backend
Many Heroku apps expose JSON APIs for SPAs, mobile apps, or partner integrations. This example simulates bearer token authentication and common CRUD-style API usage.
from locust import HttpUser, task, between
import random

class HerokuApiUser(HttpUser):
    wait_time = between(1, 2)

    def on_start(self):
        auth_payload = {
            "email": "api.loadtest@example.com",
            "password": "SuperSecret123!"
        }
        with self.client.post(
            "/api/v1/auth/login",
            json=auth_payload,
            name="POST /api/v1/auth/login",
            catch_response=True
        ) as response:
            if response.status_code == 200 and "token" in response.json():
                self.token = response.json()["token"]
                self.headers = {"Authorization": f"Bearer {self.token}"}
            else:
                response.failure("API login failed")
                self.token = None
                self.headers = {}

    @task(5)
    def list_orders(self):
        self.client.get(
            "/api/v1/orders?status=open&limit=25",
            headers=self.headers,
            name="GET /api/v1/orders"
        )

    @task(3)
    def get_order_detail(self):
        order_id = random.choice([1001, 1002, 1003, 1004, 1005])
        self.client.get(
            f"/api/v1/orders/{order_id}",
            headers=self.headers,
            name="GET /api/v1/orders/:id"
        )

    @task(2)
    def create_order(self):
        payload = {
            "customer_id": 501,
            "currency": "USD",
            "items": [
                {"sku": "starter-plan", "quantity": 1, "unit_price": 2900},
                {"sku": "priority-support", "quantity": 1, "unit_price": 900}
            ],
            "notes": "Created by LoadForge performance test"
        }
        self.client.post(
            "/api/v1/orders",
            json=payload,
            headers=self.headers,
            name="POST /api/v1/orders"
        )

    @task(1)
    def order_metrics(self):
        self.client.get(
            "/api/v1/reports/orders/daily?days=30",
            headers=self.headers,
            name="GET /api/v1/reports/orders/daily"
        )

Why this is realistic for Heroku
This pattern is common for Heroku-hosted APIs backed by Postgres and Redis. It exercises:
- token issuance
- authenticated reads and writes
- pagination and filtering
- reporting endpoints that may be more database-intensive
This type of performance testing is ideal for validating whether your Heroku dynos and Postgres database can support real API usage without excessive latency.
Scenario 3: Database-heavy search and background-job-triggering workflow
A lot of Heroku apps struggle not on simple page loads, but on expensive search endpoints and actions that trigger downstream processing. This example simulates a product catalog search plus a report export request.
from locust import HttpUser, task, between
import random

class HerokuSearchUser(HttpUser):
    wait_time = between(2, 4)

    search_terms = [
        "wireless keyboard",
        "usb-c dock",
        "monitor stand",
        "mechanical keyboard",
        "noise cancelling headset"
    ]
    categories = [
        "electronics",
        "accessories",
        "office"
    ]

    @task(5)
    def search_products(self):
        term = random.choice(self.search_terms)
        category = random.choice(self.categories)
        self.client.get(
            f"/api/v1/products/search?q={term}&category={category}&sort=popularity&page=1",
            name="GET /api/v1/products/search"
        )

    @task(3)
    def filter_collection(self):
        self.client.get(
            "/collections/spring-deals?price_min=25&price_max=250&in_stock=true",
            name="GET /collections/:slug"
        )

    @task(1)
    def product_detail(self):
        product_id = random.choice([231, 455, 678, 890, 991])
        self.client.get(
            f"/products/{product_id}",
            name="GET /products/:id"
        )

    @task(1)
    def trigger_export(self):
        payload = {
            "format": "csv",
            "date_range": "last_30_days",
            "include_refunds": True,
            "email_to": "ops-team@example.com"
        }
        self.client.post(
            "/api/v1/exports/sales",
            json=payload,
            name="POST /api/v1/exports/sales"
        )

What this scenario helps you validate
This is a strong Heroku stress testing scenario for apps with search, filtering, and reporting. It can reveal:
- slow SQL queries
- missing indexes
- expensive sorting and filtering
- overloaded background queues
- web dynos doing too much synchronous work
If export requests are handled correctly, the web dyno should respond quickly while a worker dyno processes the job asynchronously. If not, you may see response times climb and router timeouts appear.
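The healthy pattern can be sketched in plain Python, with a thread-backed queue standing in for a real job backend like RQ or Celery (handler and payload names here are illustrative, not your app's API):

```python
import queue
import threading

job_queue = queue.Queue()

def handle_export_request(payload):
    """Web-tier handler: enqueue the job and answer immediately (HTTP 202)."""
    job_queue.put(payload)
    return {"status": 202, "body": {"state": "queued"}}

def worker_loop():
    """Worker dyno: does the slow work off the request path."""
    while True:
        job = job_queue.get()
        if job is None:
            break
        # ... generate the CSV, email it, etc. ...
        job_queue.task_done()

threading.Thread(target=worker_loop, daemon=True).start()
response = handle_export_request({"format": "csv", "date_range": "last_30_days"})
print(response["status"])  # 202: the web dyno answered without doing the export
```

The key property to verify under load is that POST /api/v1/exports/sales latency stays flat regardless of how much export work is queued.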
Optional Locust configuration for staged traffic growth
When testing dyno scaling on Heroku, it is helpful to ramp users gradually instead of spiking all at once. In LoadForge, you can configure this in the UI, but here is a Locust example using stages logic.
from locust import HttpUser, task, between, LoadTestShape

class WebsiteUser(HttpUser):
    wait_time = between(1, 3)

    @task
    def index(self):
        self.client.get("/")

class HerokuRampShape(LoadTestShape):
    stages = [
        {"duration": 300, "users": 50, "spawn_rate": 5},
        {"duration": 600, "users": 100, "spawn_rate": 10},
        {"duration": 900, "users": 250, "spawn_rate": 15},
        {"duration": 1200, "users": 500, "spawn_rate": 20},
    ]

    def tick(self):
        run_time = self.get_run_time()
        for stage in self.stages:
            if run_time < stage["duration"]:
                return (stage["users"], stage["spawn_rate"])
        return None

This kind of ramp test is excellent for checking whether Heroku autoscaling, dyno formation changes, or database capacity keep up as demand increases.
Analyzing Your Results
After your Heroku load test completes, the next step is interpreting the data correctly. LoadForge provides real-time reporting that makes it easy to track response times, throughput, failures, and percentile trends during the test.
Focus on these metrics first
Response time percentiles
Average response time is useful, but p95 and p99 are more important. On Heroku, a few overloaded dynos or slow database queries can create long-tail latency even if the average looks acceptable.
Watch for:
- p95 steadily increasing as users ramp up
- p99 spiking sharply during write-heavy or search-heavy tasks
- sudden jumps that may indicate dyno saturation or database contention
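If you export raw response times from a run, percentiles are easy to recompute yourself. A minimal nearest-rank sketch (the latency samples are made up for illustration):

```python
import math

def percentile(samples, pct):
    """Nearest-rank percentile of a list of response times."""
    ordered = sorted(samples)
    rank = math.ceil(pct / 100 * len(ordered))
    return ordered[max(rank - 1, 0)]

latencies_ms = [120, 130, 125, 140, 135, 150, 145, 900, 160, 155]
print(percentile(latencies_ms, 50))  # 140
print(percentile(latencies_ms, 95))  # 900: one slow outlier dominates the tail
```

This is exactly why the average (roughly 216 ms here) can look fine while p95 tells the real story.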
Requests per second
Throughput should rise as you increase user load, at least until a bottleneck is reached. If requests per second stop increasing while latency and failures rise, you have likely hit a capacity limit.
Error rates
Pay close attention to:
- 500 and 503 responses
- auth failures under concurrency
- timeout errors
- connection-related failures
On Heroku, these often point to overloaded dynos, exhausted database connections, or app-level exceptions triggered by concurrency.
Endpoint-level breakdown
LoadForge and Locust both make it easy to see which endpoints are slowest. This is critical for Heroku apps because one expensive route can affect the perceived health of the whole system.
Compare performance against Heroku metrics
For best results, correlate your load testing data with Heroku dashboard metrics and add-on telemetry:
- dyno load and memory usage
- response time trends
- throughput
- Postgres connection count
- slow query logs
- Redis memory and ops/sec
- worker queue depth
If LoadForge shows rising latency at the same time Heroku Postgres nears its connection limit, you have a clear optimization target.
Test from realistic regions
If your Heroku app runs in the US but your users are in Europe and Asia, response times may differ significantly. LoadForge’s global test locations let you measure realistic regional performance instead of relying on a single-source synthetic test.
Performance Optimization Tips
If your Heroku performance testing reveals problems, these are the first areas to investigate.
Right-size your dynos
If CPU or memory is saturated, move to larger dynos or add more dynos. But do not assume more dynos always solve the problem. If the real bottleneck is Postgres or Redis, horizontal scaling may simply move the pressure elsewhere.
Tune your app server
For Python apps on Heroku, review Gunicorn settings carefully:
- number of workers
- worker class
- timeout settings
- max requests and recycling
Too few workers can cause request queuing. Too many can exhaust memory or database connections.
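As a starting point, a gunicorn.conf.py along these lines is common for Heroku Python apps; treat the numbers as a sketch to tune against your own load test results, not a universal recommendation:

```python
# gunicorn.conf.py -- a starting point, not a universal recommendation
import multiprocessing
import os

# Heroku's Python buildpack sets WEB_CONCURRENCY for some dyno types;
# fall back to the common (2 * CPUs) + 1 heuristic otherwise
workers = int(os.environ.get("WEB_CONCURRENCY",
                             multiprocessing.cpu_count() * 2 + 1))

# Recycle workers periodically to contain slow memory leaks
max_requests = 1000
max_requests_jitter = 50

# Fail fast in the app instead of letting Heroku's 30 s router timeout hit first
timeout = 25
```

Rerun the same LoadForge scenario after each settings change so the before-and-after comparison is fair.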
Add connection pooling
For Heroku Postgres, use connection pooling where appropriate. This is especially important when scaling web dynos. Without pooling, each dyno process may open too many direct connections.
Optimize expensive queries
Use query analysis to find:
- full table scans
- missing indexes
- N+1 query patterns
- unnecessary joins
- oversized result sets
Search and reporting endpoints are frequent offenders.
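The N+1 pattern is easiest to see with a query counter. This toy sqlite3 example (table and column names are illustrative) contrasts per-row lookups with a single batched IN query:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT)")
db.executemany("INSERT INTO products VALUES (?, ?)",
               [(i, f"product-{i}") for i in range(1, 51)])
wanted = list(range(1, 51))

# N+1 pattern: one query per product id
n_plus_1_queries = 0
for pid in wanted:
    db.execute("SELECT name FROM products WHERE id = ?", (pid,)).fetchone()
    n_plus_1_queries += 1

# Batched pattern: a single IN query fetches the same rows
placeholders = ",".join("?" * len(wanted))
rows = db.execute(
    f"SELECT id, name FROM products WHERE id IN ({placeholders})", wanted
).fetchall()

print(n_plus_1_queries, "queries vs 1 query for", len(rows), "rows")
# 50 queries vs 1 query for 50 rows
```

Under load, each extra round trip multiplies across concurrent users, so ORM eager-loading or explicit batching often yields the biggest single win.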
Move slow work to background jobs
If requests trigger exports, emails, image processing, or third-party API calls, push that work to worker dynos. Web dynos should return quickly and avoid long synchronous operations.
Cache aggressively where it helps
Cache hot endpoints, expensive computed fragments, and repeated lookup data. Redis is often a strong fit for Heroku-hosted apps that need to reduce database pressure.
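The pattern can be sketched with a tiny in-process TTL cache; production Heroku apps would typically put this in Redis so all dynos share it, but the get-or-compute shape is the same:

```python
import time

class TTLCache:
    """Minimal time-expiring cache for hot, expensive lookups."""
    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self._store = {}

    def get_or_compute(self, key, compute):
        entry = self._store.get(key)
        if entry and time.monotonic() - entry[0] < self.ttl:
            return entry[1]          # fresh hit: skip the expensive call
        value = compute()
        self._store[key] = (time.monotonic(), value)
        return value

cache = TTLCache(ttl_seconds=60)
calls = 0

def expensive_query():
    global calls
    calls += 1
    return {"plans": ["starter", "pro"]}

cache.get_or_compute("pricing", expensive_query)
cache.get_or_compute("pricing", expensive_query)
print(calls)  # 1: the second request was served from cache
```

In a load test, a working cache shows up as near-flat latency on hot endpoints even as concurrency climbs.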
Test scaling changes before production
Whenever you change dyno counts, Gunicorn settings, database plans, or caching strategy, rerun the same LoadForge test. Consistent test scenarios make it easier to compare before-and-after performance objectively.
Common Pitfalls to Avoid
Heroku load testing can go wrong if the test setup is unrealistic or incomplete. Avoid these common mistakes.
Testing only the homepage
A homepage test is useful, but it rarely reveals the real bottlenecks. Most production issues come from authenticated flows, search, writes, and reporting endpoints.
Ignoring login and session behavior
Authentication often becomes a bottleneck under load. If your app uses session cookies, CSRF protection, or token refresh flows, include them in your test.
Overlooking database limits
Many teams blame dynos when the real issue is Postgres connection exhaustion or slow queries. Always monitor the database during load testing.
Running unrealistic traffic patterns
A test that sends only one endpoint at maximum speed does not reflect normal user behavior. Use weighted tasks and realistic wait times to simulate real traffic.
Testing production without safeguards
Never point aggressive stress testing at production without approval, rate controls, and a rollback plan. Heroku autoscaling, add-ons, and external integrations can incur real cost and risk.
Forgetting third-party dependencies
If your Heroku app depends on payment gateways, email APIs, search providers, or analytics services, those dependencies can shape performance too. Mock them where necessary, or account for their latency in staging.
Not validating horizontal scaling
If you scale from 2 dynos to 6 dynos, throughput should improve meaningfully. If it does not, your architecture may have a shared bottleneck that load testing needs to uncover.
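One way to quantify this from two test runs is a simple scaling-efficiency ratio. This is a rough heuristic, not an official metric, and the throughput numbers below are invented for illustration:

```python
def scaling_efficiency(rps_before: float, dynos_before: int,
                       rps_after: float, dynos_after: int) -> float:
    """1.0 = perfectly linear scaling; well below 1.0 hints at a shared bottleneck."""
    return (rps_after / rps_before) / (dynos_after / dynos_before)

# Example: 2 -> 6 dynos, throughput went from 120 to 210 rps
print(round(scaling_efficiency(120, 2, 210, 6), 2))  # 0.58
```

A ratio this far below 1.0 suggests the extra dynos are queueing on something shared, most often the database.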
Conclusion
Heroku makes deployment easy, but production readiness still depends on disciplined load testing, performance testing, and stress testing. By simulating realistic traffic with Locust-based scripts in LoadForge, you can validate dyno scaling, identify slow endpoints, catch database bottlenecks, and improve user experience before traffic spikes expose weaknesses.
LoadForge gives you the tools to do this efficiently with distributed testing, cloud-based infrastructure, real-time reporting, CI/CD integration, and global test locations. If you want to load test your Heroku app with realistic scenarios and actionable results, now is a great time to try LoadForge.
LoadForge Team
LoadForge is a load and performance testing platform built on Locust. Our team has been shipping load tests against production systems since 2018, and we write these guides from real customer engagements.