
Introduction
REST APIs are the backbone of modern applications. Whether you’re powering a mobile app, a single-page web application, third-party integrations, or internal microservices, your API needs to respond quickly and reliably under real-world traffic. A REST API that performs well with a handful of users can still struggle when hundreds or thousands of clients make concurrent requests, submit large payloads, or trigger expensive backend operations.
That’s why load testing REST APIs is essential. With effective load testing, performance testing, and stress testing, you can measure latency, throughput, error rates, and scalability before production traffic exposes weaknesses. You can identify bottlenecks in authentication flows, database-backed endpoints, search APIs, reporting jobs, and write-heavy operations.
LoadForge makes this process much easier by combining the flexibility of Locust with cloud-based infrastructure, distributed testing, real-time reporting, global test locations, and CI/CD integration. In this guide, you’ll learn how to build realistic REST API load tests using Locust scripts in LoadForge, starting with a basic read-only test and progressing to more advanced scenarios like authenticated workflows, CRUD operations, and asynchronous job polling.
Prerequisites
Before you start load testing a REST API with LoadForge, make sure you have the following:
- A REST API environment to test
  - Preferably a staging or pre-production environment
  - Avoid load testing production unless you have explicit safeguards in place
- API documentation
  - OpenAPI/Swagger specs are especially helpful
  - You should know endpoint paths, request bodies, headers, and expected status codes
- Test credentials
  - API keys, bearer tokens, OAuth client credentials, or test user accounts
- Representative test data
  - Product IDs, user accounts, order records, search terms, or other realistic payload inputs
- Performance goals
  - Example: p95 latency under 300ms for GET endpoints
  - Example: error rate below 1% at 500 requests per second
- A LoadForge account
  - So you can run distributed load tests, monitor real-time results, and compare runs over time
It also helps to decide what type of test you want to run:
- Load testing: Validate expected traffic levels
- Stress testing: Push beyond expected capacity to find breaking points
- Spike testing: Simulate sudden bursts of traffic
- Endurance testing: Run sustained traffic over time to detect memory leaks or resource exhaustion
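These test types differ mainly in how the user count changes over time. As an illustration, a spike profile can be expressed with Locust's LoadTestShape. The stage timings and user counts below are made-up values, and the schedule logic is factored into a plain function so you can inspect it without running a live test:

```python
from typing import Optional, Tuple

# Each stage: (end_time_seconds, users, spawn_rate). Values are illustrative.
SPIKE_STAGES = [
    (120, 50, 10),    # warm up to a 50-user baseline
    (180, 500, 100),  # sudden burst to 500 users
    (300, 50, 50),    # recover back to baseline
]


def spike_tick(run_time: float, stages) -> Optional[Tuple[int, int]]:
    """Return (users, spawn_rate) for the current run time, or None to stop."""
    for end_time, users, spawn_rate in stages:
        if run_time < end_time:
            return (users, spawn_rate)
    return None


try:
    # In a locustfile, wire the schedule into a LoadTestShape subclass.
    from locust import LoadTestShape

    class SpikeShape(LoadTestShape):
        def tick(self):
            return spike_tick(self.get_run_time(), SPIKE_STAGES)
except ImportError:
    pass  # locust is not needed just to inspect the schedule itself
```

Adjusting the stages list turns the same skeleton into a load, stress, or endurance profile.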
Understanding REST APIs Under Load
REST APIs can appear simple on the surface, but their behavior under load depends on several layers of the stack. A single GET /api/products request might involve authentication, rate limiting, cache lookups, database queries, serialization, and network overhead. When concurrency rises, these layers can become bottlenecks.
Common REST API bottlenecks
Authentication and authorization
JWT validation, OAuth token introspection, and permission checks can add measurable overhead. If every request requires expensive auth validation, performance can degrade quickly.
Database contention
Endpoints that create, update, or search records often hit relational or NoSQL databases. Under load, you may see:
- Slow queries
- Connection pool exhaustion
- Lock contention
- Increased transaction times
Serialization and payload size
JSON encoding and decoding are usually fast, but large nested payloads can increase CPU usage and response times, especially for list or reporting endpoints.
External service dependencies
REST APIs often call payment gateways, email providers, search engines, or internal microservices. These downstream dependencies can fail or slow down under traffic, causing cascading latency.
Caching behavior
Read-heavy APIs may perform well when cache hit rates are high, but degrade badly on cache misses. Load testing should include enough variation to expose realistic cache patterns.
Rate limiting and throttling
Some APIs intentionally reject or delay excess traffic. This is not always a failure, but you need to understand when 429 responses are expected versus when they indicate misconfiguration.
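In a Locust script you can make this distinction explicit by treating 429 as an expected outcome rather than a failure. A sketch, with the classification pulled into a plain helper; the expected-status set is an assumption, so adjust it to your own rate-limit policy:

```python
# Statuses we accept when the gateway is allowed to throttle.
# 429 here is intentional backpressure, not an application error.
EXPECTED_STATUSES = {200, 429}


def throttle_verdict(status_code: int):
    """Return None when the response is acceptable, else a failure message."""
    if status_code in EXPECTED_STATUSES:
        return None
    return f"Unexpected status: {status_code}"


# Inside a Locust task this would be used roughly like:
#
#   with self.client.get("/api/v1/products", catch_response=True) as response:
#       verdict = throttle_verdict(response.status_code)
#       if verdict:
#           response.failure(verdict)
#       elif response.status_code == 429:
#           response.success()  # count throttled requests as expected behavior
```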
What to measure during REST API performance testing
When you load test REST APIs, focus on these core metrics:
- Response time
  - Average, median, p95, and p99 latency
- Throughput
  - Requests per second
- Error rate
  - 4xx and 5xx responses
- Concurrency
  - Number of active users or requests in flight
- Endpoint-specific performance
  - Compare login, search, create, update, and delete operations separately
- Infrastructure behavior
  - CPU, memory, DB utilization, and connection pool usage
LoadForge’s real-time reporting helps you see these metrics as the test runs, making it easier to spot endpoint-level regressions and capacity limits.
Writing Your First Load Test
A good first REST API load test usually targets read-only endpoints. These are common, easy to validate, and often represent the majority of production traffic.
Let’s assume you have an e-commerce API with the following endpoints:
- GET /api/v1/health
- GET /api/v1/products
- GET /api/v1/products/{id}
- GET /api/v1/categories
This first Locust script simulates users browsing product data.
```python
from locust import HttpUser, task, between
import random


class RestApiBrowserUser(HttpUser):
    wait_time = between(1, 3)

    product_ids = [101, 102, 103, 104, 105]
    category_slugs = ["electronics", "books", "home", "fitness"]

    def on_start(self):
        self.client.headers.update({
            "Accept": "application/json",
            "User-Agent": "LoadForge-REST-API-Test/1.0"
        })

    @task(1)
    def health_check(self):
        with self.client.get("/api/v1/health", name="GET /health", catch_response=True) as response:
            if response.status_code != 200:
                response.failure(f"Health check failed: {response.status_code}")

    @task(5)
    def list_products(self):
        params = {
            "page": random.randint(1, 5),
            "limit": 20,
            "sort": "popularity"
        }
        with self.client.get("/api/v1/products", params=params, name="GET /products", catch_response=True) as response:
            if response.status_code != 200:
                response.failure(f"Unexpected status code: {response.status_code}")
                return
            data = response.json()
            if "items" not in data or not isinstance(data["items"], list):
                response.failure("Missing or invalid 'items' array")

    @task(3)
    def get_product_detail(self):
        product_id = random.choice(self.product_ids)
        with self.client.get(f"/api/v1/products/{product_id}", name="GET /products/:id", catch_response=True) as response:
            if response.status_code != 200:
                response.failure(f"Failed to fetch product {product_id}")
                return
            data = response.json()
            if "id" not in data or data["id"] != product_id:
                response.failure("Product response did not match requested ID")

    @task(2)
    def list_category_products(self):
        category = random.choice(self.category_slugs)
        params = {"category": category, "limit": 12}
        with self.client.get("/api/v1/products", params=params, name="GET /products?category", catch_response=True) as response:
            if response.status_code != 200:
                response.failure(f"Category listing failed for {category}")
```

What this script does
This script models a simple browsing user:
- It checks API health
- Lists products with pagination
- Fetches individual product details
- Filters products by category
The task weights make product listing more frequent than health checks, which is more realistic. It also validates response structure instead of only checking status codes.
Why this is a good starting point
For REST API load testing, this type of script helps you answer basic but important questions:
- Can the API serve read traffic at expected concurrency?
- Are list endpoints slower than detail endpoints?
- Do paginated endpoints degrade with larger datasets?
- Are there errors or malformed responses under load?
In LoadForge, you can run this script across distributed generators to simulate traffic from multiple regions and observe how latency and throughput change as user counts increase.
Advanced Load Testing Scenarios
Once you’ve validated basic read traffic, the next step is to simulate more realistic API behavior. Most REST APIs involve authentication, writes, filtering, and asynchronous processing.
Authenticated API load testing with JWT tokens
A common REST API pattern is logging in with email and password, receiving a bearer token, and then using that token for protected endpoints.
```python
from locust import HttpUser, task, between
import random


class AuthenticatedApiUser(HttpUser):
    wait_time = between(1, 2)

    users = [
        {"email": "qa.user1@example.com", "password": "TestPass123!"},
        {"email": "qa.user2@example.com", "password": "TestPass123!"},
        {"email": "qa.user3@example.com", "password": "TestPass123!"},
    ]

    def on_start(self):
        credentials = random.choice(self.users)
        login_payload = {
            "email": credentials["email"],
            "password": credentials["password"]
        }
        with self.client.post(
            "/api/v1/auth/login",
            json=login_payload,
            headers={"Content-Type": "application/json", "Accept": "application/json"},
            name="POST /auth/login",
            catch_response=True
        ) as response:
            if response.status_code != 200:
                response.failure(f"Login failed for {credentials['email']}: {response.status_code}")
                return
            data = response.json()
            token = data.get("access_token")
            if not token:
                response.failure("No access token returned")
                return
            self.client.headers.update({
                "Authorization": f"Bearer {token}",
                "Accept": "application/json",
                "Content-Type": "application/json"
            })

    @task(4)
    def get_profile(self):
        with self.client.get("/api/v1/users/me", name="GET /users/me", catch_response=True) as response:
            if response.status_code != 200:
                response.failure(f"Profile fetch failed: {response.status_code}")

    @task(3)
    def list_orders(self):
        params = {
            "status": random.choice(["pending", "shipped", "delivered"]),
            "limit": 10
        }
        with self.client.get("/api/v1/orders", params=params, name="GET /orders", catch_response=True) as response:
            if response.status_code != 200:
                response.failure(f"Order listing failed: {response.status_code}")

    @task(2)
    def get_order_detail(self):
        order_id = random.choice([5001, 5002, 5003, 5004])
        with self.client.get(f"/api/v1/orders/{order_id}", name="GET /orders/:id", catch_response=True) as response:
            if response.status_code not in [200, 404]:
                response.failure(f"Unexpected status for order {order_id}: {response.status_code}")
```

Why this scenario matters
Authenticated traffic often behaves differently from public traffic because it adds:
- Login overhead
- Token generation or validation
- User-specific database queries
- Permission checks
This type of performance testing is especially important for SaaS applications, customer portals, and mobile backends.
Testing CRUD operations with realistic write traffic
Read-heavy traffic is only part of the picture. Many APIs also need to handle cart updates, order creation, profile updates, or ticket submissions. Write endpoints often expose database bottlenecks more quickly than reads.
Here’s a realistic test for a task management REST API.
```python
from locust import HttpUser, task, between
import random
import uuid


class TaskApiUser(HttpUser):
    wait_time = between(1, 3)

    def on_start(self):
        auth_payload = {
            "email": "loadtest.manager@example.com",
            "password": "SecurePass456!"
        }
        response = self.client.post(
            "/api/v1/auth/login",
            json=auth_payload,
            headers={"Content-Type": "application/json", "Accept": "application/json"}
        )
        if response.status_code == 200:
            token = response.json().get("access_token")
            self.client.headers.update({
                "Authorization": f"Bearer {token}",
                "Accept": "application/json",
                "Content-Type": "application/json"
            })

    @task(4)
    def list_tasks(self):
        params = {
            "project_id": random.choice([2001, 2002, 2003]),
            "status": random.choice(["open", "in_progress", "done"]),
            "limit": 25
        }
        self.client.get("/api/v1/tasks", params=params, name="GET /tasks")

    @task(2)
    def create_task(self):
        task_id = str(uuid.uuid4())[:8]
        payload = {
            "project_id": random.choice([2001, 2002, 2003]),
            "title": f"Load test task {task_id}",
            "description": "Created during REST API performance testing with LoadForge",
            "priority": random.choice(["low", "medium", "high"]),
            "assignee_id": random.choice([301, 302, 303]),
            "tags": ["load-test", "api", "locust"]
        }
        with self.client.post("/api/v1/tasks", json=payload, name="POST /tasks", catch_response=True) as response:
            if response.status_code != 201:
                response.failure(f"Task creation failed: {response.status_code}")
                return
            data = response.json()
            created_id = data.get("id")
            if not created_id:
                response.failure("Created task response missing ID")
                return
            self.created_task_id = created_id

    @task(1)
    def update_task(self):
        task_id = getattr(self, "created_task_id", None)
        if not task_id:
            return
        payload = {
            "status": random.choice(["in_progress", "done"]),
            "priority": random.choice(["medium", "high"])
        }
        with self.client.patch(f"/api/v1/tasks/{task_id}", json=payload, name="PATCH /tasks/:id", catch_response=True) as response:
            if response.status_code not in [200, 204]:
                response.failure(f"Task update failed: {response.status_code}")

    @task(1)
    def delete_task(self):
        task_id = getattr(self, "created_task_id", None)
        if not task_id:
            return
        with self.client.delete(f"/api/v1/tasks/{task_id}", name="DELETE /tasks/:id", catch_response=True) as response:
            if response.status_code not in [200, 204]:
                response.failure(f"Task deletion failed: {response.status_code}")
            else:
                self.created_task_id = None
```

What this test reveals
This script is useful for stress testing and performance testing write-heavy endpoints because it simulates:
- Listing tasks with filters
- Creating records
- Updating records
- Deleting records
This can reveal issues like:
- Slow insert or update queries
- Transaction lock contention
- Insufficient database indexing
- Connection pool exhaustion
- Increased latency as write concurrency grows
Testing asynchronous REST workflows
Many modern REST APIs return a job ID for expensive operations such as report generation, exports, media processing, or bulk imports. These workflows should be load tested differently from simple request-response endpoints.
Below is a script for a reporting API that creates an export job and polls for completion.
```python
from locust import HttpUser, task, between
import random
import time


class ReportingApiUser(HttpUser):
    wait_time = between(2, 5)

    def on_start(self):
        self.client.headers.update({
            "Authorization": "Bearer reporting-load-test-token",
            "Accept": "application/json",
            "Content-Type": "application/json"
        })

    @task
    def generate_sales_report(self):
        payload = {
            "report_type": "sales_summary",
            "date_range": {
                "from": "2025-01-01",
                "to": "2025-01-31"
            },
            "filters": {
                "region": random.choice(["us-east", "us-west", "eu-central"]),
                "channel": random.choice(["web", "mobile", "partner"])
            },
            "format": "csv"
        }
        with self.client.post("/api/v1/reports", json=payload, name="POST /reports", catch_response=True) as response:
            if response.status_code != 202:
                response.failure(f"Report job not accepted: {response.status_code}")
                return
            job_id = response.json().get("job_id")
            if not job_id:
                response.failure("Missing job_id in report creation response")
                return

        # Poll for completion: up to 5 attempts, 2 seconds apart
        for _ in range(5):
            time.sleep(2)
            with self.client.get(f"/api/v1/reports/{job_id}", name="GET /reports/:job_id", catch_response=True) as poll_response:
                if poll_response.status_code != 200:
                    poll_response.failure(f"Polling failed for job {job_id}: {poll_response.status_code}")
                    return
                job_data = poll_response.json()
                status = job_data.get("status")
                if status == "completed":
                    download_url = job_data.get("download_url")
                    if not download_url:
                        poll_response.failure("Completed report missing download_url")
                    return
                if status == "failed":
                    poll_response.failure(f"Report job {job_id} failed")
                    return
```

Why asynchronous workflows need special treatment
If you only load test the initial POST /reports endpoint, you may miss the true backend cost. Polling and job completion behavior often reveal issues in:
- Worker queues
- Background job processors
- Export generation services
- Object storage integration
- Database reads over large datasets
This is a great use case for LoadForge because you can scale traffic gradually and observe not just request latency, but how the whole API system behaves as queued work accumulates.
Analyzing Your Results
After running a REST API load test in LoadForge, don’t just look at average response time. Averages can hide serious performance issues.
Focus on percentile latency
For REST APIs, p95 and p99 latency are often more meaningful than the average. For example:
- Average latency: 120ms
- p95 latency: 850ms
- p99 latency: 2200ms
This suggests most requests are fast, but a significant minority are slow enough to hurt user experience or trigger client timeouts.
Compare endpoints separately
Group results by endpoint name, such as:
- GET /products
- GET /products/:id
- POST /auth/login
- POST /tasks
- GET /reports/:job_id
This helps you pinpoint which API operations degrade first. In many systems, write endpoints and search endpoints become bottlenecks before simple reads.
Watch error patterns
Different error codes tell different stories:
- 400/422: Bad request or validation issues in the test script
- 401/403: Authentication or authorization problems
- 404: Missing test data or invalid IDs
- 429: Rate limiting or throttling
- 500/502/503/504: Server-side instability or downstream dependency failures
A rising 5xx rate under load is a strong signal that the API or its dependencies are hitting capacity limits.
Correlate with backend metrics
LoadForge gives you the traffic-side view, but you should also correlate with application monitoring:
- CPU and memory utilization
- Database query times
- Connection pool usage
- Cache hit rates
- Queue depth
- Error logs
If latency spikes coincide with database saturation or worker backlog growth, you’ve likely found the real bottleneck.
Evaluate scalability
A REST API scales well when increasing traffic produces roughly proportional throughput without dramatic latency growth. Warning signs include:
- Throughput flattening while users increase
- Rapid p95/p99 degradation
- Error rates rising sharply after a concurrency threshold
- Login or write endpoints slowing much faster than read endpoints
With LoadForge’s distributed testing and real-time reporting, you can run multiple scenarios and compare how your API behaves at different user counts, ramp-up rates, and geographic locations.
Performance Optimization Tips
Once your REST API load testing reveals bottlenecks, these are some of the most effective optimization areas to investigate.
Optimize database access
- Add indexes for frequently filtered fields
- Reduce N+1 query patterns
- Use pagination consistently
- Avoid returning oversized result sets
- Tune connection pools for expected concurrency
Improve caching
- Cache common read endpoints
- Cache expensive computed responses
- Use CDN or edge caching for public API resources where appropriate
- Monitor cache hit/miss ratios during tests
Reduce payload size
- Limit fields in list endpoints
- Avoid deeply nested responses unless necessary
- Support filtering and sparse fieldsets
- Compress responses with gzip or Brotli
Make authentication efficient
- Use stateless JWT validation where appropriate
- Cache token introspection results if safe
- Reduce repeated auth-related database lookups
Handle asynchronous work properly
- Move expensive processing to background jobs
- Return 202 Accepted for long-running tasks
- Scale worker pools independently from API servers
- Monitor queue latency under stress testing
Tune infrastructure
- Increase application worker counts where appropriate
- Review autoscaling thresholds
- Ensure load balancers and API gateways are not introducing bottlenecks
- Use LoadForge’s cloud-based infrastructure to test realistic regional traffic patterns before changing production capacity
Common Pitfalls to Avoid
Load testing REST APIs is straightforward in principle, but several common mistakes can make your results misleading.
Testing only one endpoint
A single endpoint test rarely reflects real traffic. Most APIs handle a mix of reads, writes, authentication, and background work. Build scenarios that resemble actual usage patterns.
Using unrealistic data
If every virtual user requests the same product ID or submits identical search terms, you may get artificially good cache performance. Use varied IDs, filters, and payloads.
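One way to get realistic cache behavior is to skew requests toward a small set of popular IDs while still touching a long tail. A minimal sketch, where the ID ranges and the 70/30 split are illustrative assumptions:

```python
import random

HOT_IDS = [101, 102, 103]          # frequently requested "popular" products
TAIL_IDS = list(range(104, 1000))  # long tail of rarely requested products


def pick_product_id() -> int:
    """Roughly 70% of traffic hits hot products, 30% the long tail."""
    if random.random() < 0.7:
        return random.choice(HOT_IDS)
    return random.choice(TAIL_IDS)
```

Swapping a uniform `random.choice` for a weighted picker like this brings cache hit rates closer to what production actually sees.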
Ignoring authentication flows
Protected APIs often spend significant time on auth and permission checks. If you skip login or token usage, your test may underestimate real-world latency.
Not validating responses
A 200 OK does not guarantee correctness. Always check response structure, expected fields, and business logic where possible.
Overloading the wrong environment
Load testing a developer sandbox with tiny infrastructure won’t tell you much about production readiness. Use an environment that resembles production in architecture and scale.
Forgetting test data cleanup
Write-heavy tests can create thousands of records. If you don’t clean up after tests, later runs may be polluted by stale or oversized datasets.
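A lightweight pattern is to record every ID a virtual user creates and delete those records when the user stops. The tracker below is a hypothetical helper; the Locust wiring is shown as comments because it needs a running user and a live API:

```python
class CreatedRecordTracker:
    """Remembers record IDs created during a test so they can be cleaned up."""

    def __init__(self):
        self._ids = []

    def remember(self, record_id):
        # Ignore missing IDs from failed or malformed create responses
        if record_id is not None:
            self._ids.append(record_id)

    def drain(self):
        """Return all tracked IDs and reset the tracker."""
        ids, self._ids = self._ids, []
        return ids


# In a Locust user (sketch):
#
#   def on_start(self):
#       self.tracker = CreatedRecordTracker()
#
#   # after a successful POST /tasks:
#   #   self.tracker.remember(response.json().get("id"))
#
#   def on_stop(self):
#       for task_id in self.tracker.drain():
#           self.client.delete(f"/api/v1/tasks/{task_id}",
#                              name="DELETE /tasks/:id (cleanup)")
```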
Ramping too aggressively
Jumping from 0 to 10,000 users instantly may create an unrealistic spike. In many cases, gradual ramp-up gives more useful insights into true capacity and failure points.
Misinterpreting rate limits
If your API gateway enforces throttling, 429 responses may be expected behavior rather than application failure. Make sure your test goals align with your rate-limit policies.
Conclusion
Load testing REST APIs is one of the most effective ways to improve reliability, scalability, and user experience before performance issues reach production. By testing realistic workflows like public reads, authenticated sessions, CRUD operations, and asynchronous jobs, you can uncover bottlenecks in your API layer, database, authentication system, and background workers.
With LoadForge, you can create realistic Locust-based REST API tests, run them at scale using distributed cloud-based infrastructure, monitor results in real time, and integrate performance testing into your CI/CD pipeline. If you want to measure latency, throughput, error rates, and scalability with confidence, now is the perfect time to try LoadForge and start load testing your REST APIs.
LoadForge Team
LoadForge is a load and performance testing platform built on Locust. Our team has been shipping load tests against production systems since 2018, and we write these guides from real customer engagements.