
Introduction
REST APIs are the backbone of modern applications. Whether you’re powering a mobile app, a single-page web application, third-party integrations, or internal microservices, your API needs to respond quickly and reliably under real-world traffic. A REST API that performs well with a handful of users can still struggle when hundreds or thousands of clients make concurrent requests, submit large payloads, or trigger expensive backend operations.
That’s why load testing REST APIs is essential. With effective load testing, performance testing, and stress testing, you can measure latency, throughput, error rates, and scalability before production traffic exposes weaknesses. You can identify bottlenecks in authentication flows, database-backed endpoints, search APIs, reporting jobs, and write-heavy operations.
LoadForge makes this process much easier by combining the flexibility of Locust with cloud-based infrastructure, distributed testing, real-time reporting, global test locations, and CI/CD integration. In this guide, you’ll learn how to build realistic REST API load tests using Locust scripts in LoadForge, starting with a basic read-only test and progressing to more advanced scenarios like authenticated workflows, CRUD operations, and asynchronous job polling.
Prerequisites
Before you start load testing a REST API with LoadForge, make sure you have the following:
- A REST API environment to test
  - Preferably a staging or pre-production environment
  - Avoid load testing production unless you have explicit safeguards in place
- API documentation
  - OpenAPI/Swagger specs are especially helpful
  - You should know endpoint paths, request bodies, headers, and expected status codes
- Test credentials
  - API keys, bearer tokens, OAuth client credentials, or test user accounts
- Representative test data
  - Product IDs, user accounts, order records, search terms, or other realistic payload inputs
- Performance goals
  - Example: p95 latency under 300ms for GET endpoints
  - Example: error rate below 1% at 500 requests per second
- A LoadForge account
  - So you can run distributed load tests, monitor real-time results, and compare runs over time
It also helps to decide what type of test you want to run:
- Load testing: Validate expected traffic levels
- Stress testing: Push beyond expected capacity to find breaking points
- Spike testing: Simulate sudden bursts of traffic
- Endurance testing: Run sustained traffic over time to detect memory leaks or resource exhaustion
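These test types differ mainly in how the user count changes over time. As an illustration, a spike profile can be expressed with Locust's LoadTestShape. The stage timings and user counts below are made-up values, and the schedule logic is factored into a plain function so you can inspect it without running a live test:

```python
from typing import Optional, Tuple

# Each stage: (end_time_seconds, users, spawn_rate). Values are illustrative.
SPIKE_STAGES = [
    (120, 50, 10),    # warm up to a 50-user baseline
    (180, 500, 100),  # sudden burst to 500 users
    (300, 50, 50),    # recover back to baseline
]


def spike_tick(run_time: float, stages) -> Optional[Tuple[int, int]]:
    """Return (users, spawn_rate) for the current run time, or None to stop."""
    for end_time, users, spawn_rate in stages:
        if run_time < end_time:
            return (users, spawn_rate)
    return None


try:
    # In a locustfile, wire the schedule into a LoadTestShape subclass.
    from locust import LoadTestShape

    class SpikeShape(LoadTestShape):
        def tick(self):
            return spike_tick(self.get_run_time(), SPIKE_STAGES)
except ImportError:
    pass  # locust is not needed just to inspect the schedule itself
```

Adjusting the stages list turns the same skeleton into a load, stress, or endurance profile.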
Understanding REST APIs Under Load
REST APIs can appear simple on the surface, but their behavior under load depends on several layers of the stack. A single GET /api/products request might involve authentication, rate limiting, cache lookups, database queries, serialization, and network overhead. When concurrency rises, these layers can become bottlenecks.
Common REST API bottlenecks
Authentication and authorization
JWT validation, OAuth token introspection, and permission checks can add measurable overhead. If every request requires expensive auth validation, performance can degrade quickly.
Database contention
Endpoints that create, update, or search records often hit relational or NoSQL databases. Under load, you may see:
- Slow queries
- Connection pool exhaustion
- Lock contention
- Increased transaction times
Serialization and payload size
JSON encoding and decoding are usually fast, but large nested payloads can increase CPU usage and response times, especially for list or reporting endpoints.
External service dependencies
REST APIs often call payment gateways, email providers, search engines, or internal microservices. These downstream dependencies can fail or slow down under traffic, causing cascading latency.
Caching behavior
Read-heavy APIs may perform well when cache hit rates are high, but degrade badly on cache misses. Load testing should include enough variation to expose realistic cache patterns.
Rate limiting and throttling
Some APIs intentionally reject or delay excess traffic. This is not always a failure, but you need to understand when 429 responses are expected versus when they indicate misconfiguration.
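In a Locust script you can make this distinction explicit by treating 429 as an expected outcome rather than a failure. A sketch, with the classification pulled into a plain helper; the expected-status set is an assumption, so adjust it to your own rate-limit policy:

```python
# Statuses we accept when the gateway is allowed to throttle.
# 429 here is intentional backpressure, not an application error.
EXPECTED_STATUSES = {200, 429}


def throttle_verdict(status_code: int):
    """Return None when the response is acceptable, else a failure message."""
    if status_code in EXPECTED_STATUSES:
        return None
    return f"Unexpected status: {status_code}"


# Inside a Locust task this would be used roughly like:
#
#   with self.client.get("/api/v1/products", catch_response=True) as response:
#       verdict = throttle_verdict(response.status_code)
#       if verdict:
#           response.failure(verdict)
#       elif response.status_code == 429:
#           response.success()  # count throttled requests as expected behavior
```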
What to measure during REST API performance testing
When you load test REST APIs, focus on these core metrics:
- Response time
  - Average, median, p95, and p99 latency
- Throughput
  - Requests per second
- Error rate
  - 4xx and 5xx responses
- Concurrency
  - Number of active users or requests in flight
- Endpoint-specific performance
  - Compare login, search, create, update, and delete operations separately
- Infrastructure behavior
  - CPU, memory, DB utilization, and connection pool usage
LoadForge’s real-time reporting helps you see these metrics as the test runs, making it easier to spot endpoint-level regressions and capacity limits.
Writing Your First Load Test
A good first REST API load test usually targets read-only endpoints. These are common, easy to validate, and often represent the majority of production traffic.
Let’s assume you have an e-commerce API with the following endpoints:
- GET /api/v1/health
- GET /api/v1/products
- GET /api/v1/products/{id}
- GET /api/v1/categories
This first Locust script simulates users browsing product data.
```python
from locust import HttpUser, task, between
import random


class RestApiBrowserUser(HttpUser):
    wait_time = between(1, 3)

    product_ids = [101, 102, 103, 104, 105]
    category_slugs = ["electronics", "books", "home", "fitness"]

    def on_start(self):
        self.client.headers.update({
            "Accept": "application/json",
            "User-Agent": "LoadForge-REST-API-Test/1.0"
        })

    @task(1)
    def health_check(self):
        with self.client.get("/api/v1/health", name="GET /health", catch_response=True) as response:
            if response.status_code != 200:
                response.failure(f"Health check failed: {response.status_code}")

    @task(5)
    def list_products(self):
        params = {
            "page": random.randint(1, 5),
            "limit": 20,
            "sort": "popularity"
        }
        with self.client.get("/api/v1/products", params=params, name="GET /products", catch_response=True) as response:
            if response.status_code != 200:
                response.failure(f"Unexpected status code: {response.status_code}")
                return
            data = response.json()
            if "items" not in data or not isinstance(data["items"], list):
                response.failure("Missing or invalid 'items' array")

    @task(3)
    def get_product_detail(self):
        product_id = random.choice(self.product_ids)
        with self.client.get(f"/api/v1/products/{product_id}", name="GET /products/:id", catch_response=True) as response:
            if response.status_code != 200:
                response.failure(f"Failed to fetch product {product_id}")
                return
            data = response.json()
            if "id" not in data or data["id"] != product_id:
                response.failure("Product response did not match requested ID")

    @task(2)
    def list_category_products(self):
        category = random.choice(self.category_slugs)
        params = {"category": category, "limit": 12}
        with self.client.get("/api/v1/products", params=params, name="GET /products?category", catch_response=True) as response:
            if response.status_code != 200:
                response.failure(f"Category listing failed for {category}")
```

What this script does
This script models a simple browsing user:
- It checks API health
- Lists products with pagination
- Fetches individual product details
- Filters products by category
The task weights make product listing more frequent than health checks, which is more realistic. It also validates response structure instead of only checking status codes.
Why this is a good starting point
For REST API load testing, this type of script helps you answer basic but important questions:
- Can the API serve read traffic at expected concurrency?
- Are list endpoints slower than detail endpoints?
- Do paginated endpoints degrade with larger datasets?
- Are there errors or malformed responses under load?
In LoadForge, you can run this script across distributed generators to simulate traffic from multiple regions and observe how latency and throughput change as user counts increase.
Advanced Load Testing Scenarios
Once you’ve validated basic read traffic, the next step is to simulate more realistic API behavior. Most REST APIs involve authentication, writes, filtering, and asynchronous processing.
Authenticated API load testing with JWT tokens
A common REST API pattern is logging in with email and password, receiving a bearer token, and then using that token for protected endpoints.
```python
from locust import HttpUser, task, between
import random


class AuthenticatedApiUser(HttpUser):
    wait_time = between(1, 2)

    users = [
        {"email": "qa.user1@example.com", "password": "TestPass123!"},
        {"email": "qa.user2@example.com", "password": "TestPass123!"},
        {"email": "qa.user3@example.com", "password": "TestPass123!"},
    ]

    def on_start(self):
        credentials = random.choice(self.users)
        login_payload = {
            "email": credentials["email"],
            "password": credentials["password"]
        }
        with self.client.post(
            "/api/v1/auth/login",
            json=login_payload,
            headers={"Content-Type": "application/json", "Accept": "application/json"},
            name="POST /auth/login",
            catch_response=True
        ) as response:
            if response.status_code != 200:
                response.failure(f"Login failed for {credentials['email']}: {response.status_code}")
                return
            data = response.json()
            token = data.get("access_token")
            if not token:
                response.failure("No access token returned")
                return
            self.client.headers.update({
                "Authorization": f"Bearer {token}",
                "Accept": "application/json",
                "Content-Type": "application/json"
            })

    @task(4)
    def get_profile(self):
        with self.client.get("/api/v1/users/me", name="GET /users/me", catch_response=True) as response:
            if response.status_code != 200:
                response.failure(f"Profile fetch failed: {response.status_code}")

    @task(3)
    def list_orders(self):
        params = {
            "status": random.choice(["pending", "shipped", "delivered"]),
            "limit": 10
        }
        with self.client.get("/api/v1/orders", params=params, name="GET /orders", catch_response=True) as response:
            if response.status_code != 200:
                response.failure(f"Order listing failed: {response.status_code}")

    @task(2)
    def get_order_detail(self):
        order_id = random.choice([5001, 5002, 5003, 5004])
        with self.client.get(f"/api/v1/orders/{order_id}", name="GET /orders/:id", catch_response=True) as response:
            if response.status_code not in [200, 404]:
                response.failure(f"Unexpected status for order {order_id}: {response.status_code}")
```

Why this scenario matters
Authenticated traffic often behaves differently from public traffic because it adds:
- Login overhead
- Token generation or validation
- User-specific database queries
- Permission checks
This type of performance testing is especially important for SaaS applications, customer portals, and mobile backends.
Testing CRUD operations with realistic write traffic
Read-heavy traffic is only part of the picture. Many APIs also need to handle cart updates, order creation, profile updates, or ticket submissions. Write endpoints often expose database bottlenecks more quickly than reads.
Here’s a realistic test for a task management REST API.
```python
from locust import HttpUser, task, between
import random
import uuid


class TaskApiUser(HttpUser):
    wait_time = between(1, 3)

    def on_start(self):
        auth_payload = {
            "email": "loadtest.manager@example.com",
            "password": "SecurePass456!"
        }
        response = self.client.post(
            "/api/v1/auth/login",
            json=auth_payload,
            headers={"Content-Type": "application/json", "Accept": "application/json"}
        )
        if response.status_code == 200:
            token = response.json().get("access_token")
            self.client.headers.update({
                "Authorization": f"Bearer {token}",
                "Accept": "application/json",
                "Content-Type": "application/json"
            })

    @task(4)
    def list_tasks(self):
        params = {
            "project_id": random.choice([2001, 2002, 2003]),
            "status": random.choice(["open", "in_progress", "done"]),
            "limit": 25
        }
        self.client.get("/api/v1/tasks", params=params, name="GET /tasks")

    @task(2)
    def create_task(self):
        task_id = str(uuid.uuid4())[:8]
        payload = {
            "project_id": random.choice([2001, 2002, 2003]),
            "title": f"Load test task {task_id}",
            "description": "Created during REST API performance testing with LoadForge",
            "priority": random.choice(["low", "medium", "high"]),
            "assignee_id": random.choice([301, 302, 303]),
            "tags": ["load-test", "api", "locust"]
        }
        with self.client.post("/api/v1/tasks", json=payload, name="POST /tasks", catch_response=True) as response:
            if response.status_code != 201:
                response.failure(f"Task creation failed: {response.status_code}")
                return
            data = response.json()
            created_id = data.get("id")
            if not created_id:
                response.failure("Created task response missing ID")
                return
            self.created_task_id = created_id

    @task(1)
    def update_task(self):
        task_id = getattr(self, "created_task_id", None)
        if not task_id:
            return
        payload = {
            "status": random.choice(["in_progress", "done"]),
            "priority": random.choice(["medium", "high"])
        }
        with self.client.patch(f"/api/v1/tasks/{task_id}", json=payload, name="PATCH /tasks/:id", catch_response=True) as response:
            if response.status_code not in [200, 204]:
                response.failure(f"Task update failed: {response.status_code}")

    @task(1)
    def delete_task(self):
        task_id = getattr(self, "created_task_id", None)
        if not task_id:
            return
        with self.client.delete(f"/api/v1/tasks/{task_id}", name="DELETE /tasks/:id", catch_response=True) as response:
            if response.status_code not in [200, 204]:
                response.failure(f"Task deletion failed: {response.status_code}")
            else:
                self.created_task_id = None
```

What this test reveals
This script is useful for stress testing and performance testing write-heavy endpoints because it simulates:
- Listing tasks with filters
- Creating records
- Updating records
- Deleting records
This can reveal issues like:
- Slow insert or update queries
- Transaction lock contention
- Insufficient database indexing
- Connection pool exhaustion
- Increased latency as write concurrency grows
Testing asynchronous REST workflows
Many modern REST APIs return a job ID for expensive operations such as report generation, exports, media processing, or bulk imports. These workflows should be load tested differently from simple request-response endpoints.
Below is a script for a reporting API that creates an export job and polls for completion.
```python
from locust import HttpUser, task, between
import random
import time


class ReportingApiUser(HttpUser):
    wait_time = between(2, 5)

    def on_start(self):
        self.client.headers.update({
            "Authorization": "Bearer reporting-load-test-token",
            "Accept": "application/json",
            "Content-Type": "application/json"
        })

    @task
    def generate_sales_report(self):
        payload = {
            "report_type": "sales_summary",
            "date_range": {
                "from": "2025-01-01",
                "to": "2025-01-31"
            },
            "filters": {
                "region": random.choice(["us-east", "us-west", "eu-central"]),
                "channel": random.choice(["web", "mobile", "partner"])
            },
            "format": "csv"
        }
        with self.client.post("/api/v1/reports", json=payload, name="POST /reports", catch_response=True) as response:
            if response.status_code != 202:
                response.failure(f"Report job not accepted: {response.status_code}")
                return
            job_id = response.json().get("job_id")
            if not job_id:
                response.failure("Missing job_id in report creation response")
                return

        # Poll for completion: up to 5 attempts, 2 seconds apart
        for _ in range(5):
            time.sleep(2)
            with self.client.get(f"/api/v1/reports/{job_id}", name="GET /reports/:job_id", catch_response=True) as poll_response:
                if poll_response.status_code != 200:
                    poll_response.failure(f"Polling failed for job {job_id}: {poll_response.status_code}")
                    return
                job_data = poll_response.json()
                status = job_data.get("status")
                if status == "completed":
                    download_url = job_data.get("download_url")
                    if not download_url:
                        poll_response.failure("Completed report missing download_url")
                    return
                if status == "failed":
                    poll_response.failure(f"Report job {job_id} failed")
                    return
```

Why asynchronous workflows need special treatment
If you only load test the initial POST /reports endpoint, you may miss the true backend cost. Polling and job completion behavior often reveal issues in:
- Worker queues
- Background job processors
- Export generation services
- Object storage integration
- Database reads over large datasets
This is a great use case for LoadForge because you can scale traffic gradually and observe not just request latency, but how the whole API system behaves as queued work accumulates.
Analyzing Your Results
After running a REST API load test in LoadForge, don’t just look at average response time. Averages can hide serious performance issues.
Focus on percentile latency
For REST APIs, p95 and p99 latency are often more meaningful than the average. For example:
- Average latency: 120ms
- p95 latency: 850ms
- p99 latency: 2200ms
This suggests most requests are fast, but a significant minority are slow enough to hurt user experience or trigger client timeouts.
Compare endpoints separately
Group results by endpoint name, such as:
- GET /products
- GET /products/:id
- POST /auth/login
- POST /tasks
- GET /reports/:job_id
This helps you pinpoint which API operations degrade first. In many systems, write endpoints and search endpoints become bottlenecks before simple reads.
Watch error patterns
Different error codes tell different stories:
- 400/422: Bad request or validation issues in the test script
- 401/403: Authentication or authorization problems
- 404: Missing test data or invalid IDs
- 429: Rate limiting or throttling
- 500/502/503/504: Server-side instability or downstream dependency failures
A rising 5xx rate under load is a strong signal that the API or its dependencies are hitting capacity limits.
Correlate with backend metrics
LoadForge gives you the traffic-side view, but you should also correlate with application monitoring:
- CPU and memory utilization
- Database query times
- Connection pool usage
- Cache hit rates
- Queue depth
- Error logs
If latency spikes coincide with database saturation or worker backlog growth, you’ve likely found the real bottleneck.
Evaluate scalability
A REST API scales well when increasing traffic produces roughly proportional throughput without dramatic latency growth. Warning signs include:
- Throughput flattening while users increase
- Rapid p95/p99 degradation
- Error rates rising sharply after a concurrency threshold
- Login or write endpoints slowing much faster than read endpoints
With LoadForge’s distributed testing and real-time reporting, you can run multiple scenarios and compare how your API behaves at different user counts, ramp-up rates, and geographic locations.
Performance Optimization Tips
Once your REST API load testing reveals bottlenecks, these are some of the most effective optimization areas to investigate.
Optimize database access
- Add indexes for frequently filtered fields
- Reduce N+1 query patterns
- Use pagination consistently
- Avoid returning oversized result sets
- Tune connection pools for expected concurrency
Improve caching
- Cache common read endpoints
- Cache expensive computed responses
- Use CDN or edge caching for public API resources where appropriate
- Monitor cache hit/miss ratios during tests
Reduce payload size
- Limit fields in list endpoints
- Avoid deeply nested responses unless necessary
- Support filtering and sparse fieldsets
- Compress responses with gzip or Brotli
Make authentication efficient
- Use stateless JWT validation where appropriate
- Cache token introspection results if safe
- Reduce repeated auth-related database lookups
Handle asynchronous work properly
- Move expensive processing to background jobs
- Return 202 Accepted for long-running tasks
- Scale worker pools independently from API servers
- Monitor queue latency under stress testing
Tune infrastructure
- Increase application worker counts where appropriate
- Review autoscaling thresholds
- Ensure load balancers and API gateways are not introducing bottlenecks
- Use LoadForge’s cloud-based infrastructure to test realistic regional traffic patterns before changing production capacity
Common Pitfalls to Avoid
Load testing REST APIs is straightforward in principle, but several common mistakes can make your results misleading.
Testing only one endpoint
A single endpoint test rarely reflects real traffic. Most APIs handle a mix of reads, writes, authentication, and background work. Build scenarios that resemble actual usage patterns.
Using unrealistic data
If every virtual user requests the same product ID or submits identical search terms, you may get artificially good cache performance. Use varied IDs, filters, and payloads.
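One way to get realistic cache behavior is to skew requests toward a small set of popular IDs while still touching a long tail. A minimal sketch, where the ID ranges and the 70/30 split are illustrative assumptions:

```python
import random

HOT_IDS = [101, 102, 103]          # frequently requested "popular" products
TAIL_IDS = list(range(104, 1000))  # long tail of rarely requested products


def pick_product_id() -> int:
    """Roughly 70% of traffic hits hot products, 30% the long tail."""
    if random.random() < 0.7:
        return random.choice(HOT_IDS)
    return random.choice(TAIL_IDS)
```

Swapping a uniform `random.choice` for a weighted picker like this brings cache hit rates closer to what production actually sees.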
Ignoring authentication flows
Protected APIs often spend significant time on auth and permission checks. If you skip login or token usage, your test may underestimate real-world latency.
Not validating responses
A 200 OK does not guarantee correctness. Always check response structure, expected fields, and business logic where possible.
Overloading the wrong environment
Load testing a developer sandbox with tiny infrastructure won’t tell you much about production readiness. Use an environment that resembles production in architecture and scale.
Forgetting test data cleanup
Write-heavy tests can create thousands of records. If you don’t clean up after tests, later runs may be polluted by stale or oversized datasets.
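A lightweight pattern is to record every ID a virtual user creates and delete those records when the user stops. The tracker below is a hypothetical helper; the Locust wiring is shown as comments because it needs a running user and a live API:

```python
class CreatedRecordTracker:
    """Remembers record IDs created during a test so they can be cleaned up."""

    def __init__(self):
        self._ids = []

    def remember(self, record_id):
        # Ignore missing IDs from failed or malformed create responses
        if record_id is not None:
            self._ids.append(record_id)

    def drain(self):
        """Return all tracked IDs and reset the tracker."""
        ids, self._ids = self._ids, []
        return ids


# In a Locust user (sketch):
#
#   def on_start(self):
#       self.tracker = CreatedRecordTracker()
#
#   # after a successful POST /tasks:
#   #   self.tracker.remember(response.json().get("id"))
#
#   def on_stop(self):
#       for task_id in self.tracker.drain():
#           self.client.delete(f"/api/v1/tasks/{task_id}",
#                              name="DELETE /tasks/:id (cleanup)")
```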
Ramping too aggressively
Jumping from 0 to 10,000 users instantly may create an unrealistic spike. In many cases, gradual ramp-up gives more useful insights into true capacity and failure points.
Misinterpreting rate limits
If your API gateway enforces throttling, 429 responses may be expected behavior rather than application failure. Make sure your test goals align with your rate-limit policies.
Conclusion
Load testing REST APIs is one of the most effective ways to improve reliability, scalability, and user experience before performance issues reach production. By testing realistic workflows like public reads, authenticated sessions, CRUD operations, and asynchronous jobs, you can uncover bottlenecks in your API layer, database, authentication system, and background workers.
With LoadForge, you can create realistic Locust-based REST API tests, run them at scale using distributed cloud-based infrastructure, monitor results in real time, and integrate performance testing into your CI/CD pipeline. If you want to measure latency, throughput, error rates, and scalability with confidence, now is the perfect time to try LoadForge and start load testing your REST APIs.
LoadForge Team
LoadForge is a load and performance testing platform built on Locust. Our team has been shipping load tests against production systems since 2018, and we write these guides from real customer engagements.