
Introduction
Cloudflare Workers are designed to run code at the edge, close to your users, which makes them a powerful choice for APIs, request routing, personalization, authentication gateways, and lightweight backend logic. But edge computing does not eliminate the need for load testing. In fact, because Cloudflare Workers often sit on the critical path for every request, performance testing becomes even more important.
A Cloudflare Workers load testing strategy helps you answer practical questions:
- How does your Worker behave under global traffic spikes?
- What latency do users see from different regions?
- Does your Worker hit CPU, subrequest, or external API bottlenecks?
- How does authentication, caching, or KV-backed logic perform under sustained load?
- Can your Worker maintain low response times during stress testing?
In this guide, you’ll learn how to load test Cloudflare Workers using LoadForge and Locust. We’ll cover basic endpoint testing, authenticated API traffic, cache-aware scenarios, and Worker flows that depend on external services or Cloudflare KV-like access patterns. Along the way, we’ll show realistic Python scripts you can run on LoadForge’s cloud-based infrastructure, using distributed testing and global test locations to simulate real-world edge traffic.
Prerequisites
Before you start load testing Cloudflare Workers, make sure you have:
- A deployed Cloudflare Worker endpoint, such as:
https://api.example.workers.dev or a custom domain like https://edge-api.example.com
- Access to the Worker routes you want to test
- Any required API keys, bearer tokens, or signed headers
- A LoadForge account
- Basic familiarity with:
- HTTP methods and status codes
- Cloudflare Workers routing
- Locust test scripts in Python
It also helps to know whether your Worker uses:
- Cloudflare KV
- Durable Objects
- R2
- Cache API
- External origin fetches
- Third-party APIs
- JWT or API token authentication
That context will help you design realistic load testing and performance testing scenarios instead of only testing a simple hello-world route.
Understanding Cloudflare Workers Under Load
Cloudflare Workers execute in a distributed edge environment, which changes how you should think about performance testing compared to a traditional monolithic app.
What makes Cloudflare Workers different
Workers are optimized for:
- Fast startup
- Execution near end users
- Lightweight request handling
- Massive concurrency across a global network
However, Workers still have practical limits and bottlenecks.
Common bottlenecks in Cloudflare Workers
External fetch latency
Many Workers call upstream APIs, origin servers, or microservices. Even if the Worker itself is fast, total response time may be dominated by:
- slow origin APIs
- TLS handshake overhead
- regional differences
- upstream rate limits
CPU-intensive logic
Workers are excellent for lightweight compute, but heavy JSON transformations, encryption, image manipulation, or large payload processing can increase execution time under load.
KV and Durable Object access
If your Worker reads feature flags, sessions, user preferences, or counters from KV or Durable Objects, performance can vary depending on:
- read/write patterns
- eventual consistency expectations
- hot key contention
- object serialization overhead
Cache behavior
A Worker may be fast on cache hits and much slower on cache misses. Your load testing should reflect both paths.
Authentication overhead
JWT validation, HMAC signatures, token introspection, or calls to identity providers can become a hidden latency source under stress testing.
Metrics to watch when load testing Cloudflare Workers
When running a Cloudflare Workers load test, focus on:
- Requests per second
- Average response time
- P95 and P99 latency
- Error rate
- Timeouts
- Response size
- Latency by region
- Performance differences between cached and uncached routes
LoadForge is especially useful here because you can run distributed testing from multiple regions and compare behavior across geographies with real-time reporting.
Writing Your First Load Test
Let’s start with a simple Cloudflare Worker that exposes a health endpoint and a lightweight API route.
Assume your Worker has these routes:
GET /health
GET /api/v1/config
GET /api/v1/products?region=us
This first test validates availability and baseline latency.
Basic Cloudflare Workers load test
from locust import HttpUser, task, between

class CloudflareWorkerUser(HttpUser):
    wait_time = between(1, 3)

    @task(3)
    def health_check(self):
        with self.client.get("/health", name="GET /health", catch_response=True) as response:
            if response.status_code != 200:
                response.failure(f"Unexpected status code: {response.status_code}")

    @task(2)
    def get_config(self):
        with self.client.get("/api/v1/config", name="GET /api/v1/config", catch_response=True) as response:
            if response.status_code != 200:
                response.failure(f"Config endpoint failed: {response.status_code}")
            elif "environment" not in response.text:
                response.failure("Missing expected config content")

    @task(5)
    def list_products(self):
        params = {"region": "us"}
        with self.client.get("/api/v1/products", params=params, name="GET /api/v1/products", catch_response=True) as response:
            if response.status_code != 200:
                response.failure(f"Products request failed: {response.status_code}")
            else:
                data = response.json()
                if "products" not in data:
                    response.failure("Missing products array in response")

How to use this script
In LoadForge, set the host to your Worker domain, for example:
https://api.example.workers.dev
This test gives you a baseline for:
- basic route responsiveness
- Worker availability
- simple JSON response validation
Why this matters
A basic test is useful for catching:
- route misconfigurations
- Worker deployment issues
- unexpected 5xx errors
- regressions in edge execution latency
Before moving to advanced scenarios, confirm that your Worker can sustain expected traffic on simple routes.
Advanced Load Testing Scenarios
Real Cloudflare Workers often do more than return static JSON. They authenticate requests, fetch upstream data, personalize responses, and process writes. The following scenarios are more realistic for production performance testing.
Scenario 1: Authenticated API traffic with bearer tokens
A common Cloudflare Workers pattern is to place authentication and API routing at the edge. Suppose your Worker exposes:
POST /api/v1/auth/login
GET /api/v1/user/profile
GET /api/v1/user/orders?limit=20
This script logs in once per user and uses the token for subsequent requests.
from locust import HttpUser, task, between

class AuthenticatedWorkerUser(HttpUser):
    wait_time = between(1, 2)
    token = None

    def on_start(self):
        payload = {
            "email": "loadtest.user@example.com",
            "password": "SuperSecure123!"
        }
        with self.client.post(
            "/api/v1/auth/login",
            json=payload,
            name="POST /api/v1/auth/login",
            catch_response=True
        ) as response:
            if response.status_code != 200:
                response.failure(f"Login failed: {response.status_code}")
                return
            data = response.json()
            self.token = data.get("access_token")
            if not self.token:
                response.failure("No access_token returned")

    @task(3)
    def get_profile(self):
        if not self.token:
            return  # skip authenticated calls if login failed
        headers = {"Authorization": f"Bearer {self.token}"}
        with self.client.get(
            "/api/v1/user/profile",
            headers=headers,
            name="GET /api/v1/user/profile",
            catch_response=True
        ) as response:
            if response.status_code != 200:
                response.failure(f"Profile request failed: {response.status_code}")

    @task(5)
    def get_orders(self):
        if not self.token:
            return  # skip authenticated calls if login failed
        headers = {"Authorization": f"Bearer {self.token}"}
        params = {"limit": 20}
        with self.client.get(
            "/api/v1/user/orders",
            headers=headers,
            params=params,
            name="GET /api/v1/user/orders",
            catch_response=True
        ) as response:
            if response.status_code != 200:
                response.failure(f"Orders request failed: {response.status_code}")
            else:
                data = response.json()
                if "orders" not in data:
                    response.failure("Missing orders in response")

What this test reveals
This authenticated Cloudflare Workers load testing scenario helps you evaluate:
- token issuance latency
- auth middleware overhead
- JWT validation cost
- performance of user-specific routes
- impact of upstream identity or session checks
If your Worker integrates with an auth provider or validates signed tokens on every request, this is where performance bottlenecks often appear.
Scenario 2: Cache-aware edge API testing
Cloudflare Workers are frequently used to implement edge caching. For example, your Worker may proxy product catalog data and cache responses by region and category.
Routes:
GET /api/v1/catalog?region=us&category=shoes
GET /api/v1/catalog?region=eu&category=jackets
GET /api/v1/catalog?region=apac&category=accessories
This script varies request parameters to simulate a realistic cache hit/miss mix.
from locust import HttpUser, task, between
import random

class CachedCatalogWorkerUser(HttpUser):
    wait_time = between(0.5, 1.5)
    regions = ["us", "eu", "apac"]
    categories = ["shoes", "jackets", "accessories", "bags", "watches"]

    @task(8)
    def get_catalog(self):
        region = random.choice(self.regions)
        category = random.choice(self.categories)
        headers = {
            "Accept": "application/json",
            "CF-Device-Type": random.choice(["desktop", "mobile"])
        }
        params = {
            "region": region,
            "category": category,
            "limit": 24
        }
        with self.client.get(
            "/api/v1/catalog",
            params=params,
            headers=headers,
            name="GET /api/v1/catalog",
            catch_response=True
        ) as response:
            if response.status_code != 200:
                response.failure(f"Catalog request failed: {response.status_code}")
                return
            try:
                data = response.json()
                if "items" not in data:
                    response.failure("Missing items in catalog response")
            except Exception as e:
                response.failure(f"Invalid JSON response: {e}")

    @task(2)
    def warm_common_catalog(self):
        params = {
            "region": "us",
            "category": "shoes",
            "limit": 24
        }
        with self.client.get(
            "/api/v1/catalog",
            params=params,
            name="GET /api/v1/catalog (hot cache)",
            catch_response=True
        ) as response:
            if response.status_code != 200:
                response.failure(f"Hot cache request failed: {response.status_code}")

Why this scenario matters
This kind of performance testing helps you compare:
- hot-cache versus cold-cache response times
- regional variability
- query parameter explosion effects
- cache key design issues
- Worker logic overhead before and after cache lookup
If your P95 latency is low for hot requests but spikes on randomized requests, you may have a cache efficiency problem rather than a Worker execution problem.
Scenario 3: Write-heavy API with idempotency and upstream processing
Not all Cloudflare Workers are read-heavy. Many act as edge gateways for forms, checkout requests, event ingestion, or webhook processing.
Assume your Worker exposes:
POST /api/v1/events/ingest
POST /api/v1/orders/checkout
GET /api/v1/orders/status/{order_id}
This example simulates a realistic event ingestion and checkout workflow with idempotency headers.
from locust import HttpUser, task, between
import random
import uuid

class TransactionalWorkerUser(HttpUser):
    wait_time = between(1, 2)

    @task(6)
    def ingest_event(self):
        event_id = str(uuid.uuid4())
        payload = {
            "event_id": event_id,
            "event_type": random.choice(["page_view", "add_to_cart", "checkout_started"]),
            "user_id": f"user_{random.randint(1000, 9999)}",
            "session_id": str(uuid.uuid4()),
            "timestamp": "2026-04-06T12:00:00Z",
            "properties": {
                "path": random.choice(["/products/sku-1001", "/cart", "/checkout"]),
                "region": random.choice(["us", "eu", "apac"]),
                "device": random.choice(["mobile", "desktop"])
            }
        }
        headers = {
            "Content-Type": "application/json",
            "X-API-Key": "lf_demo_worker_ingest_key"
        }
        with self.client.post(
            "/api/v1/events/ingest",
            json=payload,
            headers=headers,
            name="POST /api/v1/events/ingest",
            catch_response=True
        ) as response:
            if response.status_code not in [200, 202]:
                response.failure(f"Event ingest failed: {response.status_code}")

    @task(2)
    def checkout_order(self):
        idempotency_key = str(uuid.uuid4())
        payload = {
            "customer_id": f"cust_{random.randint(10000, 99999)}",
            "currency": "USD",
            "items": [
                {"sku": "sku-1001", "quantity": 1, "unit_price": 79.99},
                {"sku": "sku-2045", "quantity": 2, "unit_price": 24.50}
            ],
            "shipping_address": {
                "country": "US",
                "postal_code": "94107",
                "city": "San Francisco"
            }
        }
        headers = {
            "Authorization": "Bearer demo_checkout_token",
            "Idempotency-Key": idempotency_key,
            "Content-Type": "application/json"
        }
        with self.client.post(
            "/api/v1/orders/checkout",
            json=payload,
            headers=headers,
            name="POST /api/v1/orders/checkout",
            catch_response=True
        ) as response:
            if response.status_code not in [200, 201, 202]:
                response.failure(f"Checkout failed: {response.status_code}")
                return
            try:
                data = response.json()
                order_id = data.get("order_id")
                if not order_id:
                    response.failure("Checkout response missing order_id")
                    return
                self.client.get(
                    f"/api/v1/orders/status/{order_id}",
                    headers={"Authorization": "Bearer demo_checkout_token"},
                    name="GET /api/v1/orders/status"
                )
            except Exception as e:
                response.failure(f"Invalid checkout response: {e}")

What this test is good for
This stress testing scenario is useful when your Worker:
- validates and transforms request bodies
- forwards writes to upstream APIs
- uses queues or async processing
- enforces idempotency
- handles payment or order orchestration at the edge
This is often where you discover that the Worker itself is fine, but the upstream service behind it cannot keep up.
Analyzing Your Results
Once your Cloudflare Workers load test is running in LoadForge, focus on the metrics that best reflect edge performance.
Response time percentiles
Average latency is useful, but percentiles tell the real story.
Look closely at:
- P50 for typical user experience
- P95 for degraded but common slowdowns
- P99 for tail latency and edge-case performance
A Worker with a 70 ms average but 900 ms P99 may still feel unreliable to many users.
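To see why averages hide tail pain, here is a small stdlib-only sketch with synthetic latencies; the numbers are invented for illustration, not measured from a real Worker:

```python
import random
import statistics

def percentile(values, pct):
    """Nearest-rank percentile of a list of latencies (ms)."""
    ordered = sorted(values)
    k = max(0, min(len(ordered) - 1, round(pct / 100 * len(ordered)) - 1))
    return ordered[k]

# Synthetic edge latencies: most requests are fast, a small tail is slow.
random.seed(42)
latencies = [random.uniform(40, 90) for _ in range(980)] + \
            [random.uniform(600, 1200) for _ in range(20)]

print(f"avg: {statistics.mean(latencies):.0f} ms")
print(f"p50: {percentile(latencies, 50):.0f} ms")
print(f"p95: {percentile(latencies, 95):.0f} ms")
print(f"p99: {percentile(latencies, 99):.0f} ms")
```

With only 2% of requests in the slow tail, the average and P50 stay low while P99 lands in the hundreds of milliseconds, which is exactly the pattern the percentile view is meant to expose.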
Error rates
Track:
- 4xx responses from auth or validation issues
- 5xx responses from Worker exceptions
- 502/503/504 responses from upstream failures
- timeouts under high concurrency
If errors increase sharply after a certain user count, you may have found a scalability threshold.
Endpoint-level comparison
Separate metrics by route name:
GET /health
GET /api/v1/catalog
POST /api/v1/auth/login
POST /api/v1/orders/checkout
This helps you identify whether the slowdown is isolated to:
- authenticated routes
- cache-miss paths
- write-heavy endpoints
- upstream-dependent operations
Regional insights
Cloudflare Workers are global by design, so load testing from one location is not enough. Use LoadForge’s global test locations and distributed testing to compare:
- North America latency
- Europe latency
- Asia-Pacific latency
If one region performs significantly worse, the issue may be:
- origin placement
- third-party API geography
- cache locality
- DNS or TLS overhead
- regional routing differences
Throughput versus latency
As concurrency rises, note when:
- requests per second flatten
- latency starts climbing sharply
- errors begin to appear
That inflection point is often your practical capacity limit for the current Worker design and upstream architecture.
LoadForge’s real-time reporting makes it easier to observe these changes while the test is running, rather than waiting until the end.
Performance Optimization Tips
If your Cloudflare Workers load testing reveals bottlenecks, these are the first areas to review.
Reduce upstream calls
Every external fetch adds latency and increases failure risk. Where possible:
- cache upstream responses
- batch requests
- avoid duplicate fetches per request
- precompute common responses
Optimize cache strategy
Make sure your cache keys are not overly fragmented. Too many query parameter combinations can destroy cache hit rates.
Review:
- cache key normalization
- TTL values
- region-specific caching
- whether personalized content is bypassing cache unnecessarily
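In a Worker, cache key normalization would live in JavaScript before the cache lookup, but the idea is easy to sketch in Python. The allow-list, parameter names, and URLs below are hypothetical:

```python
from urllib.parse import urlsplit, parse_qsl, urlencode

# Hypothetical allow-list: only params that actually change the response.
ALLOWED_PARAMS = {"region", "category", "limit"}

def normalize_cache_key(url):
    """Collapse equivalent URLs onto one cache key by dropping unknown
    params, lowercasing names and values, and sorting what remains."""
    parts = urlsplit(url)
    params = sorted(
        (k.lower(), v.lower())
        for k, v in parse_qsl(parts.query)
        if k.lower() in ALLOWED_PARAMS
    )
    return f"{parts.path}?{urlencode(params)}"

variants = [
    "/api/v1/catalog?region=US&category=shoes",
    "/api/v1/catalog?category=shoes&region=us&utm_source=mail",
    "/api/v1/catalog?Region=us&category=SHOES&fbclid=abc123",
]
keys = {normalize_cache_key(u) for u in variants}
print(keys)  # one entry: /api/v1/catalog?category=shoes&region=us
```

Without normalization, those three URLs would occupy three cache entries and two of them would always miss; with it, tracking parameters and ordering differences stop fragmenting the cache.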
Keep Worker logic lightweight
Minimize:
- large object transformations
- unnecessary JSON parsing/serialization
- repeated crypto operations
- expensive regex or string processing
Reuse authentication work when possible
If your Worker performs token introspection or remote auth checks, consider:
- local JWT validation
- caching auth metadata briefly
- reducing repeated identity lookups
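The "cache auth metadata briefly" idea is language-agnostic. In a Worker you would hold this in memory or KV with a short TTL; the Python sketch below only illustrates the shape of the logic and is not Workers API code:

```python
import time

class TTLCache:
    """Minimal TTL cache sketch for short-lived auth metadata,
    e.g. token introspection results held for a minute."""

    def __init__(self, ttl_seconds=60):
        self.ttl = ttl_seconds
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # expired: evict and miss
            return None
        return value

    def set(self, key, value):
        self._store[key] = (value, time.monotonic() + self.ttl)
```

Even a short TTL can turn one identity-provider round trip per request into one per minute per token, which matters when auth sits on every request path.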
Tune payload sizes
Large request and response bodies increase edge processing time and network transfer costs. Compress or trim unnecessary fields where possible.
Test with realistic traffic patterns
A good load testing plan includes:
- normal traffic
- burst traffic
- sustained traffic
- geographically distributed traffic
- cache-hit and cache-miss mixes
LoadForge is especially helpful here because you can model realistic traffic patterns with cloud-based infrastructure and integrate tests into CI/CD pipelines to catch regressions before deployment.
Common Pitfalls to Avoid
Cloudflare Workers performance testing is straightforward in principle, but teams often make a few avoidable mistakes.
Testing only a single route
A /health endpoint is useful, but it does not represent production behavior. Include real business-critical routes.
Ignoring authentication
Auth logic can add significant overhead. If your real users authenticate, your load test should too.
Not modeling cache behavior
Testing only hot cache or only cold cache gives an incomplete picture. Include both.
Forgetting upstream dependencies
A Worker may appear slow when the real bottleneck is:
- your origin API
- your auth provider
- a database-backed service
- a third-party API
Design tests that help isolate these dependencies.
Using unrealistic payloads
Tiny mock payloads can hide performance issues. Use realistic request bodies, headers, and query parameters.
Running from one geography only
Cloudflare Workers are edge-native, so global performance matters. Use multiple regions to understand real user experience.
Overlooking rate limits and protections
Cloudflare security rules, WAF settings, or API rate limits may affect test traffic. Make sure your load test environment is configured appropriately so you measure application performance, not blocked traffic.
Conclusion
Cloudflare Workers can deliver excellent edge performance, but only if you validate how they behave under real traffic conditions. Effective load testing helps you understand latency, throughput, cache efficiency, authentication overhead, and upstream dependency limits before users feel the impact.
With LoadForge, you can run realistic Cloudflare Workers load testing at scale using Locust-based scripts, distributed testing, global test locations, real-time reporting, and CI/CD integration. That makes it much easier to move from guesswork to measurable performance testing and stress testing.
If you’re ready to validate your Cloudflare Workers under production-like load, try LoadForge and start building tests that reflect how your edge applications actually run.
LoadForge Team
LoadForge is a load and performance testing platform built on Locust. Our team has been shipping load tests against production systems since 2018, and we write these guides from real customer engagements.