
Introduction
Performance budgets are one of the most effective ways to prevent gradual slowdowns from reaching production. In a modern CI/CD pipeline, teams often validate functionality, run unit tests, scan for vulnerabilities, and deploy automatically—but performance regressions still slip through because they are not enforced with the same discipline as other quality gates.
That is where load testing becomes essential. By defining clear latency and error-rate thresholds for critical user journeys, you can stop releases that exceed acceptable performance limits before customers feel the impact. With LoadForge, you can run cloud-based load testing as part of your delivery pipeline, execute distributed tests from global test locations, and review real-time reporting to determine whether a build should pass or fail.
In this guide, you will learn how to enforce performance budgets with load testing in CI/CD using Locust-based scripts on LoadForge. We will cover how performance budgets work under load, how to write realistic tests for key application flows, how to analyze the results, and how to turn those results into actionable release gates.
Prerequisites
Before you start, make sure you have:
- A LoadForge account
- A web application or API with stable test or staging endpoints
- A list of critical transactions to protect with performance budgets
- Authentication credentials for your test environment
- A CI/CD platform such as GitHub Actions, GitLab CI, Jenkins, or CircleCI
- Agreement from your team on acceptable thresholds, such as:
  - p95 latency under 500 ms for login
  - p95 latency under 800 ms for product search
  - error rate below 1%
  - checkout completion under 2 seconds at expected concurrency
You should also identify which endpoints matter most to the business. Good candidates include:
- /api/v1/auth/login
- /api/v1/products/search
- /api/v1/cart/items
- /api/v1/checkout/submit
- /api/v1/reports/daily-sales
The goal is not to load test every endpoint equally. Instead, focus your performance testing on the transactions that define user experience and operational risk.
Understanding Performance Budgets Under Load
A performance budget is a measurable limit that your application must stay within. In the context of load testing and CI/CD, budgets usually cover:
- Response time thresholds, such as p95 or p99 latency
- Maximum error rate
- Throughput targets
- Resource-specific expectations for key workflows
When traffic increases, applications often fail in predictable ways:
- Authentication services become slow due to token generation or database lookups
- Search endpoints degrade because of inefficient queries or cache misses
- Cart and checkout APIs suffer from lock contention or downstream payment latency
- Reporting endpoints trigger expensive database operations
- File or JSON-heavy payloads increase serialization overhead
These issues may not appear in single-user testing. They show up when concurrent users hit the same system at once, which is why stress testing and load testing are critical.
For CI/CD, the most useful approach is to define budgets around realistic, repeatable traffic patterns. For example:
- At 50 concurrent users, login p95 must remain below 400 ms
- At 100 concurrent users, product search p95 must remain below 700 ms
- At 25 checkout requests per minute, error rate must remain below 0.5%
LoadForge is well-suited for this because it gives you distributed testing, cloud-based infrastructure, and real-time reporting that can be tied directly into deployment decisions.
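Concurrency-specific budgets like these are easiest to enforce when they are written down as data rather than prose. A minimal sketch of that idea follows; the rule structure, names, and numbers are illustrative, not a LoadForge or Locust API:

```python
# Illustrative budget rules: a metric limit per business-critical transaction.
BUDGETS = [
    {"name": "POST /api/v1/auth/login",      "metric": "p95_ms",     "limit": 400.0},
    {"name": "GET /api/v1/products/search",  "metric": "p95_ms",     "limit": 700.0},
    {"name": "POST /api/v1/checkout/submit", "metric": "error_rate", "limit": 0.005},
]


def check_budgets(results, budgets=BUDGETS):
    """Compare measured results against budgets and return any violations.

    `results` maps transaction name -> {"p95_ms": ..., "error_rate": ...}.
    """
    violations = []
    for rule in budgets:
        measured = results.get(rule["name"], {}).get(rule["metric"])
        if measured is not None and measured > rule["limit"]:
            violations.append(
                f'{rule["name"]}: {rule["metric"]} {measured} exceeds {rule["limit"]}'
            )
    return violations
```

A CI job can then fail the build whenever `check_budgets` returns a non-empty list, which keeps the pass/fail decision mechanical instead of a judgment call after each run.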
Writing Your First Load Test
Let’s start with a basic Locust script that tests a login flow and a health check endpoint. This is a simple but practical first step for enforcing a performance budget in CI/CD.
Basic CI/CD budget validation test
```python
from locust import HttpUser, task, between
import os


class PerformanceBudgetUser(HttpUser):
    wait_time = between(1, 3)

    def on_start(self):
        self.email = os.getenv("TEST_USER_EMAIL", "qa.user@example.com")
        self.password = os.getenv("TEST_USER_PASSWORD", "SuperSecurePassword123!")

    @task(3)
    def health_check(self):
        with self.client.get(
            "/health",
            name="GET /health",
            catch_response=True
        ) as response:
            if response.status_code != 200:
                response.failure(f"Health check failed with status {response.status_code}")

    @task(1)
    def login(self):
        payload = {
            "email": self.email,
            "password": self.password,
            "rememberMe": False
        }
        headers = {
            "Content-Type": "application/json",
            "X-Client-Version": "ci-budget-check-1.0"
        }
        with self.client.post(
            "/api/v1/auth/login",
            json=payload,
            headers=headers,
            name="POST /api/v1/auth/login",
            catch_response=True
        ) as response:
            if response.status_code != 200:
                response.failure(f"Login failed with status {response.status_code}")
                return
            data = response.json()
            if "accessToken" not in data:
                response.failure("Login response missing accessToken")
```

This script is useful for a basic release gate because it validates:
- Application responsiveness through /health
- Authentication performance through /api/v1/auth/login
- Correctness of the response, not just HTTP status
In LoadForge, you can configure user count, spawn rate, and runtime to simulate your expected CI validation load. For example, a quick pipeline check might use:
- 20 users
- 5 users/sec spawn rate
- 3-minute runtime
That is enough to catch obvious regressions without making the pipeline too slow.
Example CI/CD invocation
If you run Locust headless in automation, the command might look like this:
```shell
locust -f locustfile.py \
  --host=https://staging.example-store.com \
  --headless \
  --users 20 \
  --spawn-rate 5 \
  --run-time 3m
```

In LoadForge, you would typically configure these settings in the platform and trigger the test from your CI/CD workflow rather than manually running Locust yourself.
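If you do run Locust yourself in a pipeline, adding `--csv=budget_check` makes it write aggregated stats to `budget_check_stats.csv`, including a `95%` percentile column in recent Locust versions. A small gate script can then parse that file and fail the job. The sketch below assumes that CSV layout and uses illustrative budget values; older Locust releases used slightly different column headers:

```python
import csv

# Per-transaction p95 budgets in milliseconds (illustrative values).
P95_BUDGETS_MS = {
    "GET /health": 200.0,
    "POST /api/v1/auth/login": 400.0,
}


def p95_violations(stats_csv_path, budgets=P95_BUDGETS_MS):
    """Read a Locust --csv stats file and return p95 budget violations.

    Assumes the "Name" and "95%" columns written by recent Locust versions.
    """
    violations = []
    with open(stats_csv_path, newline="") as f:
        for row in csv.DictReader(f):
            limit = budgets.get(row["Name"])
            if limit is None:
                continue  # transaction has no budget defined
            p95 = float(row["95%"])
            if p95 > limit:
                violations.append(f'{row["Name"]}: p95 {p95} ms > {limit} ms')
    return violations
```

Exiting non-zero when the returned list is non-empty turns the budget into a hard pipeline gate.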
Advanced Load Testing Scenarios
Once your basic performance budget checks are working, you should expand them to cover the user journeys that matter most. Below are more realistic scenarios for CI/CD and DevOps teams enforcing release gates.
Scenario 1: Authenticated browsing with search performance budgets
Many teams care less about homepage speed than they do about search and authenticated browsing. This script logs in, stores the bearer token, and tests a realistic search endpoint with filters and pagination.
```python
from locust import HttpUser, task, between
import os
import random


class AuthenticatedSearchUser(HttpUser):
    wait_time = between(1, 2)

    def on_start(self):
        self.email = os.getenv("TEST_USER_EMAIL", "qa.user@example.com")
        self.password = os.getenv("TEST_USER_PASSWORD", "SuperSecurePassword123!")
        self.access_token = None
        self.login()

    def login(self):
        payload = {
            "email": self.email,
            "password": self.password
        }
        response = self.client.post(
            "/api/v1/auth/login",
            json=payload,
            headers={"Content-Type": "application/json"},
            name="POST /api/v1/auth/login"
        )
        if response.status_code == 200:
            self.access_token = response.json().get("accessToken")

    @task(4)
    def search_products(self):
        if not self.access_token:
            self.login()
        if not self.access_token:
            return
        search_terms = ["laptop", "wireless mouse", "usb-c hub", "monitor", "keyboard"]
        categories = ["electronics", "accessories", "office"]
        sort_options = ["relevance", "price_asc", "rating"]
        params = {
            "q": random.choice(search_terms),
            "category": random.choice(categories),
            "sort": random.choice(sort_options),
            "page": random.randint(1, 3),
            "pageSize": 20,
            "inStock": "true"
        }
        headers = {
            "Authorization": f"Bearer {self.access_token}",
            "Accept": "application/json"
        }
        with self.client.get(
            "/api/v1/products/search",
            params=params,
            headers=headers,
            name="GET /api/v1/products/search",
            catch_response=True
        ) as response:
            if response.status_code != 200:
                response.failure(f"Search failed with status {response.status_code}")
                return
            data = response.json()
            if "items" not in data:
                response.failure("Search response missing items array")
```

This type of test is ideal for performance budgets such as:
- p95 search latency under 700 ms
- error rate below 1%
- no authentication failures under moderate load
Because it uses realistic query parameters and bearer token authentication, it is much closer to production behavior than a simple static endpoint test.
Scenario 2: Cart and checkout workflow budget enforcement
If you want to stop releases that hurt revenue, protect the cart and checkout flow. This example simulates a user adding an item to a cart and submitting checkout.
```python
from locust import HttpUser, task, between, SequentialTaskSet
import os
import random
import uuid


class CheckoutFlow(SequentialTaskSet):
    def on_start(self):
        self.user.access_token = None
        self.user.cart_id = None
        self.login()

    def login(self):
        payload = {
            "email": os.getenv("CHECKOUT_USER_EMAIL", "buyer@example.com"),
            "password": os.getenv("CHECKOUT_USER_PASSWORD", "CheckoutPassword123!")
        }
        response = self.client.post(
            "/api/v1/auth/login",
            json=payload,
            headers={"Content-Type": "application/json"},
            name="POST /api/v1/auth/login"
        )
        if response.status_code == 200:
            self.user.access_token = response.json().get("accessToken")

    @task
    def create_cart(self):
        headers = {
            "Authorization": f"Bearer {self.user.access_token}",
            "Content-Type": "application/json"
        }
        with self.client.post(
            "/api/v1/cart",
            json={"currency": "USD"},
            headers=headers,
            name="POST /api/v1/cart",
            catch_response=True
        ) as response:
            if response.status_code != 201:
                response.failure(f"Cart creation failed: {response.status_code}")
                return
            self.user.cart_id = response.json().get("cartId")
            if not self.user.cart_id:
                response.failure("Missing cartId in create cart response")

    @task
    def add_item_to_cart(self):
        product_ids = ["sku-100245", "sku-100312", "sku-100489"]
        headers = {
            "Authorization": f"Bearer {self.user.access_token}",
            "Content-Type": "application/json"
        }
        payload = {
            "productId": random.choice(product_ids),
            "quantity": random.randint(1, 2)
        }
        with self.client.post(
            f"/api/v1/cart/{self.user.cart_id}/items",
            json=payload,
            headers=headers,
            name="POST /api/v1/cart/{cartId}/items",
            catch_response=True
        ) as response:
            if response.status_code != 200:
                response.failure(f"Add to cart failed: {response.status_code}")

    @task
    def submit_checkout(self):
        headers = {
            "Authorization": f"Bearer {self.user.access_token}",
            "Content-Type": "application/json",
            "Idempotency-Key": str(uuid.uuid4())
        }
        payload = {
            "cartId": self.user.cart_id,
            "shippingAddress": {
                "firstName": "QA",
                "lastName": "User",
                "address1": "123 Test Street",
                "city": "Austin",
                "state": "TX",
                "postalCode": "78701",
                "country": "US"
            },
            "paymentMethod": {
                "type": "test_card",
                "cardToken": "tok_visa_4242"
            },
            "shippingMethod": "standard"
        }
        with self.client.post(
            "/api/v1/checkout/submit",
            json=payload,
            headers=headers,
            name="POST /api/v1/checkout/submit",
            catch_response=True
        ) as response:
            if response.status_code not in [200, 201]:
                response.failure(f"Checkout failed: {response.status_code}")
                return
            data = response.json()
            if "orderId" not in data:
                response.failure("Checkout response missing orderId")
                return
        self.interrupt()


class CheckoutUser(HttpUser):
    wait_time = between(1, 2)
    tasks = [CheckoutFlow]
```

This is a strong candidate for release gating because checkout performance directly affects conversion. You might enforce budgets like:
- p95 add-to-cart latency under 500 ms
- p95 checkout latency under 1500 ms
- checkout error rate below 0.5%
Scenario 3: Database-heavy reporting endpoint with custom latency budget checks
Some releases do not break customer-facing pages, but they degrade internal dashboards or analytics APIs. Those regressions still matter, especially for operations teams and enterprise customers.
This example tests a reporting endpoint with date filters and validates that responses stay under a strict latency budget.
```python
from locust import HttpUser, task, between
import os
import time

REPORT_BUDGET_MS = 1200


class ReportingUser(HttpUser):
    wait_time = between(2, 5)

    def on_start(self):
        self.access_token = None
        self.login()

    def login(self):
        payload = {
            "email": os.getenv("REPORT_USER_EMAIL", "analyst@example.com"),
            "password": os.getenv("REPORT_USER_PASSWORD", "ReportingPassword123!")
        }
        response = self.client.post(
            "/api/v1/auth/login",
            json=payload,
            headers={"Content-Type": "application/json"},
            name="POST /api/v1/auth/login"
        )
        if response.status_code == 200:
            self.access_token = response.json().get("accessToken")

    @task
    def fetch_daily_sales_report(self):
        if not self.access_token:
            self.login()
        if not self.access_token:
            return
        headers = {
            "Authorization": f"Bearer {self.access_token}",
            "Accept": "application/json"
        }
        params = {
            "startDate": "2026-04-01",
            "endDate": "2026-04-06",
            "region": "north-america",
            "groupBy": "day",
            "includeRefunds": "true"
        }
        start = time.perf_counter()
        with self.client.get(
            "/api/v1/reports/daily-sales",
            params=params,
            headers=headers,
            name="GET /api/v1/reports/daily-sales",
            catch_response=True
        ) as response:
            elapsed_ms = (time.perf_counter() - start) * 1000
            if response.status_code != 200:
                response.failure(f"Report request failed: {response.status_code}")
                return
            if elapsed_ms > REPORT_BUDGET_MS:
                response.failure(
                    f"Report exceeded budget: {elapsed_ms:.2f} ms > {REPORT_BUDGET_MS} ms"
                )
                return
            data = response.json()
            if "totals" not in data or "rows" not in data:
                response.failure("Invalid report payload structure")
```

This pattern is useful when you want the test itself to mark requests as failures if they exceed a hard budget, even if the server technically returns HTTP 200.
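You can take this one step further and fail the whole Locust process when the aggregate numbers breach a budget. Locust's quitting event lets a listener set a non-zero process exit code, which a headless CI run then reports as a failed job. A sketch follows; the budget numbers are examples, and the pure helper is separated out so it can be tested without Locust installed:

```python
def exceeds_budget(p95_ms, fail_ratio, p95_budget_ms=1200.0, max_fail_ratio=0.01):
    # Pure budget check so the logic can be unit tested outside of Locust.
    return p95_ms > p95_budget_ms or fail_ratio > max_fail_ratio


try:
    from locust import events

    @events.quitting.add_listener
    def enforce_budget(environment, **kwargs):
        # Runs once when the test stops; a non-zero exit code fails the CI job.
        total = environment.stats.total
        p95 = total.get_response_time_percentile(0.95) or 0
        if exceeds_budget(p95, total.fail_ratio):
            environment.process_exit_code = 1
except ImportError:
    # locust is only required when this file actually runs as a locustfile.
    pass
```

Because the gate lives in the locustfile itself, the same budget applies whether the test is triggered locally, from CI, or from a scheduled run.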
Analyzing Your Results
After running your test in LoadForge, focus on a few key metrics rather than trying to interpret everything at once.
Response time percentiles
Average latency can hide serious issues. Use percentiles instead:
- p50 shows typical performance
- p95 shows the experience of slower users
- p99 reveals tail latency and instability
For performance budgets, p95 is often the best release gate metric. If your budget says checkout p95 must be under 1500 ms, a result of 2100 ms should block the release even if the average is acceptable.
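The difference between these percentiles is easy to see with a quick sketch that computes them from raw latency samples using the nearest-rank method (production tools typically use more refined estimators):

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest value with at least pct% of samples at or below it."""
    ordered = sorted(samples)
    rank = math.ceil(pct / 100 * len(ordered))
    return ordered[max(rank - 1, 0)]


# 94 fast requests plus a handful of slow outliers.
latencies_ms = [120] * 94 + [900, 950, 1000, 1100, 2000, 2100]
print(percentile(latencies_ms, 50))  # -> 120, typical request
print(percentile(latencies_ms, 95))  # -> 900, slower users
print(percentile(latencies_ms, 99))  # -> 2000, tail latency
```

The mean of this sample sits near 180 ms, which looks healthy; the p95 and p99 values are what actually reveal the outliers your slowest users experience.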
Error rate
Track both HTTP errors and logical failures from your Locust assertions. A response with status 200 can still be a failure if:
- the token is missing
- the cart ID is not returned
- the payload is malformed
- the response exceeds a defined budget
This is why catch_response=True is so valuable in Locust-based load testing.
Throughput and request distribution
Make sure your system is handling the intended request volume. If throughput is lower than expected, you may be saturating an application tier, database, or external dependency.
Endpoint-specific regressions
A release may improve one endpoint while degrading another. LoadForge’s real-time reporting helps you isolate which transaction names are failing, such as:
- POST /api/v1/auth/login
- GET /api/v1/products/search
- POST /api/v1/checkout/submit
This is especially important when using CI/CD integration, because you need clear pass/fail criteria tied to specific business-critical transactions.
Trend comparisons across builds
The real value of performance budgets comes from consistency over time. Compare current test runs to previous builds and look for:
- rising p95 latency
- growing error rates
- reduced throughput at the same concurrency
- increased variance in response times
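A lightweight way to automate this is to keep the headline metrics from each build and diff the newest run against the previous one. The sketch below uses hypothetical metric names and tolerances:

```python
# Tolerances before a metric change counts as a regression (illustrative values).
TOLERANCES = {
    "p95_ms": 1.10,     # allow up to 10% slower before flagging
    "error_rate": 1.0,  # any increase in error rate is flagged
}


def find_regressions(previous, current):
    """Compare two builds' summary metrics and list suspected regressions."""
    regressions = []
    for metric, factor in TOLERANCES.items():
        if current[metric] > previous[metric] * factor:
            regressions.append(f"{metric}: {previous[metric]} -> {current[metric]}")
    # Throughput is handled separately because lower, not higher, is worse.
    if current["rps"] < previous["rps"] * 0.90:
        regressions.append(f"rps: {previous['rps']} -> {current['rps']}")
    return regressions
```

Storing one small metrics snapshot per build is enough to make this comparison part of the pipeline rather than a manual review.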
With LoadForge, teams can centralize these results and use cloud-based infrastructure to keep test execution consistent across environments.
Performance Optimization Tips
When your load testing reveals budget violations, these are common fixes:
Optimize authentication paths
- Cache session or token validation where appropriate
- Reduce unnecessary user profile lookups during login
- Review password hashing configuration for staging realism
Improve search performance
- Add or tune database indexes
- Cache common search queries
- Avoid expensive wildcard or multi-join queries
- Limit payload size for list endpoints
Streamline cart and checkout
- Reduce synchronous calls to payment, tax, or shipping providers
- Use queues for non-critical post-checkout work
- Ensure cart tables are indexed for user and cart ID lookups
- Watch for lock contention on inventory updates
Tune reporting endpoints
- Pre-aggregate data where possible
- Use read replicas for analytics traffic
- Paginate large result sets
- Move expensive computations out of request time
Right-size your infrastructure
If budgets fail only at higher concurrency, you may need:
- more application instances
- better database connection pooling
- autoscaling adjustments
- CDN or edge caching for repeated requests
LoadForge’s distributed testing can help verify whether these improvements work under realistic traffic from multiple regions.
Common Pitfalls to Avoid
Using unrealistic test data
If every simulated user searches for the exact same term or checks out the same item, your caches may make the system look faster than it really is. Use varied but controlled data.
Testing only happy paths
A real system handles expired tokens, empty carts, and large result sets. Include realistic edge cases in your performance testing strategy.
Relying on averages
Average response time is not a reliable performance budget metric. Always gate on p95 or p99 for critical flows.
Ignoring downstream dependencies
Your application may be fast, but a payment provider, identity service, or database replica may be the real bottleneck. Design tests that expose dependency-related slowdowns.
Running tests that are too small
A 2-user smoke test is not enough to enforce meaningful budgets. Even in CI/CD, use enough concurrency to reveal contention and queuing.
Making the pipeline too slow
At the same time, avoid turning every commit into a 45-minute stress testing job. A good pattern is:
- quick budget checks on every pull request
- medium load tests on merges to main
- larger stress testing runs nightly or before release
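One way to keep this tiering explicit is to derive the load profile from the pipeline trigger, so nobody accidentally runs the nightly stress profile on every commit. A sketch with hypothetical profile values:

```python
# Hypothetical load profiles per pipeline trigger (users, spawn rate, run time).
PROFILES = {
    "pull_request": {"users": 20, "spawn_rate": 5, "run_time": "3m"},
    "merge_to_main": {"users": 100, "spawn_rate": 10, "run_time": "10m"},
    "nightly": {"users": 500, "spawn_rate": 25, "run_time": "45m"},
}


def profile_for(trigger):
    """Pick a load profile for a CI trigger, defaulting to the cheapest check."""
    return PROFILES.get(trigger, PROFILES["pull_request"])
```

The CI workflow can pass its event name into this lookup and feed the resulting values to the test run, keeping the tiering policy in one reviewable place.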
Not defining pass/fail criteria in advance
If the team debates acceptable latency after every test run, the budget is not enforceable. Define thresholds before adding them to the pipeline.
Conclusion
Performance budgets turn load testing from an occasional exercise into an enforceable quality standard. Instead of discovering regressions after deployment, you can catch them in CI/CD and stop releases that exceed latency or error-rate thresholds.
With Locust-based scripting, realistic user flows, and clear budget definitions, you can protect login, search, checkout, reporting, and other critical transactions with confidence. LoadForge makes this process easier with distributed testing, real-time reporting, CI/CD integration, cloud-based infrastructure, and global test locations that help teams validate performance at scale.
If you are ready to make performance a true release gate, try LoadForge and start enforcing performance budgets in your pipeline today.
LoadForge Team
LoadForge is a load and performance testing platform built on Locust. Our team has been shipping load tests against production systems since 2018, and we write these guides from real customer engagements.