How to Enforce Performance Budgets with Load Testing

Introduction

Performance budgets are one of the most effective ways to prevent gradual slowdowns from reaching production. In a modern CI/CD pipeline, teams often validate functionality, run unit tests, scan for vulnerabilities, and deploy automatically—but performance regressions still slip through because they are not enforced with the same discipline as other quality gates.

That is where load testing becomes essential. By defining clear latency and error-rate thresholds for critical user journeys, you can stop releases that exceed acceptable performance limits before customers feel the impact. With LoadForge, you can run cloud-based load testing as part of your delivery pipeline, execute distributed tests from global test locations, and review real-time reporting to determine whether a build should pass or fail.

In this guide, you will learn how to enforce performance budgets with load testing in CI/CD using Locust-based scripts on LoadForge. We will cover how performance budgets work under load, how to write realistic tests for key application flows, how to analyze the results, and how to turn those results into actionable release gates.

Prerequisites

Before you start, make sure you have:

  • A LoadForge account
  • A web application or API with stable test or staging endpoints
  • A list of critical transactions to protect with performance budgets
  • Authentication credentials for your test environment
  • A CI/CD platform such as GitHub Actions, GitLab CI, Jenkins, or CircleCI
  • Agreement from your team on acceptable thresholds, such as:
    • p95 latency under 500 ms for login
    • p95 latency under 800 ms for product search
    • error rate below 1%
    • checkout completion under 2 seconds at expected concurrency

You should also identify which endpoints matter most to the business. Good candidates include:

  • /api/v1/auth/login
  • /api/v1/products/search
  • /api/v1/cart/items
  • /api/v1/checkout/submit
  • /api/v1/reports/daily-sales

The goal is not to load test every endpoint equally. Instead, focus your performance testing on the transactions that define user experience and operational risk.

Understanding Performance Budgets Under Load

A performance budget is a measurable limit that your application must stay within. In the context of load testing and CI/CD, budgets usually cover:

  • Response time thresholds, such as p95 or p99 latency
  • Maximum error rate
  • Throughput targets
  • Resource-specific expectations for key workflows

When traffic increases, applications often fail in predictable ways:

  • Authentication services become slow due to token generation or database lookups
  • Search endpoints degrade because of inefficient queries or cache misses
  • Cart and checkout APIs suffer from lock contention or downstream payment latency
  • Reporting endpoints trigger expensive database operations
  • File or JSON-heavy payloads increase serialization overhead

These issues may not appear in single-user testing. They show up when concurrent users hit the same system at once, which is why stress testing and load testing are critical.

For CI/CD, the most useful approach is to define budgets around realistic, repeatable traffic patterns. For example:

  • At 50 concurrent users, login p95 must remain below 400 ms
  • At 100 concurrent users, product search p95 must remain below 700 ms
  • At 25 checkout requests per minute, error rate must remain below 0.5%

LoadForge is well-suited for this because it gives you distributed testing, cloud-based infrastructure, and real-time reporting that can be tied directly into deployment decisions.
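
One practical way to keep these definitions unambiguous is to record each budget as data that both the team and the pipeline can read. A minimal sketch, with illustrative transaction names and thresholds mirroring the examples above:

```python
# Illustrative budgets keyed by transaction name. "users" documents the
# concurrency at which each budget applies; thresholds are examples only.
PERFORMANCE_BUDGETS = {
    "POST /api/v1/auth/login": {"users": 50, "p95_ms": 400, "max_error_rate": 0.01},
    "GET /api/v1/products/search": {"users": 100, "p95_ms": 700, "max_error_rate": 0.01},
    "POST /api/v1/checkout/submit": {"users": 25, "p95_ms": 1500, "max_error_rate": 0.005},
}

def violates_budget(name, p95_ms, error_rate):
    """Return a human-readable violation message, or None if within budget."""
    budget = PERFORMANCE_BUDGETS.get(name)
    if budget is None:
        return None  # no budget defined for this transaction
    if p95_ms > budget["p95_ms"]:
        return f"{name}: p95 {p95_ms:.0f} ms exceeds {budget['p95_ms']} ms"
    if error_rate > budget["max_error_rate"]:
        return f"{name}: error rate {error_rate:.2%} exceeds {budget['max_error_rate']:.2%}"
    return None
```

Because the budgets live in one place, the same definitions can drive both documentation and automated pass/fail checks.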

Writing Your First Load Test

Let’s start with a basic Locust script that tests a login flow and a health check endpoint. This is a simple but practical first step for enforcing a performance budget in CI/CD.

Basic CI/CD budget validation test

python
from locust import HttpUser, task, between
import os
 
class PerformanceBudgetUser(HttpUser):
    wait_time = between(1, 3)
 
    def on_start(self):
        self.email = os.getenv("TEST_USER_EMAIL", "qa.user@example.com")
        self.password = os.getenv("TEST_USER_PASSWORD", "SuperSecurePassword123!")
 
    @task(3)
    def health_check(self):
        with self.client.get(
            "/health",
            name="GET /health",
            catch_response=True
        ) as response:
            if response.status_code != 200:
                response.failure(f"Health check failed with status {response.status_code}")
 
    @task(1)
    def login(self):
        payload = {
            "email": self.email,
            "password": self.password,
            "rememberMe": False
        }
 
        headers = {
            "Content-Type": "application/json",
            "X-Client-Version": "ci-budget-check-1.0"
        }
 
        with self.client.post(
            "/api/v1/auth/login",
            json=payload,
            headers=headers,
            name="POST /api/v1/auth/login",
            catch_response=True
        ) as response:
            if response.status_code != 200:
                response.failure(f"Login failed with status {response.status_code}")
                return
 
            try:
                data = response.json()
            except ValueError:
                response.failure("Login response was not valid JSON")
                return

            if "accessToken" not in data:
                response.failure("Login response missing accessToken")

This script is useful for a basic release gate because it validates:

  • Application responsiveness through /health
  • Authentication performance through /api/v1/auth/login
  • Correctness of the response, not just HTTP status

In LoadForge, you can configure user count, spawn rate, and runtime to simulate your expected CI validation load. For example, a quick pipeline check might use:

  • 20 users
  • 5 users/sec spawn rate
  • 3-minute runtime

That is enough to catch obvious regressions without making the pipeline too slow.

Example CI/CD invocation

If you run Locust headless in automation, the command might look like this:

bash
locust -f locustfile.py \
  --host=https://staging.example-store.com \
  --headless \
  --users 20 \
  --spawn-rate 5 \
  --run-time 3m

In LoadForge, you would typically configure these settings in the platform and trigger the test from your CI/CD workflow rather than manually running Locust yourself.
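
If you do drive Locust yourself in a pipeline, its CSV output can become a hard pass/fail step. The sketch below assumes Locust was started with `--csv results`, which writes a `results_stats.csv` whose columns include `Name`, `Request Count`, `Failure Count`, and a `95%` percentile column (exact column names vary between Locust versions, so verify against your own output):

```python
import csv
import io

# Hypothetical budgets keyed by the request names used in the Locust script.
BUDGETS = {
    "POST /api/v1/auth/login": {"p95_ms": 400, "max_error_rate": 0.01},
    "GET /health": {"p95_ms": 200, "max_error_rate": 0.01},
}

def find_violations(rows, budgets):
    """Check each stats row against its budget; return a list of messages."""
    violations = []
    for row in rows:
        budget = budgets.get(row["Name"])
        if budget is None:
            continue  # no budget defined for this transaction
        requests = int(row["Request Count"])
        failures = int(row["Failure Count"])
        p95 = float(row["95%"])
        error_rate = failures / requests if requests else 0.0
        if p95 > budget["p95_ms"]:
            violations.append(f'{row["Name"]}: p95 {p95:.0f} ms > {budget["p95_ms"]} ms')
        if error_rate > budget["max_error_rate"]:
            violations.append(f'{row["Name"]}: error rate {error_rate:.2%} > {budget["max_error_rate"]:.2%}')
    return violations

# Inline sample standing in for a real results_stats.csv file.
sample_csv = """Name,Request Count,Failure Count,95%
POST /api/v1/auth/login,100,3,650
GET /health,300,0,120
"""
violations = find_violations(list(csv.DictReader(io.StringIO(sample_csv))), BUDGETS)
for v in violations:
    print(f"BUDGET VIOLATION: {v}")
# In CI, exit non-zero to fail the build: sys.exit(1 if violations else 0)
```

Exiting non-zero on any violation is what turns the budget into an actual release gate rather than a report someone has to remember to read.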

Advanced Load Testing Scenarios

Once your basic performance budget checks are working, you should expand them to cover the user journeys that matter most. Below are more realistic scenarios for CI/CD and DevOps teams enforcing release gates.

Scenario 1: Authenticated browsing with search performance budgets

Many teams care less about homepage speed than they do about search and authenticated browsing. This script logs in, stores the bearer token, and tests a realistic search endpoint with filters and pagination.

python
from locust import HttpUser, task, between
import os
import random
 
class AuthenticatedSearchUser(HttpUser):
    wait_time = between(1, 2)
 
    def on_start(self):
        self.email = os.getenv("TEST_USER_EMAIL", "qa.user@example.com")
        self.password = os.getenv("TEST_USER_PASSWORD", "SuperSecurePassword123!")
        self.access_token = None
        self.login()
 
    def login(self):
        payload = {
            "email": self.email,
            "password": self.password
        }
 
        response = self.client.post(
            "/api/v1/auth/login",
            json=payload,
            headers={"Content-Type": "application/json"},
            name="POST /api/v1/auth/login"
        )
 
        if response.status_code == 200:
            self.access_token = response.json().get("accessToken")
 
    @task(4)
    def search_products(self):
        if not self.access_token:
            self.login()
            if not self.access_token:
                return
 
        search_terms = ["laptop", "wireless mouse", "usb-c hub", "monitor", "keyboard"]
        categories = ["electronics", "accessories", "office"]
        sort_options = ["relevance", "price_asc", "rating"]
 
        params = {
            "q": random.choice(search_terms),
            "category": random.choice(categories),
            "sort": random.choice(sort_options),
            "page": random.randint(1, 3),
            "pageSize": 20,
            "inStock": "true"
        }
 
        headers = {
            "Authorization": f"Bearer {self.access_token}",
            "Accept": "application/json"
        }
 
        with self.client.get(
            "/api/v1/products/search",
            params=params,
            headers=headers,
            name="GET /api/v1/products/search",
            catch_response=True
        ) as response:
            if response.status_code != 200:
                response.failure(f"Search failed with status {response.status_code}")
                return
 
            data = response.json()
            if "items" not in data:
                response.failure("Search response missing items array")

This type of test is ideal for performance budgets such as:

  • p95 search latency under 700 ms
  • error rate below 1%
  • no authentication failures under moderate load

Because it uses realistic query parameters and bearer token authentication, it is much closer to production behavior than a simple static endpoint test.

Scenario 2: Cart and checkout workflow budget enforcement

If you want to stop releases that hurt revenue, protect the cart and checkout flow. This example simulates a user adding an item to a cart and submitting checkout.

python
from locust import HttpUser, task, between, SequentialTaskSet
import os
import random
import uuid
 
class CheckoutFlow(SequentialTaskSet):
    def on_start(self):
        self.user.access_token = None
        self.user.cart_id = None
        self.login()
 
    def login(self):
        payload = {
            "email": os.getenv("CHECKOUT_USER_EMAIL", "buyer@example.com"),
            "password": os.getenv("CHECKOUT_USER_PASSWORD", "CheckoutPassword123!")
        }
 
        response = self.client.post(
            "/api/v1/auth/login",
            json=payload,
            headers={"Content-Type": "application/json"},
            name="POST /api/v1/auth/login"
        )
 
        if response.status_code == 200:
            self.user.access_token = response.json().get("accessToken")
 
    @task
    def create_cart(self):
        headers = {
            "Authorization": f"Bearer {self.user.access_token}",
            "Content-Type": "application/json"
        }
 
        with self.client.post(
            "/api/v1/cart",
            json={"currency": "USD"},
            headers=headers,
            name="POST /api/v1/cart",
            catch_response=True
        ) as response:
            if response.status_code != 201:
                response.failure(f"Cart creation failed: {response.status_code}")
                return
 
            self.user.cart_id = response.json().get("cartId")
            if not self.user.cart_id:
                response.failure("Missing cartId in create cart response")
 
    @task
    def add_item_to_cart(self):
        if not self.user.cart_id:
            return  # cart creation failed; skip dependent steps
        product_ids = ["sku-100245", "sku-100312", "sku-100489"]
        headers = {
            "Authorization": f"Bearer {self.user.access_token}",
            "Content-Type": "application/json"
        }
 
        payload = {
            "productId": random.choice(product_ids),
            "quantity": random.randint(1, 2)
        }
 
        with self.client.post(
            f"/api/v1/cart/{self.user.cart_id}/items",
            json=payload,
            headers=headers,
            name="POST /api/v1/cart/{cartId}/items",
            catch_response=True
        ) as response:
            if response.status_code != 200:
                response.failure(f"Add to cart failed: {response.status_code}")
 
    @task
    def submit_checkout(self):
        if not self.user.cart_id:
            return  # cart creation failed; skip dependent steps
        headers = {
            "Authorization": f"Bearer {self.user.access_token}",
            "Content-Type": "application/json",
            "Idempotency-Key": str(uuid.uuid4())
        }
 
        payload = {
            "cartId": self.user.cart_id,
            "shippingAddress": {
                "firstName": "QA",
                "lastName": "User",
                "address1": "123 Test Street",
                "city": "Austin",
                "state": "TX",
                "postalCode": "78701",
                "country": "US"
            },
            "paymentMethod": {
                "type": "test_card",
                "cardToken": "tok_visa_4242"
            },
            "shippingMethod": "standard"
        }
 
        with self.client.post(
            "/api/v1/checkout/submit",
            json=payload,
            headers=headers,
            name="POST /api/v1/checkout/submit",
            catch_response=True
        ) as response:
            if response.status_code not in [200, 201]:
                response.failure(f"Checkout failed: {response.status_code}")
                return
 
            data = response.json()
            if "orderId" not in data:
                response.failure("Checkout response missing orderId")
                return
 
        self.interrupt()
 
class CheckoutUser(HttpUser):
    wait_time = between(1, 2)
    tasks = [CheckoutFlow]

This is a strong candidate for release gating because checkout performance directly affects conversion. You might enforce budgets like:

  • p95 add-to-cart latency under 500 ms
  • p95 checkout latency under 1500 ms
  • checkout error rate below 0.5%

Scenario 3: Database-heavy reporting endpoint with custom latency budget checks

Some releases do not break customer-facing pages, but they degrade internal dashboards or analytics APIs. Those regressions still matter, especially for operations teams and enterprise customers.

This example tests a reporting endpoint with date filters and validates that responses stay under a strict latency budget.

python
from locust import HttpUser, task, between
import os
import time
 
REPORT_BUDGET_MS = 1200
 
class ReportingUser(HttpUser):
    wait_time = between(2, 5)
 
    def on_start(self):
        self.access_token = None
        self.login()
 
    def login(self):
        payload = {
            "email": os.getenv("REPORT_USER_EMAIL", "analyst@example.com"),
            "password": os.getenv("REPORT_USER_PASSWORD", "ReportingPassword123!")
        }
 
        response = self.client.post(
            "/api/v1/auth/login",
            json=payload,
            headers={"Content-Type": "application/json"},
            name="POST /api/v1/auth/login"
        )
 
        if response.status_code == 200:
            self.access_token = response.json().get("accessToken")
 
    @task
    def fetch_daily_sales_report(self):
        if not self.access_token:
            self.login()
            if not self.access_token:
                return
 
        headers = {
            "Authorization": f"Bearer {self.access_token}",
            "Accept": "application/json"
        }
 
        params = {
            "startDate": "2026-04-01",
            "endDate": "2026-04-06",
            "region": "north-america",
            "groupBy": "day",
            "includeRefunds": "true"
        }
 
        start = time.perf_counter()
 
        with self.client.get(
            "/api/v1/reports/daily-sales",
            params=params,
            headers=headers,
            name="GET /api/v1/reports/daily-sales",
            catch_response=True
        ) as response:
            elapsed_ms = (time.perf_counter() - start) * 1000
 
            if response.status_code != 200:
                response.failure(f"Report request failed: {response.status_code}")
                return
 
            if elapsed_ms > REPORT_BUDGET_MS:
                response.failure(
                    f"Report exceeded budget: {elapsed_ms:.2f} ms > {REPORT_BUDGET_MS} ms"
                )
                return
 
            data = response.json()
            if "totals" not in data or "rows" not in data:
                response.failure("Invalid report payload structure")

This pattern is useful when you want the test itself to mark requests as failures if they exceed a hard budget, even if the server technically returns HTTP 200.

Analyzing Your Results

After running your test in LoadForge, focus on a few key metrics rather than trying to interpret everything at once.

Response time percentiles

Average latency can hide serious issues. Use percentiles instead:

  • p50 shows typical performance
  • p95 shows the experience of slower users
  • p99 reveals tail latency and instability

For performance budgets, p95 is often the best release gate metric. If your budget says checkout p95 must be under 1500 ms, a result of 2100 ms should block the release even if the average is acceptable.
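
A small worked example shows why. With synthetic latencies where most requests are fast but a few are very slow, the mean looks healthy while p95 exposes the problem (the numbers below are made up for illustration):

```python
import math

def p95(samples):
    """Nearest-rank 95th percentile: the value at least 95% of samples fall under."""
    ordered = sorted(samples)
    rank = math.ceil(0.95 * len(ordered))  # 1-based nearest rank
    return ordered[rank - 1]

# 94 fast requests at 100 ms, 6 slow requests at 3000 ms
latencies = [100] * 94 + [3000] * 6
mean_ms = sum(latencies) / len(latencies)

print(f"mean: {mean_ms:.0f} ms")     # 274 ms - passes a 1500 ms budget
print(f"p95:  {p95(latencies)} ms")  # 3000 ms - clearly blows it
```

Six percent of users here wait thirty times longer than the rest, and the average barely moves; gating on p95 catches exactly this pattern.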

Error rate

Track both HTTP errors and logical failures from your Locust assertions. A response with status 200 can still be a failure if:

  • the token is missing
  • the cart ID is not returned
  • the payload is malformed
  • the response exceeds a defined budget

This is why catch_response=True is so valuable in Locust-based load testing.

Throughput and request distribution

Make sure your system is handling the intended request volume. If throughput is lower than expected, you may be saturating an application tier, database, or external dependency.
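
You can sanity-check the intended volume before blaming the system. In a closed-loop Locust test, each simulated user completes roughly one request per (think time + response time), so expected throughput is approximately the user count divided by that cycle time. A quick estimate with illustrative numbers:

```python
def expected_throughput(users, avg_wait_s, avg_response_s):
    """Approximate requests/s for a closed-loop test: users / cycle time."""
    return users / (avg_wait_s + avg_response_s)

# 20 users, ~2 s average think time (wait_time between(1, 3)), 0.3 s responses
rps = expected_throughput(20, 2.0, 0.3)
print(f"expected ~{rps:.1f} requests/s")  # ~8.7 requests/s
```

If the observed rate in your report falls well below this estimate at the same concurrency, requests are queuing somewhere, and rising response times are usually the reason.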

Endpoint-specific regressions

A release may improve one endpoint while degrading another. LoadForge’s real-time reporting helps you isolate which transaction names are failing, such as:

  • POST /api/v1/auth/login
  • GET /api/v1/products/search
  • POST /api/v1/checkout/submit

This is especially important when using CI/CD integration, because you need clear pass/fail criteria tied to specific business-critical transactions.

Trend comparisons across builds

The real value of performance budgets comes from consistency over time. Compare current test runs to previous builds and look for:

  • rising p95 latency
  • growing error rates
  • reduced throughput at the same concurrency
  • increased variance in response times

With LoadForge, teams can centralize these results and use cloud-based infrastructure to keep test execution consistent across environments.
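
If your pipeline persists per-endpoint p95 values between builds, a simple tolerance check can flag drift before it becomes an outright budget violation. A sketch, assuming you store baseline numbers yourself (the storage mechanism is not shown here):

```python
def detect_regressions(current, baseline, tolerance=0.10):
    """Flag endpoints whose p95 grew more than `tolerance` (default 10%) vs. baseline."""
    regressions = []
    for name, p95_ms in current.items():
        previous = baseline.get(name)
        if previous is None:
            continue  # new endpoint, nothing to compare against
        growth = (p95_ms - previous) / previous
        if growth > tolerance:
            regressions.append(f"{name}: p95 {previous:.0f} -> {p95_ms:.0f} ms (+{growth:.0%})")
    return regressions
```

Treating trend regressions as warnings and absolute budget violations as hard failures gives teams early notice without blocking every marginal build.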

Performance Optimization Tips

When your load testing reveals budget violations, these are common fixes:

Optimize authentication paths

  • Cache session or token validation where appropriate
  • Reduce unnecessary user profile lookups during login
  • Review password hashing configuration for staging realism

Improve search performance

  • Add or tune database indexes
  • Cache common search queries
  • Avoid expensive wildcard or multi-join queries
  • Limit payload size for list endpoints

Streamline cart and checkout

  • Reduce synchronous calls to payment, tax, or shipping providers
  • Use queues for non-critical post-checkout work
  • Ensure cart tables are indexed for user and cart ID lookups
  • Watch for lock contention on inventory updates

Tune reporting endpoints

  • Pre-aggregate data where possible
  • Use read replicas for analytics traffic
  • Paginate large result sets
  • Move expensive computations out of request time

Right-size your infrastructure

If budgets fail only at higher concurrency, you may need:

  • more application instances
  • better database connection pooling
  • autoscaling adjustments
  • CDN or edge caching for repeated requests

LoadForge’s distributed testing can help verify whether these improvements work under realistic traffic from multiple regions.

Common Pitfalls to Avoid

Using unrealistic test data

If every simulated user searches for the exact same term or checks out the same item, your caches may make the system look faster than it really is. Use varied but controlled data.

Testing only happy paths

A real system handles expired tokens, empty carts, and large result sets. Include realistic edge cases in your performance testing strategy.

Relying on averages

Average response time is not a reliable performance budget metric. Always gate on p95 or p99 for critical flows.

Ignoring downstream dependencies

Your application may be fast, but a payment provider, identity service, or database replica may be the real bottleneck. Design tests that expose dependency-related slowdowns.

Running tests that are too small

A 2-user smoke test is not enough to enforce meaningful budgets. Even in CI/CD, use enough concurrency to reveal contention and queuing.

Making the pipeline too slow

At the same time, avoid turning every commit into a 45-minute stress testing job. A good pattern is:

  • quick budget checks on every pull request
  • medium load tests on merges to main
  • larger stress testing runs nightly or before release

Not defining pass/fail criteria in advance

If the team debates acceptable latency after every test run, the budget is not enforceable. Define thresholds before adding them to the pipeline.

Conclusion

Performance budgets turn load testing from an occasional exercise into an enforceable quality standard. Instead of discovering regressions after deployment, you can catch them in CI/CD and stop releases that exceed latency or error-rate thresholds.

With Locust-based scripting, realistic user flows, and clear budget definitions, you can protect login, search, checkout, reporting, and other critical transactions with confidence. LoadForge makes this process easier with distributed testing, real-time reporting, CI/CD integration, cloud-based infrastructure, and global test locations that help teams validate performance at scale.

If you are ready to make performance a true release gate, try LoadForge and start enforcing performance budgets in your pipeline today.
