You have a web application. Users are starting to show up. But how do you know your site can handle the traffic when things get busy? That is exactly what load testing answers, and this tutorial will walk you through running your very first test from start to finish.

No prior experience required. By the end of this guide, you will have written a real load test script, executed it against your application, and understood what the results mean. Let's get started.

What You'll Learn

In this load testing tutorial, we will cover:

What load testing is and why every web application needs it
How to write a test script using Python and Locust
How to configure and run a test on LoadForge with the right settings
How to interpret your results so you can take meaningful action
Next steps to build load testing into your development workflow

Prerequisites

Before we begin, make sure you have the following ready:

A web application or API to test. This can be a staging environment, a development server, or even a simple API endpoint. You do not need to test against production right away.
A LoadForge account. A free tier is available, which is more than enough for this tutorial. Sign up at loadforge.com if you haven't already.
Basic familiarity with Python. This is helpful but absolutely not required. We will explain every line of code as we go, so you can follow along even if you have never written Python before.

Step 1: Understand the Basics

Before writing any code, let's make sure we understand what load testing actually does.

Load testing is the practice of simulating multiple users accessing your application at the same time. The goal is to see how your system behaves under stress. Does it stay fast? Does it slow down? Does it break entirely?

There are a few key terms you will encounter throughout this tutorial:

Virtual users are simulated users that your load test creates. Each one behaves like a real person browsing your site, making requests and waiting between actions.
Concurrent users refers to how many virtual users are active at the same time. Fifty concurrent users means fifty people using your application simultaneously.
Requests per second (RPS) measures how many HTTP requests your application is handling every second. This is your throughput.
Response time is how long your server takes to respond to a single request. This is what your real users feel.

Think of it like this: imagine a retail store on a normal Tuesday afternoon. A few customers walk in, browse around, and check out without any issues. Now imagine Black Friday. Hundreds of people flood through the doors at once. The checkout lines back up. Staff can't keep up. Some customers leave frustrated.

Load testing lets you simulate that Black Friday crowd before it happens. You are looking for two things: does performance degrade under load, and at what point does it start to break down?

Step 2: Plan Your Test Scenario

A good load test mimics what real users actually do. Before you write any code, take a moment to think about your users' behavior.

Ask yourself:

What pages do they visit most often?
In what order do they typically navigate?
How long do they spend on each page before clicking something else?

For example, imagine you are testing an e-commerce site. A typical user journey might look like this:

Land on the homepage
Browse the product listing page
Click into a product detail page
Add an item to the cart

Not every user completes every step. Most people browse the homepage. Fewer make it to a product page. Even fewer add something to their cart. Your test should reflect these proportions.

One more thing to consider: think time. Real users do not click links instantly. They read content, scroll around, and pause before taking action. A realistic load test builds in small delays between requests to simulate this behavior. Without think time, your test hammers the server far harder than real traffic would, which gives you misleading results.
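To get a feel for how much think time changes the load, here is a rough back-of-the-envelope sketch. It assumes a simplified model in which each virtual user sends one request, waits for the response, pauses, and repeats. The numbers (50 users, 0.2 seconds response time, 2 seconds of think time) are made up for illustration.

```python
def expected_rps(users: int, avg_response_s: float, avg_think_s: float) -> float:
    """Approximate steady-state throughput when each user loops:
    send request -> wait for response -> think -> repeat."""
    return users / (avg_response_s + avg_think_s)


# Illustrative numbers only, not measurements.
with_think = expected_rps(users=50, avg_response_s=0.2, avg_think_s=2.0)
without_think = expected_rps(users=50, avg_response_s=0.2, avg_think_s=0.0)

print(f"with think time:    {with_think:.1f} requests/second")
print(f"without think time: {without_think:.1f} requests/second")
```

Under these assumptions, the same 50 users generate roughly ten times more requests per second once the think time is removed, which is why a test without pauses rarely resembles real traffic.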

Step 3: Write Your Locustfile

Now for the hands-on part. LoadForge uses Locust, an open-source load testing framework written in Python. Your test script is called a locustfile, and it defines how your virtual users behave.

Here is a complete locustfile for the e-commerce scenario we planned above:

```python
from locust import HttpUser, task, between


class WebsiteUser(HttpUser):
    wait_time = between(1, 3)

    @task(5)
    def view_homepage(self):
        self.client.get("/", name="Homepage")

    @task(3)
    def browse_products(self):
        self.client.get("/products", name="Product Listing")

    @task(2)
    def view_product(self):
        self.client.get("/products/example-product", name="Product Detail")

    @task(1)
    def add_to_cart(self):
        self.client.post("/cart/add", json={
            "product_id": "123",
            "quantity": 1
        }, name="Add to Cart")
```

Let's break this down line by line so you understand exactly what is happening.

from locust import HttpUser, task, between -- This imports the three building blocks we need from the Locust library. HttpUser is the base class for our virtual user. task is a decorator that marks a method as something the user does. between creates a random wait time within a range.

class WebsiteUser(HttpUser): -- This defines our virtual user. Every virtual user in the test will be an instance of this class. By inheriting from HttpUser, it gets built-in HTTP capabilities like making GET and POST requests.

wait_time = between(1, 3) -- This is our think time. After each task, the virtual user will pause for a random duration between 1 and 3 seconds before performing the next action. This simulates real human browsing behavior.

@task(5) -- The @task decorator marks a method as a user action. The number in parentheses is the weight. A weight of 5 means this task is five times more likely to be picked than a task with weight 1. In our script, the homepage gets the most traffic (weight 5), followed by product listing (weight 3), product detail (weight 2), and add to cart (weight 1). This matches the natural drop-off you see in a real user funnel.

self.client.get("/", name="Homepage") -- The self.client object is an HTTP client that makes requests to your target application. Here we are making a GET request to the root path. The name parameter is used for grouping in your results. Without it, each unique URL would show up as a separate entry in your report. By giving requests a name, you can group them logically.

self.client.post("/cart/add", json={...}, name="Add to Cart") -- This demonstrates a POST request with a JSON body. You can send any data your API expects, including form data, headers, authentication tokens, and more.

That is your entire test script. It is short, readable, and directly maps to real user behavior. When you paste this into LoadForge, the platform handles everything else: spinning up the infrastructure, distributing the load, and collecting the results.
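If you want to check what traffic mix the task weights produce, a few lines of plain Python are enough. This sketch only does the arithmetic on the weights from the script (5, 3, 2, and 1). It does not run Locust, and actual proportions in a real test will vary somewhat because tasks are picked at random.

```python
# Task weights copied from the locustfile above.
weights = {
    "Homepage": 5,
    "Product Listing": 3,
    "Product Detail": 2,
    "Add to Cart": 1,
}

total = sum(weights.values())
shares = {name: weight / total for name, weight in weights.items()}

for name, share in shares.items():
    print(f"{name}: {share:.0%} of requests")
```

If the resulting split does not match what your analytics show, adjust the weights until it does.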

Step 4: Configure Your Test

With your locustfile ready, the next step is configuring the test parameters in LoadForge. These settings determine the shape and intensity of your load test.

Number of Users

This is the total number of virtual users that will be active at the peak of your test. If you are new to load testing, start small. Begin with 10 to 50 users to get a baseline, then scale up in subsequent tests to find your limits.

Choosing the right number depends on your application. If your analytics show 200 concurrent users during peak hours, you might test with 200, then push to 400 or 500 to see how much headroom you have.

Ramp-Up Rate

The ramp-up rate (sometimes called spawn rate or hatch rate) controls how quickly virtual users are added to the test. A rate of 5 users per second means that every second, five new virtual users start making requests.

A gradual ramp-up is important. If you dump all 200 users onto your server in the first second, you are testing a traffic spike, not normal load growth. A ramp-up of 5 to 10 users per second gives your application time to warm up caches, open database connections, and behave the way it would under organic growth.
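The relationship between user count, ramp-up rate, and ramp-up time is simple division. The helper below is an illustrative sketch (the function name is made up) that shows how long it takes to reach 200 users at different rates.

```python
import math


def ramp_up_seconds(total_users: int, users_per_second: float) -> int:
    """Seconds until every virtual user has been started."""
    return math.ceil(total_users / users_per_second)


for rate in (5, 10, 200):
    seconds = ramp_up_seconds(200, rate)
    print(f"{rate} users/second -> all 200 users active after {seconds} s")
```

Keep the ramp-up time in mind when choosing a test duration: the ramp-up is part of the run, so a short test with a slow ramp-up spends very little time at full load.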

Test Duration

Short tests can be misleading. Memory leaks, connection pool exhaustion, and cache expiration issues only surface over time. Run your test for at least 5 minutes for a quick sanity check. For thorough testing, aim for 15 to 30 minutes. This gives you enough data to see trends and spot slow degradation that might not be obvious in a short run.

Test Regions

LoadForge lets you generate traffic from multiple geographic regions. Choose locations that are close to where your real users are. If most of your customers are in North America, run your test from US-based regions. If you have a global audience, distribute across multiple regions to see how latency varies.

Each of these settings matters. The number of users determines the intensity. The ramp-up rate determines the shape. The duration determines the depth. And the region determines the realism. Take a moment to think about what makes sense for your application before clicking the start button.

Step 5: Run Your Test

Once you have configured everything, hit the start button and watch your test come to life.

Here is what happens behind the scenes: LoadForge spins up test infrastructure in your chosen regions, deploys your locustfile, and begins spawning virtual users at the ramp-up rate you specified. Each user follows the behavior you defined, making requests, waiting, then making more requests.

During the test, the LoadForge dashboard shows you real-time metrics. Keep your eye on three things:

Response time chart. This shows how fast your server is responding over time. You want to see a flat, stable line. If it starts climbing, your application is struggling under the load.
Error rate. This tracks the percentage of requests that fail. A few errors might be normal, but a sudden spike is a red flag.
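Throughput. This shows requests per second over time. Watch for a plateau while users are still being added, which suggests your server has hit its maximum capacity and cannot process requests any faster. A plateau after the ramp-up has finished is expected, because the number of users is no longer growing.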

When to stop early: If your error rate shoots above 10 percent or your response times climb into the tens of seconds, you have already found a problem. There is no need to keep hammering a struggling server. Stop the test, investigate the issue, and re-test after making improvements.
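The stop-early rule can be written down as a small check. This is only a sketch of the rule of thumb above: the function name is made up, and treating 10 seconds as the start of "tens of seconds" is an assumption you may want to adjust.

```python
def should_stop_early(
    error_rate_pct: float,
    response_time_s: float,
    max_error_rate_pct: float = 10.0,   # from the rule of thumb above
    max_response_time_s: float = 10.0,  # assumed cut-off for "tens of seconds"
) -> bool:
    """Return True when the test has already found a clear problem."""
    too_many_errors = error_rate_pct > max_error_rate_pct
    too_slow = response_time_s >= max_response_time_s
    return too_many_errors or too_slow


print(should_stop_early(error_rate_pct=0.4, response_time_s=0.3))   # healthy run
print(should_stop_early(error_rate_pct=14.0, response_time_s=0.8))  # error spike
print(should_stop_early(error_rate_pct=0.5, response_time_s=25.0))  # very slow
```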

Step 6: Read Your Results

The test is done. Now comes the most important part: understanding what the numbers mean. LoadForge presents your results across several key metrics. Let's walk through each one.

Response Times

You will see several response time values: average, median, p95, and p99. Do not focus only on the average. Averages hide outliers and can be deeply misleading.

The p95 (95th percentile) is the response time that 95 percent of requests came in under. If your p95 is 2 seconds, that means 1 in 20 requests took 2 seconds or more. The p99 looks even further into the tail: the slowest 1 in 100 requests.

Why do percentiles matter? Because your unhappiest users are the ones experiencing the worst response times, and those are the users most likely to leave. A 200ms average with a 5-second p99 means most people are happy, but a meaningful number are not.

As a general guideline: aim for a p95 under 1 second for web pages and under 500 milliseconds for API calls. These thresholds depend on your application, but they are a solid starting point.
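To see how an average can hide a slow tail, here is a small sketch that computes the same four values from a made-up set of 100 response times. The percentile helper uses the nearest-rank method, which is one common definition. Reporting tools may interpolate differently, so do not expect identical numbers from every tool.

```python
import statistics


def percentile(samples_ms, pct):
    """Nearest-rank percentile: the smallest sample that at least
    pct percent of all samples are less than or equal to."""
    ordered = sorted(samples_ms)
    rank = max(1, -(-pct * len(ordered) // 100))  # integer ceiling division
    return ordered[rank - 1]


# Made-up data in milliseconds: 90 fast, 8 slow, and 2 very slow requests.
response_times_ms = [100] * 90 + [1000] * 8 + [5000] * 2

average = statistics.mean(response_times_ms)
median = statistics.median(response_times_ms)
p95 = percentile(response_times_ms, 95)
p99 = percentile(response_times_ms, 99)

print(f"average={average} ms, median={median} ms, p95={p95} ms, p99={p99} ms")
```

In this sample the average looks healthy, yet the p95 and p99 reveal that a noticeable share of requests are many times slower than the typical one.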

Throughput

Throughput is measured in requests per second (RPS). During the ramp-up phase, you should see throughput climb steadily as more users come online. Once every user is active, it levels off on its own, because a fixed number of users with think time can only generate so many requests.

The plateau to worry about is one that appears while users are still being added. That flat line is your ceiling: the maximum number of requests your infrastructure can handle per second. Your server has maxed out, and new requests are waiting in a queue, which is why response times start climbing at the same time.
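If you export throughput readings from a test, spotting this kind of plateau can be automated. The sketch below is one possible approach, not a LoadForge feature: the function name, the sample data, and the 5 percent minimum-gain threshold are all assumptions.

```python
def find_throughput_plateau(samples, min_gain=0.05):
    """Return the user count after which adding users stopped adding throughput.

    samples:  list of (active_users, requests_per_second), ordered by time.
    min_gain: smallest relative RPS increase still counted as growth (assumed).
    Returns None when no plateau is seen while users are still increasing.
    """
    for (prev_users, prev_rps), (users, rps) in zip(samples, samples[1:]):
        still_ramping = users > prev_users
        gain = (rps - prev_rps) / prev_rps if prev_rps else float("inf")
        if still_ramping and gain < min_gain:
            return prev_users
    return None


# Made-up readings: throughput stops growing between 60 and 80 users.
saturated = [(20, 10.0), (40, 20.0), (60, 30.0), (80, 31.0), (100, 31.0)]
# Made-up readings: throughput is flat only after the ramp-up has finished.
healthy = [(20, 10.0), (40, 20.0), (60, 30.0), (60, 30.0)]

print(find_throughput_plateau(saturated))
print(find_throughput_plateau(healthy))
```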

Error Rate

Errors during a load test are normal in small quantities. Here is how to interpret them:

Below 1 percent: Acceptable. Your application is handling the load well.
Between 1 and 5 percent: Warning zone. Investigate which requests are failing and why.
Above 5 percent: Critical. Your application is struggling, and real users would be affected.

Look at the types of errors you are seeing. HTTP 5xx errors (such as 500, 502, or 503) typically mean your server is overloaded or crashing. Timeouts mean the server is too slow to respond within the allowed window. Connection refused errors mean the server cannot accept any more connections at all.
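The error-rate bands above are easy to encode if you want to check results in a script. This is a minimal sketch with a made-up function name. It treats exactly 1 and exactly 5 percent as part of the warning zone, which is an interpretation of the ranges rather than a fixed rule.

```python
def classify_error_rate(failed_requests: int, total_requests: int) -> str:
    """Map an error rate onto the three bands described above."""
    error_rate_pct = 100 * failed_requests / total_requests
    if error_rate_pct < 1:
        return "acceptable"
    if error_rate_pct <= 5:
        return "warning"
    return "critical"


print(classify_error_rate(failed_requests=12, total_requests=10_000))
print(classify_error_rate(failed_requests=300, total_requests=10_000))
print(classify_error_rate(failed_requests=900, total_requests=10_000))
```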

Response Time Over Time

This chart is often the most revealing. Look for the hockey stick curve: a flat, stable response time that suddenly shoots upward. The point where the curve bends is your capacity limit. Everything to the left of that bend is your safe operating range. Everything to the right is trouble.

For example, if response times are flat at 150ms with up to 100 concurrent users, then spike to 3 seconds at 120 users, you know your current infrastructure comfortably handles 100 users but starts to buckle beyond that. That is an incredibly valuable piece of information for capacity planning.
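The same reasoning can be applied to exported data. The sketch below walks through (users, response time) readings and reports the last user count before response times pulled away from the baseline. The function name and the "more than twice the baseline" cut-off are assumptions, and the data simply mirrors the example numbers above.

```python
def last_stable_user_count(samples, max_slowdown=2.0):
    """Return the highest user count reached before response times bent upward.

    samples:      list of (concurrent_users, response_time_ms), ordered by users.
    max_slowdown: how many times slower than the first reading still counts
                  as stable (an assumed threshold).
    """
    baseline_ms = samples[0][1]
    stable_users = None
    for users, response_ms in samples:
        if response_ms > baseline_ms * max_slowdown:
            break
        stable_users = users
    return stable_users


# Readings modeled on the example: flat near 150 ms, then a spike at 120 users.
readings = [(20, 150), (40, 150), (60, 152), (80, 155), (100, 150), (120, 3000)]

print(last_stable_user_count(readings))
```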

Step 7: Act on Your Results

Data without action is useless. Here is what to do depending on what you found.

If performance is good across the board: Congratulations. Establish these results as your baseline. Save the test configuration and schedule it to run regularly -- weekly or after every deployment. This way you will catch performance regressions before your users do.
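Once you have a baseline, comparing later runs against it can be as simple as the check below. This is a sketch only: the function name, the sample p95 values, and the 10 percent tolerance are assumptions, and you would feed in numbers from your own test results.

```python
def is_regression(baseline_p95_ms: float, current_p95_ms: float,
                  tolerance: float = 0.10) -> bool:
    """True when the current p95 is slower than the baseline by more
    than the allowed tolerance (10 percent by default, an assumption)."""
    return current_p95_ms > baseline_p95_ms * (1 + tolerance)


baseline_p95_ms = 400  # made-up value from an earlier, healthy run

print(is_regression(baseline_p95_ms, current_p95_ms=430))  # within tolerance
print(is_regression(baseline_p95_ms, current_p95_ms=500))  # noticeably slower
```

A small tolerance keeps normal run-to-run noise from being flagged as a regression.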

If you found bottlenecks: Common fixes include adding a caching layer (like Redis or a CDN) to reduce database load, optimizing slow database queries that show up under concurrency, and scaling horizontally by adding more application servers behind a load balancer. Focus on the endpoints with the worst response times first. Small improvements to high-traffic endpoints have the biggest overall impact.

If you hit capacity limits: Now you know your ceiling, which is exactly what load testing is for. Use this information to plan for growth. If you expect traffic to double in the next quarter, you know you need to increase capacity before that happens. Load testing turns "I think we'll be fine" into "I know we can handle it."

Next Steps

You have completed your first load test. Here is where to go from here:

Schedule recurring tests. Performance can degrade over time as new features are added and data grows. Run tests regularly to catch regressions early.
Integrate into your CI/CD pipeline. LoadForge can be triggered automatically as part of your deployment process, so every release gets tested before it reaches users.
Test more complex scenarios. Explore authenticated user flows, file uploads, WebSocket connections, and multi-step transactions. Real-world traffic is varied, and your tests should reflect that.
Deepen your understanding. Learn more about what load testing is and explore our guide on how to load test an API for more advanced techniques.

Common Questions

How many virtual users should I simulate?

Start with your expected peak number of concurrent users. If you are unsure what that number is, check your analytics. Look at the maximum number of simultaneous sessions you see during your busiest hours, then test at that level and beyond. A good rule of thumb is to test at 1.5 to 2 times your expected peak so you know you have headroom.
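Here is that rule of thumb as a tiny helper, so you can turn an expected peak into a series of test sizes. The function name is made up, and 200 is just an example peak.

```python
def load_levels(expected_peak_users: int) -> list[int]:
    """User counts to test: the expected peak, then 1.5x and 2x for headroom."""
    return [
        expected_peak_users,
        round(expected_peak_users * 1.5),
        expected_peak_users * 2,
    ]


print(load_levels(200))
```

Running one test per level, from smallest to largest, makes it easier to see where performance starts to change.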

How long should my load test run?

For a quick sanity check, 5 minutes is the minimum to get meaningful data. For thorough performance analysis, run for 15 to 30 minutes. This gives enough time for caches to warm, connection pools to stabilize, and slow-building problems to surface. For soak testing -- where you are looking for memory leaks or gradual degradation -- run for several hours.

Will load testing affect my production site?

Yes. Load testing generates real HTTP traffic against your application. It will consume real server resources and can absolutely impact real users if you run it against production. Whenever possible, test against a staging environment that mirrors your production setup. If you must test production, schedule your test during a low-traffic window and start with a conservative number of users.

Do I need to know Python?

A basic understanding of Python helps you customize your test scripts, but it is not a hard requirement. LoadForge provides test templates for common scenarios that you can use with minimal modification. There is also a visual test builder that lets you create tests without writing code at all. That said, learning the basics of Python and Locust gives you far more flexibility and control over your tests.

Load testing does not have to be complicated. With a simple script, the right settings, and a clear understanding of your results, you can gain confidence that your application is ready for whatever traffic comes its way. The hardest part is running that first test -- and you just did it.

Load Testing Tutorial: Your First Test in 15 Minutes