
AWS Load Testing Guide with LoadForge


Introduction

AWS applications are built to scale, but “built to scale” does not automatically mean “performant under real traffic.” Whether you’re running APIs behind API Gateway, workloads on Application Load Balancers, Lambda functions, ECS services, or microservices backed by DynamoDB and S3, you need load testing to understand how your AWS architecture behaves under pressure.

A proper AWS load testing strategy helps you answer critical questions:

  • How many concurrent users can your application support?
  • Where do latency spikes begin?
  • Does API Gateway throttle requests under burst traffic?
  • Do Lambda cold starts affect user experience?
  • Can your ECS or EC2-backed services scale quickly enough?
  • Are downstream services like DynamoDB, RDS, or S3 becoming bottlenecks?

In this guide, you’ll learn how to load test AWS applications and APIs using LoadForge and Locust. We’ll cover realistic AWS-specific performance testing scenarios, including authenticated API requests, API Gateway endpoints, pre-signed S3 upload flows, and multi-step user journeys. You’ll also see how LoadForge’s distributed testing, cloud-based infrastructure, real-time reporting, CI/CD integration, and global test locations can help you run meaningful AWS load testing at scale.

Prerequisites

Before you start load testing AWS workloads, make sure you have the following:

  • An AWS application or API endpoint to test
    • Examples:
      • API Gateway REST or HTTP API
      • ALB-backed ECS service
      • EC2-hosted web app
      • Lambda function behind API Gateway
  • A clear understanding of your target environment
    • Staging is strongly recommended over production
  • Test credentials or authentication tokens
    • JWT tokens from Cognito
    • API keys for API Gateway usage plans
    • Session-based auth for web applications
  • Knowledge of expected user behavior
    • Login, browse, search, create, upload, checkout, etc.
  • LoadForge account and project
  • Any relevant rate limits, WAF rules, or IP allowlists configured to permit your test traffic

You should also define your test goals in advance. For example:

  • Validate average response time at 500 concurrent users
  • Measure p95 latency for /api/orders
  • Stress test API Gateway burst handling
  • Verify autoscaling behavior for ECS tasks
  • Identify Lambda cold start impact during traffic spikes
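
Goals like these can also be enforced automatically. Here is a minimal sketch, assuming a p95 budget of 800 ms and a 1% error budget (both placeholder values), of a threshold helper that a Locust `quitting` hook can use to fail a CI run:

```python
# Assumed budgets; tune these to your own test goals
P95_LIMIT_MS = 800
MAX_FAIL_RATIO = 0.01

def meets_goals(fail_ratio: float, p95_ms: float) -> bool:
    # True only if both the error budget and the latency budget hold
    return fail_ratio <= MAX_FAIL_RATIO and p95_ms <= P95_LIMIT_MS

# In a locustfile, wire it to the quitting event so CI fails on a miss:
# from locust import events
# @events.quitting.add_listener
# def _(environment, **kwargs):
#     total = environment.stats.total
#     ok = meets_goals(total.fail_ratio,
#                      total.get_response_time_percentile(0.95))
#     environment.process_exit_code = 0 if ok else 1
```

A non-zero exit code from the Locust process is what most CI systems treat as a failed build, which is how a latency regression blocks a deploy.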

Understanding AWS Under Load

AWS provides a highly elastic platform, but each service scales differently and introduces its own performance characteristics. Understanding these behaviors is essential for effective load testing and performance testing.

API Gateway

API Gateway is excellent for exposing APIs, but it can introduce:

  • Throttling at account or stage level
  • Increased latency under burst traffic
  • Integration overhead with Lambda or HTTP backends
  • Rate limiting issues when API keys or usage plans are involved

When load testing API Gateway, watch for:

  • 429 Too Many Requests
  • Increased p95 and p99 latency
  • Backend timeout errors
  • Integration latency versus overall latency

Lambda

Lambda can scale rapidly, but common bottlenecks include:

  • Cold starts
  • Concurrency limits
  • Slow downstream dependencies
  • Function duration increases under load

In stress testing scenarios, Lambda-backed APIs may appear healthy at low traffic but degrade sharply when concurrency rises.

ECS, EC2, and ALB-backed Services

For containerized or VM-hosted services, common issues include:

  • Insufficient CPU or memory
  • Slow autoscaling response
  • Connection pool exhaustion
  • Database bottlenecks
  • Load balancer target saturation

Load testing helps identify whether your application tier, not just AWS infrastructure, is the limiting factor.

DynamoDB, RDS, and Other Data Stores

Your application may scale horizontally, but the database often becomes the real bottleneck. Under load, you may encounter:

  • DynamoDB throttling
  • RDS connection exhaustion
  • Slow queries
  • Lock contention
  • Cache misses causing backend overload

A realistic AWS load testing plan should simulate full user workflows, not just isolated endpoint hits.

Writing Your First Load Test

Let’s start with a basic Locust script that load tests an AWS API Gateway endpoint serving a product catalog. This type of test is useful for measuring baseline read performance and understanding how your AWS API behaves under concurrent traffic.

Basic AWS API Gateway Load Test

```python
from locust import HttpUser, task, between

class AwsApiUser(HttpUser):
    wait_time = between(1, 3)
    host = "https://api.example.execute-api.us-east-1.amazonaws.com"

    @task(3)
    def list_products(self):
        self.client.get(
            "/prod/products?category=electronics&limit=20",
            headers={
                "Accept": "application/json",
                "x-api-key": "your-api-gateway-api-key"
            },
            name="GET /prod/products"
        )

    @task(1)
    def get_product_detail(self):
        self.client.get(
            "/prod/products/sku-12345",
            headers={
                "Accept": "application/json",
                "x-api-key": "your-api-gateway-api-key"
            },
            name="GET /prod/products/:id"
        )
```

What this script does

This script simulates users browsing an AWS-hosted product API:

  • list_products represents a common catalog page request
  • get_product_detail simulates opening a product detail page
  • The x-api-key header reflects a realistic API Gateway authentication pattern
  • The weighted tasks make listing products more common than viewing an individual item

This is a good starting point for:

  • Baseline load testing
  • Measuring read-heavy API performance
  • Comparing behavior before and after infrastructure changes
  • Validating API Gateway caching or backend optimizations

In LoadForge, you can scale this simple script across multiple generators and regions to test how your AWS API performs for geographically distributed users.

Advanced Load Testing Scenarios

Basic endpoint testing is useful, but real AWS performance testing should model realistic user behavior and backend complexity. Below are several advanced Locust examples tailored to AWS environments.

Scenario 1: Cognito Authentication and Authenticated API Requests

Many AWS applications use Amazon Cognito for authentication. In this example, users log in and then call protected API Gateway endpoints with a JWT access token.

```python
from locust import HttpUser, task, between
import json

class CognitoApiUser(HttpUser):
    wait_time = between(1, 2)
    host = "https://api.example.com"

    def on_start(self):
        cognito_host = "https://cognito-idp.us-east-1.amazonaws.com"
        auth_payload = {
            "AuthFlow": "USER_PASSWORD_AUTH",
            "ClientId": "4j2exampleclientid89abc",
            "AuthParameters": {
                "USERNAME": "loadtest.user@example.com",
                "PASSWORD": "SuperSecurePassword123!"
            }
        }

        # catch_response requires the context-manager form so Locust
        # records the request as a success or failure correctly
        with self.client.post(
            cognito_host,
            data=json.dumps(auth_payload),
            headers={
                "Content-Type": "application/x-amz-json-1.1",
                "X-Amz-Target": "AWSCognitoIdentityProviderService.InitiateAuth"
            },
            name="Cognito InitiateAuth",
            catch_response=True
        ) as response:
            if response.status_code == 200:
                token_data = response.json()
                self.access_token = token_data["AuthenticationResult"]["AccessToken"]
                response.success()
            else:
                response.failure(f"Cognito auth failed: {response.text}")
                self.access_token = None

    @task(2)
    def get_profile(self):
        if not self.access_token:
            return

        self.client.get(
            "/v1/users/me",
            headers={
                "Authorization": f"Bearer {self.access_token}",
                "Accept": "application/json"
            },
            name="GET /v1/users/me"
        )

    @task(3)
    def list_orders(self):
        if not self.access_token:
            return

        self.client.get(
            "/v1/orders?status=active&limit=10",
            headers={
                "Authorization": f"Bearer {self.access_token}",
                "Accept": "application/json"
            },
            name="GET /v1/orders"
        )

    @task(1)
    def create_cart_item(self):
        if not self.access_token:
            return

        payload = {
            "productId": "sku-12345",
            "quantity": 2,
            "currency": "USD"
        }

        self.client.post(
            "/v1/cart/items",
            json=payload,
            headers={
                "Authorization": f"Bearer {self.access_token}",
                "Content-Type": "application/json"
            },
            name="POST /v1/cart/items"
        )
```

Why this matters for AWS load testing

This scenario is much more representative of production traffic because it includes:

  • Cognito authentication overhead
  • JWT-protected API access
  • Read and write operations
  • Stateful user behavior

This type of performance testing is especially useful when your AWS application uses:

  • API Gateway authorizers
  • Lambda authorizers
  • Cognito User Pools
  • ECS or Lambda backends with authenticated routes

It can reveal issues like:

  • Slow token validation
  • Authorizer latency
  • Backend performance degradation for authenticated requests
  • Increased error rates during login spikes

Scenario 2: Load Testing an ALB-Backed ECS Order Workflow

Now let’s simulate a multi-step e-commerce flow running behind an AWS Application Load Balancer and ECS service. This is a realistic way to test not just individual endpoints, but a full business transaction.

```python
from locust import HttpUser, task, between
import random

class EcommerceUser(HttpUser):
    wait_time = between(2, 5)
    host = "https://shop.example.com"

    product_ids = [
        "sku-10001", "sku-10002", "sku-10003", "sku-10004"
    ]

    def on_start(self):
        self.client.post(
            "/auth/login",
            json={
                "email": "loadtest.customer@example.com",
                "password": "CustomerPass123!"
            },
            headers={"Content-Type": "application/json"},
            name="POST /auth/login"
        )

    @task(4)
    def browse_catalog(self):
        category = random.choice(["electronics", "books", "home", "fitness"])
        self.client.get(
            f"/api/catalog?category={category}&page=1&pageSize=24",
            name="GET /api/catalog"
        )

    @task(2)
    def view_product(self):
        product_id = random.choice(self.product_ids)
        self.client.get(
            f"/api/products/{product_id}",
            name="GET /api/products/:id"
        )

    @task(1)
    def add_to_cart_and_checkout(self):
        product_id = random.choice(self.product_ids)

        self.client.post(
            "/api/cart/items",
            json={
                "productId": product_id,
                "quantity": 1
            },
            headers={"Content-Type": "application/json"},
            name="POST /api/cart/items"
        )

        self.client.post(
            "/api/checkout",
            json={
                "shippingAddress": {
                    "name": "Load Test User",
                    "line1": "123 Cloud Street",
                    "city": "Seattle",
                    "state": "WA",
                    "postalCode": "98101",
                    "country": "US"
                },
                "paymentMethod": {
                    "type": "card",
                    "token": "tok_loadforge_test_visa"
                }
            },
            headers={"Content-Type": "application/json"},
            name="POST /api/checkout"
        )
```

What this test reveals

This script is ideal for stress testing AWS-hosted web applications because it exercises:

  • ALB request routing
  • ECS service capacity
  • Session management
  • Cart and checkout logic
  • Database writes and transactional operations

It helps uncover:

  • Slow application response times under mixed traffic
  • Autoscaling delays in ECS services
  • RDS performance bottlenecks
  • Uneven ALB target distribution
  • High-latency write operations

In LoadForge, you can run this test with increasing user counts to identify exactly when your application starts to degrade.

Scenario 3: Pre-Signed S3 Upload Workflow

A common AWS pattern is generating a pre-signed S3 upload URL from an API, then uploading directly to S3. This is worth load testing because both the signing endpoint and the upload path can become bottlenecks.

```python
from locust import HttpUser, task, between
import uuid

class S3UploadUser(HttpUser):
    wait_time = between(1, 3)
    host = "https://api.example.com"

    @task
    def upload_document(self):
        file_name = f"uploads/{uuid.uuid4()}.pdf"

        # catch_response must be used as a context manager so Locust
        # records success or failure correctly
        with self.client.post(
            "/v1/files/presign-upload",
            json={
                "fileName": file_name,
                "contentType": "application/pdf",
                "folder": "customer-documents"
            },
            headers={"Content-Type": "application/json"},
            name="POST /v1/files/presign-upload",
            catch_response=True
        ) as sign_response:
            if sign_response.status_code != 200:
                sign_response.failure(f"Failed to get pre-signed URL: {sign_response.text}")
                return
            upload_url = sign_response.json()["uploadUrl"]
            sign_response.success()

        pdf_content = b"%PDF-1.4 simulated load test pdf content"

        with self.client.put(
            upload_url,
            data=pdf_content,
            headers={"Content-Type": "application/pdf"},
            name="PUT S3 pre-signed upload",
            catch_response=True
        ) as s3_response:
            if s3_response.status_code in (200, 204):
                s3_response.success()
            else:
                s3_response.failure(f"S3 upload failed with status {s3_response.status_code}")
```

Why this is useful

This AWS load testing scenario measures:

  • Performance of your pre-signing API
  • S3 upload behavior under concurrency
  • End-to-end file ingestion flow

It’s especially relevant for applications handling:

  • User document uploads
  • Media ingestion
  • Backup or export workflows
  • Large file processing pipelines

When you run this kind of test, pay attention to:

  • Time spent generating pre-signed URLs
  • Upload latency distribution
  • Failures caused by object size, permissions, or regional issues
  • Downstream processing triggers from S3 events

Analyzing Your Results

Running a load test is only useful if you know how to interpret the results. With AWS applications, you should combine LoadForge metrics with AWS-native observability tools such as CloudWatch, X-Ray, ALB metrics, API Gateway metrics, Lambda Insights, and database monitoring.

Key metrics to watch in LoadForge

Response time percentiles

Average response time can hide problems. Focus on:

  • p50 for typical experience
  • p95 for degraded but common slowness
  • p99 for worst-case experience

For AWS APIs, a rising p95 often indicates:

  • Backend saturation
  • Lambda cold starts
  • Database contention
  • API Gateway integration delays

Requests per second

Track how throughput changes as concurrency increases. If requests per second flatten while user count rises, you may have hit a scaling limit.

Error rate

Look for:

  • 429 responses from API Gateway or DynamoDB-backed services
  • 502/503 from ALB or upstream services
  • 504 gateway timeouts
  • 500 application errors

Per-endpoint breakdown

Identify which routes degrade first. In many AWS systems:

  • Read endpoints remain stable longer
  • Write-heavy endpoints fail first
  • Authentication endpoints become hotspots during login surges

Correlate with AWS metrics

Match LoadForge traffic patterns with AWS telemetry:

  • API Gateway
    • Count
    • Latency
    • IntegrationLatency
    • 4XXError
    • 5XXError
  • Lambda
    • Duration
    • ConcurrentExecutions
    • Throttles
    • Errors
    • Init Duration
  • ALB
    • TargetResponseTime
    • HTTPCode_Target_5XX_Count
    • RequestCount
  • ECS
    • CPUUtilization
    • MemoryUtilization
    • RunningTaskCount
  • RDS
    • CPU
    • Connections
    • Read/Write latency
    • Freeable memory
  • DynamoDB
    • ThrottledRequests
    • ConsumedReadCapacityUnits
    • ConsumedWriteCapacityUnits

LoadForge’s real-time reporting makes it easy to see when latency and error rates change during a test, while AWS metrics help explain why.

Performance Optimization Tips

After load testing your AWS application, you’ll usually identify one or more bottlenecks. Here are common optimization opportunities.

Optimize API Gateway and Lambda

  • Enable caching where appropriate
  • Reduce payload size
  • Minimize Lambda cold starts with provisioned concurrency if needed
  • Reuse SDK clients and database connections inside Lambda execution environments
  • Keep authorizers lightweight

Improve ECS or EC2 service performance

  • Tune autoscaling thresholds
  • Increase task or instance capacity before peak traffic
  • Use connection pooling for databases
  • Add application-level caching with ElastiCache
  • Optimize CPU and memory allocation for containers

Tune your database layer

  • Add indexes for slow queries
  • Use read replicas where appropriate
  • Reduce transaction scope
  • Cache hot data
  • For DynamoDB, review partition key design and provisioned/on-demand settings

Reduce client-visible latency

  • Compress responses
  • Serve static assets via CloudFront
  • Move large uploads directly to S3
  • Avoid chatty APIs by consolidating requests

Test from multiple regions

If your users are global, performance can vary significantly by geography. LoadForge’s global test locations help you measure latency from the regions your users actually care about.

Common Pitfalls to Avoid

AWS load testing can produce misleading results if the test is poorly designed. Avoid these common mistakes.

Testing only one endpoint

A single GET request rarely reflects real production behavior. Model realistic workflows that include authentication, reads, writes, and background processing triggers.

Ignoring rate limits and throttling

API Gateway, Lambda, DynamoDB, and other AWS services can throttle traffic. If you don’t account for this, you may interpret expected throttling as an application failure.

Not warming up the system

Jumping instantly to high concurrency can distort results, especially for Lambda or autoscaling services. Use a ramp-up period to observe how AWS scaling responds.
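
With Locust's CLI this is just a spawn-rate choice; a sketch with illustrative numbers:

```shell
# Ramp to 300 users at 5 users/second (~1 minute ramp),
# then hold for the rest of a 15 minute run
locust -f locustfile.py --headless \
  --users 300 --spawn-rate 5 --run-time 15m \
  --host https://api.example.com
```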

Testing production without safeguards

Stress testing production can affect real users and trigger autoscaling costs. Prefer staging environments that mirror production as closely as possible.

Using unrealistic test data

If every virtual user logs in with the same account or requests the same resource, your results may be skewed by caching or lock contention. Use varied data where possible.
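
A simple approach is a shared pool of pre-provisioned accounts plus a wide spread of product IDs. A sketch (the account and SKU formats are assumptions) whose helpers a Locust task can call:

```python
import itertools
import random

# Assumed pool of pre-provisioned test accounts and product SKUs
ACCOUNTS = [f"loadtest.user{i}@example.com" for i in range(1, 101)]
SKUS = [f"sku-{10000 + i}" for i in range(500)]

_accounts = itertools.cycle(ACCOUNTS)

def next_account() -> str:
    # Hand each new virtual user its own account instead of one shared login
    return next(_accounts)

def random_sku() -> str:
    # Spread reads across many products so caching reflects real traffic
    return random.choice(SKUS)
```

Calling `next_account()` from `on_start` and `random_sku()` inside tasks keeps logins and reads distributed, so cache hit rates and row-level contention look more like production.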

Forgetting downstream dependencies

Your API may be healthy, but RDS, Redis, third-party APIs, or S3 event consumers may not be. End-to-end performance testing is essential.

Not correlating with AWS observability

LoadForge shows what users experience. AWS monitoring shows what infrastructure is doing. You need both to diagnose performance issues accurately.

Conclusion

AWS gives you powerful building blocks for scalable applications, but real scalability only becomes clear when you test under realistic load. Whether you’re load testing API Gateway, Lambda, ECS, ALB-backed services, or S3 upload flows, the goal is the same: find bottlenecks early, measure how your system scales, and improve performance before users feel the pain.

With LoadForge, you can build realistic Locust-based AWS load tests, run them with distributed cloud-based infrastructure, analyze results in real time, and integrate performance testing into your CI/CD workflow. If you want to validate your AWS architecture with confidence, now is the perfect time to try LoadForge and start load testing your applications at scale.

Try LoadForge free for 7 days

Set up your first load test in under 2 minutes. No commitment.