
AWS Load Testing Guide with LoadForge


Introduction

AWS applications are built to scale, but “built to scale” does not automatically mean “performant under real traffic.” Whether you’re running APIs behind API Gateway, workloads on Application Load Balancers, Lambda functions, ECS services, or microservices backed by DynamoDB and S3, you need load testing to understand how your AWS architecture behaves under pressure.

A proper AWS load testing strategy helps you answer critical questions:

  • How many concurrent users can your application support?
  • Where do latency spikes begin?
  • Does API Gateway throttle requests under burst traffic?
  • Do Lambda cold starts affect user experience?
  • Can your ECS or EC2-backed services scale quickly enough?
  • Are downstream services like DynamoDB, RDS, or S3 becoming bottlenecks?

In this guide, you’ll learn how to load test AWS applications and APIs using LoadForge and Locust. We’ll cover realistic AWS-specific performance testing scenarios, including authenticated API requests, API Gateway endpoints, pre-signed S3 upload flows, and multi-step user journeys. You’ll also see how LoadForge’s distributed testing, cloud-based infrastructure, real-time reporting, CI/CD integration, and global test locations can help you run meaningful AWS load testing at scale.

Prerequisites

Before you start load testing AWS workloads, make sure you have the following:

  • An AWS application or API endpoint to test
    • Examples:
      • API Gateway REST or HTTP API
      • ALB-backed ECS service
      • EC2-hosted web app
      • Lambda function behind API Gateway
  • A clear understanding of your target environment
    • Staging is strongly recommended over production
  • Test credentials or authentication tokens
    • JWT tokens from Cognito
    • API keys for API Gateway usage plans
    • Session-based auth for web applications
  • Knowledge of expected user behavior
    • Login, browse, search, create, upload, checkout, etc.
  • LoadForge account and project
  • Any relevant rate limits, WAF rules, or IP allowlists configured to permit your test traffic

You should also define your test goals in advance. For example:

  • Validate average response time at 500 concurrent users
  • Measure p95 latency for /api/orders
  • Stress test API Gateway burst handling
  • Verify autoscaling behavior for ECS tasks
  • Identify Lambda cold start impact during traffic spikes
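
Goals like these can also be enforced automatically. Here is a minimal sketch, assuming a p95 budget of 800 ms and a 1% error budget (both placeholder values), of a threshold helper that a Locust `quitting` hook can use to fail a CI run:

```python
# Assumed budgets; tune these to your own test goals
P95_LIMIT_MS = 800
MAX_FAIL_RATIO = 0.01

def meets_goals(fail_ratio: float, p95_ms: float) -> bool:
    # True only if both the error budget and the latency budget hold
    return fail_ratio <= MAX_FAIL_RATIO and p95_ms <= P95_LIMIT_MS

# In a locustfile, wire it to the quitting event so CI fails on a miss:
# from locust import events
# @events.quitting.add_listener
# def _(environment, **kwargs):
#     total = environment.stats.total
#     ok = meets_goals(total.fail_ratio,
#                      total.get_response_time_percentile(0.95))
#     environment.process_exit_code = 0 if ok else 1
```

A non-zero exit code from the Locust process is what most CI systems treat as a failed build, which is how a latency regression blocks a deploy.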

Understanding AWS Under Load

AWS provides a highly elastic platform, but each service scales differently and introduces its own performance characteristics. Understanding these behaviors is essential for effective load testing and performance testing.

API Gateway

API Gateway is excellent for exposing APIs, but it can introduce:

  • Throttling at account or stage level
  • Increased latency under burst traffic
  • Integration overhead with Lambda or HTTP backends
  • Rate limiting issues when API keys or usage plans are involved

When load testing API Gateway, watch for:

  • 429 Too Many Requests
  • Increased p95 and p99 latency
  • Backend timeout errors
  • Integration latency versus overall latency

Lambda

Lambda can scale rapidly, but common bottlenecks include:

  • Cold starts
  • Concurrency limits
  • Slow downstream dependencies
  • Function duration increases under load

In stress testing scenarios, Lambda-backed APIs may appear healthy at low traffic but degrade sharply when concurrency rises.

ECS, EC2, and ALB-backed Services

For containerized or VM-hosted services, common issues include:

  • Insufficient CPU or memory
  • Slow autoscaling response
  • Connection pool exhaustion
  • Database bottlenecks
  • Load balancer target saturation

Load testing helps identify whether your application tier, not just AWS infrastructure, is the limiting factor.

DynamoDB, RDS, and Other Data Stores

Your application may scale horizontally, but the database often becomes the real bottleneck. Under load, you may encounter:

  • DynamoDB throttling
  • RDS connection exhaustion
  • Slow queries
  • Lock contention
  • Cache misses causing backend overload

A realistic AWS load testing plan should simulate full user workflows, not just isolated endpoint hits.

Writing Your First Load Test

Let’s start with a basic Locust script that load tests an AWS API Gateway endpoint serving a product catalog. This type of test is useful for measuring baseline read performance and understanding how your AWS API behaves under concurrent traffic.

Basic AWS API Gateway Load Test

```python
from locust import HttpUser, task, between

class AwsApiUser(HttpUser):
    wait_time = between(1, 3)
    host = "https://api.example.execute-api.us-east-1.amazonaws.com"

    @task(3)
    def list_products(self):
        self.client.get(
            "/prod/products?category=electronics&limit=20",
            headers={
                "Accept": "application/json",
                "x-api-key": "your-api-gateway-api-key"
            },
            name="GET /prod/products"
        )

    @task(1)
    def get_product_detail(self):
        self.client.get(
            "/prod/products/sku-12345",
            headers={
                "Accept": "application/json",
                "x-api-key": "your-api-gateway-api-key"
            },
            name="GET /prod/products/:id"
        )
```

What this script does

This script simulates users browsing an AWS-hosted product API:

  • list_products represents a common catalog page request
  • get_product_detail simulates opening a product detail page
  • The x-api-key header reflects a realistic API Gateway authentication pattern
  • The weighted tasks make listing products more common than viewing an individual item

This is a good starting point for:

  • Baseline load testing
  • Measuring read-heavy API performance
  • Comparing behavior before and after infrastructure changes
  • Validating API Gateway caching or backend optimizations

In LoadForge, you can scale this simple script across multiple generators and regions to test how your AWS API performs for geographically distributed users.

Advanced Load Testing Scenarios

Basic endpoint testing is useful, but real AWS performance testing should model realistic user behavior and backend complexity. Below are several advanced Locust examples tailored to AWS environments.

Scenario 1: Cognito Authentication and Authenticated API Requests

Many AWS applications use Amazon Cognito for authentication. In this example, users log in and then call protected API Gateway endpoints with a JWT access token.

```python
from locust import HttpUser, task, between
import json

class CognitoApiUser(HttpUser):
    wait_time = between(1, 2)
    host = "https://api.example.com"

    def on_start(self):
        cognito_host = "https://cognito-idp.us-east-1.amazonaws.com"
        auth_payload = {
            "AuthFlow": "USER_PASSWORD_AUTH",
            "ClientId": "4j2exampleclientid89abc",
            "AuthParameters": {
                "USERNAME": "loadtest.user@example.com",
                "PASSWORD": "SuperSecurePassword123!"
            }
        }

        # catch_response requires the context-manager form so Locust
        # records the request as a success or failure correctly
        with self.client.post(
            cognito_host,
            data=json.dumps(auth_payload),
            headers={
                "Content-Type": "application/x-amz-json-1.1",
                "X-Amz-Target": "AWSCognitoIdentityProviderService.InitiateAuth"
            },
            name="Cognito InitiateAuth",
            catch_response=True
        ) as response:
            if response.status_code == 200:
                token_data = response.json()
                self.access_token = token_data["AuthenticationResult"]["AccessToken"]
                response.success()
            else:
                response.failure(f"Cognito auth failed: {response.text}")
                self.access_token = None

    @task(2)
    def get_profile(self):
        if not self.access_token:
            return

        self.client.get(
            "/v1/users/me",
            headers={
                "Authorization": f"Bearer {self.access_token}",
                "Accept": "application/json"
            },
            name="GET /v1/users/me"
        )

    @task(3)
    def list_orders(self):
        if not self.access_token:
            return

        self.client.get(
            "/v1/orders?status=active&limit=10",
            headers={
                "Authorization": f"Bearer {self.access_token}",
                "Accept": "application/json"
            },
            name="GET /v1/orders"
        )

    @task(1)
    def create_cart_item(self):
        if not self.access_token:
            return

        payload = {
            "productId": "sku-12345",
            "quantity": 2,
            "currency": "USD"
        }

        self.client.post(
            "/v1/cart/items",
            json=payload,
            headers={
                "Authorization": f"Bearer {self.access_token}",
                "Content-Type": "application/json"
            },
            name="POST /v1/cart/items"
        )
```

Why this matters for AWS load testing

This scenario is much more representative of production traffic because it includes:

  • Cognito authentication overhead
  • JWT-protected API access
  • Read and write operations
  • Stateful user behavior

This type of performance testing is especially useful when your AWS application uses:

  • API Gateway authorizers
  • Lambda authorizers
  • Cognito User Pools
  • ECS or Lambda backends with authenticated routes

It can reveal issues like:

  • Slow token validation
  • Authorizer latency
  • Backend performance degradation for authenticated requests
  • Increased error rates during login spikes

Scenario 2: Load Testing an ALB-Backed ECS Order Workflow

Now let’s simulate a multi-step e-commerce flow running behind an AWS Application Load Balancer and ECS service. This is a realistic way to test not just individual endpoints, but a full business transaction.

```python
from locust import HttpUser, task, between
import random

class EcommerceUser(HttpUser):
    wait_time = between(2, 5)
    host = "https://shop.example.com"

    product_ids = [
        "sku-10001", "sku-10002", "sku-10003", "sku-10004"
    ]

    def on_start(self):
        self.client.post(
            "/auth/login",
            json={
                "email": "loadtest.customer@example.com",
                "password": "CustomerPass123!"
            },
            headers={"Content-Type": "application/json"},
            name="POST /auth/login"
        )

    @task(4)
    def browse_catalog(self):
        category = random.choice(["electronics", "books", "home", "fitness"])
        self.client.get(
            f"/api/catalog?category={category}&page=1&pageSize=24",
            name="GET /api/catalog"
        )

    @task(2)
    def view_product(self):
        product_id = random.choice(self.product_ids)
        self.client.get(
            f"/api/products/{product_id}",
            name="GET /api/products/:id"
        )

    @task(1)
    def add_to_cart_and_checkout(self):
        product_id = random.choice(self.product_ids)

        self.client.post(
            "/api/cart/items",
            json={
                "productId": product_id,
                "quantity": 1
            },
            headers={"Content-Type": "application/json"},
            name="POST /api/cart/items"
        )

        self.client.post(
            "/api/checkout",
            json={
                "shippingAddress": {
                    "name": "Load Test User",
                    "line1": "123 Cloud Street",
                    "city": "Seattle",
                    "state": "WA",
                    "postalCode": "98101",
                    "country": "US"
                },
                "paymentMethod": {
                    "type": "card",
                    "token": "tok_loadforge_test_visa"
                }
            },
            headers={"Content-Type": "application/json"},
            name="POST /api/checkout"
        )
```

What this test reveals

This script is ideal for stress testing AWS-hosted web applications because it exercises:

  • ALB request routing
  • ECS service capacity
  • Session management
  • Cart and checkout logic
  • Database writes and transactional operations

It helps uncover:

  • Slow application response times under mixed traffic
  • Autoscaling delays in ECS services
  • RDS performance bottlenecks
  • Uneven ALB target distribution
  • High-latency write operations

In LoadForge, you can run this test with increasing user counts to identify exactly when your application starts to degrade.

Scenario 3: Pre-Signed S3 Upload Workflow

A common AWS pattern is generating a pre-signed S3 upload URL from an API, then uploading directly to S3. This is worth load testing because both the signing endpoint and the upload path can become bottlenecks.

```python
from locust import HttpUser, task, between
import uuid

class S3UploadUser(HttpUser):
    wait_time = between(1, 3)
    host = "https://api.example.com"

    @task
    def upload_document(self):
        file_name = f"uploads/{uuid.uuid4()}.pdf"

        # catch_response must be used as a context manager so Locust
        # records success or failure correctly
        with self.client.post(
            "/v1/files/presign-upload",
            json={
                "fileName": file_name,
                "contentType": "application/pdf",
                "folder": "customer-documents"
            },
            headers={"Content-Type": "application/json"},
            name="POST /v1/files/presign-upload",
            catch_response=True
        ) as sign_response:
            if sign_response.status_code != 200:
                sign_response.failure(f"Failed to get pre-signed URL: {sign_response.text}")
                return
            upload_url = sign_response.json()["uploadUrl"]
            sign_response.success()

        pdf_content = b"%PDF-1.4 simulated load test pdf content"

        with self.client.put(
            upload_url,
            data=pdf_content,
            headers={"Content-Type": "application/pdf"},
            name="PUT S3 pre-signed upload",
            catch_response=True
        ) as s3_response:
            if s3_response.status_code in (200, 204):
                s3_response.success()
            else:
                s3_response.failure(f"S3 upload failed with status {s3_response.status_code}")
```

Why this is useful

This AWS load testing scenario measures:

  • Performance of your pre-signing API
  • S3 upload behavior under concurrency
  • End-to-end file ingestion flow

It’s especially relevant for applications handling:

  • User document uploads
  • Media ingestion
  • Backup or export workflows
  • Large file processing pipelines

When you run this kind of test, pay attention to:

  • Time spent generating pre-signed URLs
  • Upload latency distribution
  • Failures caused by object size, permissions, or regional issues
  • Downstream processing triggers from S3 events

Analyzing Your Results

Running a load test is only useful if you know how to interpret the results. With AWS applications, you should combine LoadForge metrics with AWS-native observability tools such as CloudWatch, X-Ray, ALB metrics, API Gateway metrics, Lambda Insights, and database monitoring.

Key metrics to watch in LoadForge

Response time percentiles

Average response time can hide problems. Focus on:

  • p50 for typical experience
  • p95 for degraded but common slowness
  • p99 for worst-case experience

For AWS APIs, a rising p95 often indicates:

  • Backend saturation
  • Lambda cold starts
  • Database contention
  • API Gateway integration delays

Requests per second

Track how throughput changes as concurrency increases. If requests per second flatten while user count rises, you may have hit a scaling limit.

Error rate

Look for:

  • 429 responses from API Gateway or DynamoDB-backed services
  • 502/503 from ALB or upstream services
  • 504 gateway timeouts
  • 500 application errors

Per-endpoint breakdown

Identify which routes degrade first. In many AWS systems:

  • Read endpoints remain stable longer
  • Write-heavy endpoints fail first
  • Authentication endpoints become hotspots during login surges

Correlate with AWS metrics

Match LoadForge traffic patterns with AWS telemetry:

  • API Gateway
    • Count
    • Latency
    • IntegrationLatency
    • 4XXError
    • 5XXError
  • Lambda
    • Duration
    • ConcurrentExecutions
    • Throttles
    • Errors
    • Init Duration
  • ALB
    • TargetResponseTime
    • HTTPCode_Target_5XX_Count
    • RequestCount
  • ECS
    • CPUUtilization
    • MemoryUtilization
    • RunningTaskCount
  • RDS
    • CPU
    • Connections
    • Read/Write latency
    • Freeable memory
  • DynamoDB
    • ThrottledRequests
    • ConsumedReadCapacityUnits
    • ConsumedWriteCapacityUnits

LoadForge’s real-time reporting makes it easy to see when latency and error rates change during a test, while AWS metrics help explain why.

Performance Optimization Tips

After load testing your AWS application, you’ll usually identify one or more bottlenecks. Here are common optimization opportunities.

Optimize API Gateway and Lambda

  • Enable caching where appropriate
  • Reduce payload size
  • Minimize Lambda cold starts with provisioned concurrency if needed
  • Reuse SDK clients and database connections inside Lambda execution environments
  • Keep authorizers lightweight

Improve ECS or EC2 service performance

  • Tune autoscaling thresholds
  • Increase task or instance capacity before peak traffic
  • Use connection pooling for databases
  • Add application-level caching with ElastiCache
  • Optimize CPU and memory allocation for containers

Tune your database layer

  • Add indexes for slow queries
  • Use read replicas where appropriate
  • Reduce transaction scope
  • Cache hot data
  • For DynamoDB, review partition key design and provisioned/on-demand settings

Reduce client-visible latency

  • Compress responses
  • Serve static assets via CloudFront
  • Move large uploads directly to S3
  • Avoid chatty APIs by consolidating requests

Test from multiple regions

If your users are global, performance can vary significantly by geography. LoadForge’s global test locations help you measure latency from the regions your users actually care about.

Common Pitfalls to Avoid

AWS load testing can produce misleading results if the test is poorly designed. Avoid these common mistakes.

Testing only one endpoint

A single GET request rarely reflects real production behavior. Model realistic workflows that include authentication, reads, writes, and background processing triggers.

Ignoring rate limits and throttling

API Gateway, Lambda, DynamoDB, and other AWS services can throttle traffic. If you don’t account for this, you may interpret expected throttling as an application failure.

Not warming up the system

Jumping instantly to high concurrency can distort results, especially for Lambda or autoscaling services. Use a ramp-up period to observe how AWS scaling responds.
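
With Locust's CLI this is just a spawn-rate choice; a sketch with illustrative numbers:

```shell
# Ramp to 300 users at 5 users/second (~1 minute ramp),
# then hold for the rest of a 15 minute run
locust -f locustfile.py --headless \
  --users 300 --spawn-rate 5 --run-time 15m \
  --host https://api.example.com
```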

Testing production without safeguards

Stress testing production can affect real users and trigger autoscaling costs. Prefer staging environments that mirror production as closely as possible.

Using unrealistic test data

If every virtual user logs in with the same account or requests the same resource, your results may be skewed by caching or lock contention. Use varied data where possible.
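
A simple approach is a shared pool of pre-provisioned accounts plus a wide spread of product IDs. A sketch (the account and SKU formats are assumptions) whose helpers a Locust task can call:

```python
import itertools
import random

# Assumed pool of pre-provisioned test accounts and product SKUs
ACCOUNTS = [f"loadtest.user{i}@example.com" for i in range(1, 101)]
SKUS = [f"sku-{10000 + i}" for i in range(500)]

_accounts = itertools.cycle(ACCOUNTS)

def next_account() -> str:
    # Hand each new virtual user its own account instead of one shared login
    return next(_accounts)

def random_sku() -> str:
    # Spread reads across many products so caching reflects real traffic
    return random.choice(SKUS)
```

Calling `next_account()` from `on_start` and `random_sku()` inside tasks keeps logins and reads distributed, so cache hit rates and row-level contention look more like production.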

Forgetting downstream dependencies

Your API may be healthy, but RDS, Redis, third-party APIs, or S3 event consumers may not be. End-to-end performance testing is essential.

Not correlating with AWS observability

LoadForge shows what users experience. AWS monitoring shows what infrastructure is doing. You need both to diagnose performance issues accurately.

Conclusion

AWS gives you powerful building blocks for scalable applications, but real scalability only becomes clear when you test under realistic load. Whether you’re load testing API Gateway, Lambda, ECS, ALB-backed services, or S3 upload flows, the goal is the same: find bottlenecks early, measure how your system scales, and improve performance before users feel the pain.

With LoadForge, you can build realistic Locust-based AWS load tests, run them with distributed cloud-based infrastructure, analyze results in real time, and integrate performance testing into your CI/CD workflow. If you want to validate your AWS architecture with confidence, now is the perfect time to try LoadForge and start load testing your applications at scale.

Try LoadForge free for 7 days

Set up your first load test in under 2 minutes. No commitment.