Fly.io Load Testing Guide with LoadForge

Introduction

Running applications on Fly.io gives developers a powerful edge-first deployment model, global regions, and the ability to place workloads close to users. That architecture can dramatically improve responsiveness, but it also changes how you should approach load testing. When your app is distributed across regions, traditional single-origin performance testing is no longer enough. You need to understand global latency, concurrency behavior, cache effectiveness, cold-start patterns, and how your Fly.io app performs when traffic spikes hit multiple edge locations at once.

This Fly.io load testing guide with LoadForge shows you how to measure the real-world performance of applications deployed on Fly.io. We’ll cover how Fly.io applications behave under load, how to write Locust-based test scripts for realistic scenarios, and how to analyze the results to identify bottlenecks before they affect users.

Because LoadForge is built on Locust, you can create flexible Python-based load testing and stress testing scenarios while taking advantage of cloud-based infrastructure, distributed testing, real-time reporting, CI/CD integration, and global test locations. That makes it especially useful for testing Fly.io applications, where geography and edge routing are part of the performance story.

Prerequisites

Before you start load testing your Fly.io application, make sure you have:

  • A Fly.io application deployed and reachable over HTTPS
  • The application hostname, such as:
    • https://myapp.fly.dev
    • or a custom domain mapped to Fly.io
  • A basic understanding of your application’s key user flows
  • Any authentication credentials or API tokens needed for testing
  • Test data that can safely be used in a staging or production-like environment
  • A LoadForge account to run distributed load tests from multiple regions

It also helps to know:

  • Which Fly.io regions your app is deployed in
  • Whether your app uses:
    • Fly Machines
    • autoscaling
    • edge caching
    • Postgres or external databases
    • WebSocket or API-heavy traffic
  • Any rate limiting, WAF, or auth middleware that may affect test users

For the examples below, we’ll assume a realistic Fly.io-hosted SaaS API with endpoints like:

  • GET /health
  • POST /api/v1/auth/login
  • GET /api/v1/projects
  • GET /api/v1/projects/{id}/metrics
  • POST /api/v1/uploads/presign
  • POST /api/v1/events/ingest

These patterns are common for applications deployed on Fly.io, especially globally distributed APIs and edge-facing services.

Understanding Fly.io Under Load

Fly.io is designed to run apps close to users, but that doesn’t automatically guarantee good performance under heavy traffic. Load testing Fly.io applications requires thinking about several layers of behavior.

Regional routing and latency

A Fly.io app may route users to the nearest healthy region, but latency still depends on:

  • where the request originates
  • whether the app instance is warm
  • how traffic is balanced across regions
  • whether backend services are centralized elsewhere

If your app runs in multiple Fly.io regions but your database lives in only one, users may see fast edge connection times but slower full response times for database-heavy endpoints.

Concurrency and instance saturation

Fly.io applications can handle concurrency differently depending on:

  • runtime and framework
  • CPU and memory allocation
  • connection pooling
  • autoscaling thresholds
  • per-instance request limits

A lightweight health endpoint may scale well, while authenticated API requests with database queries may degrade quickly once instance concurrency is saturated.

Cold starts and machine startup behavior

If you use Fly Machines or scaled-to-zero patterns, sudden bursts of traffic may trigger startup delays. A load test can reveal:

  • how long new instances take to become responsive
  • whether cold starts affect only certain endpoints
  • how latency changes during rapid ramp-up
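To probe cold-start behavior deliberately, drive the test with a staged ramp instead of a flat user count. The stage table below is a plain-Python sketch of the logic you would put into a Locust `LoadTestShape` subclass; the durations and user counts are illustrative, not recommendations:

```python
# Illustrative staged ramp: each tuple is (end_time_s, target_users, spawn_rate).
# The sharp jump between stages is what surfaces cold-start latency.
STAGES = [
    (60, 20, 5),     # warm-up: gentle baseline traffic
    (120, 200, 50),  # burst: force new Machines to start
    (300, 200, 50),  # hold: let autoscaling settle
]

def tick(elapsed_s):
    """Return (user_count, spawn_rate) for the current moment, or None to stop.

    In a real Locust script this body would live in LoadTestShape.tick(),
    with elapsed_s taken from self.get_run_time().
    """
    for end_time, users, rate in STAGES:
        if elapsed_s < end_time:
            return users, rate
    return None  # past the last stage: end the test
```

If p95 latency spikes right after the 60-second mark and then recovers, you are most likely watching new instances start, not steady-state performance.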

Edge performance versus origin performance

Some Fly.io apps benefit from edge caching or static asset acceleration, while dynamic API endpoints still depend on application logic and data access. This means your performance testing strategy should separate:

  • cacheable requests
  • authenticated API traffic
  • write-heavy workloads
  • background event ingestion

Common bottlenecks in Fly.io deployments

When load testing Fly.io applications, the most common bottlenecks include:

  • database latency from a single-region backend
  • insufficient connection pooling
  • CPU exhaustion on small VM sizes
  • slow startup time during scaling events
  • application-level locks or synchronous processing
  • rate limiting misconfiguration
  • file upload flows that depend on object storage latency

The goal of load testing is not just to find the maximum requests per second. It’s to identify where the Fly.io architecture performs well and where edge distribution stops helping because another dependency becomes the bottleneck.

Writing Your First Load Test

Let’s start with a simple Fly.io load test that validates baseline responsiveness. This is useful for smoke testing, health checks, and measuring latency from different LoadForge regions.

Basic health and homepage test

python
from locust import HttpUser, task, between
 
class FlyAppSmokeUser(HttpUser):
    wait_time = between(1, 3)
 
    @task(3)
    def health_check(self):
        self.client.get("/health", name="GET /health")
 
    @task(1)
    def homepage(self):
        self.client.get("/", name="GET /")

How this test works

This script simulates a lightweight user checking two common endpoints:

  • GET /health for service health
  • GET / for the public landing page or root route

In LoadForge, set the host to your Fly.io app, for example:

bash
https://myapp.fly.dev

This basic load test is useful for:

  • verifying the app is reachable from multiple regions
  • measuring baseline latency
  • checking whether Fly.io edge routing is working as expected
  • spotting cold-start or startup delays during ramp-up

What to look for

When you run this in LoadForge, pay attention to:

  • median and p95 response times
  • failures during rapid user ramp-up
  • differences between test regions
  • whether the health endpoint remains stable even as the homepage slows down

If GET /health stays fast but GET / degrades, the issue is likely application rendering, upstream dependencies, or dynamic content generation rather than Fly.io network routing itself.

Advanced Load Testing Scenarios

Once the basics are covered, you should test realistic user behavior. For Fly.io apps, that often means authenticated APIs, write-heavy traffic, and geographically sensitive workloads.

Scenario 1: Authenticated API workflow

This example simulates a user logging in, retrieving projects, and fetching project metrics. This is a common SaaS pattern and a strong test of app logic, session handling, and database-backed reads.

python
from locust import HttpUser, task, between
import random
 
class FlyAuthenticatedApiUser(HttpUser):
    wait_time = between(1, 2)
 
    def on_start(self):
        response = self.client.post(
            "/api/v1/auth/login",
            json={
                "email": "loadtest.user@example.com",
                "password": "SuperSecurePass123!"
            },
            name="POST /api/v1/auth/login"
        )
 
        if response.status_code == 200:
            data = response.json()
            self.token = data.get("access_token")
            self.headers = {
                "Authorization": f"Bearer {self.token}",
                "Content-Type": "application/json"
            }
        else:
            self.token = None
            self.headers = {}
 
    @task(3)
    def list_projects(self):
        self.client.get(
            "/api/v1/projects",
            headers=self.headers,
            name="GET /api/v1/projects"
        )
 
    @task(2)
    def project_metrics(self):
        project_id = random.choice([101, 102, 103, 104])
        self.client.get(
            f"/api/v1/projects/{project_id}/metrics?range=24h&granularity=5m",
            headers=self.headers,
            name="GET /api/v1/projects/:id/metrics"
        )
 
    @task(1)
    def user_profile(self):
        self.client.get(
            "/api/v1/me",
            headers=self.headers,
            name="GET /api/v1/me"
        )

Why this matters for Fly.io

This test reveals how your Fly.io app handles:

  • JWT or bearer-token authentication
  • database-backed list endpoints
  • repeated metrics queries
  • session-independent API traffic across many concurrent users

It’s especially useful if your app runs globally on Fly.io but reads from a centralized database. You may find that auth succeeds quickly at the edge, but metrics endpoints slow down because data must travel to another region.

Scenario 2: Event ingestion and edge write traffic

Many Fly.io deployments act as globally distributed ingestion endpoints for telemetry, webhooks, or analytics. This scenario simulates clients sending events to an ingestion API.

python
from locust import HttpUser, task, between
import random
import uuid
from datetime import datetime
 
class FlyEventIngestionUser(HttpUser):
    wait_time = between(0.2, 1.0)
 
    def on_start(self):
        self.headers = {
            "Authorization": "Bearer lf_ingest_test_token_abc123",
            "Content-Type": "application/json",
            "User-Agent": "LoadForge-Fly-Ingest-Test/1.0"
        }
 
    @task
    def ingest_event(self):
        payload = {
            "event_id": str(uuid.uuid4()),
            "tenant_id": "tenant_demo_001",
            "source": "web",
            "event_type": random.choice([
                "page_view",
                "signup_started",
                "checkout_completed",
                "api_error"
            ]),
            "timestamp": datetime.utcnow().isoformat() + "Z",
            "region_hint": random.choice(["iad", "ord", "lhr", "fra", "sin"]),
            "properties": {
                "path": random.choice([
                    "/pricing",
                    "/signup",
                    "/dashboard",
                    "/api/v1/projects"
                ]),
                "response_time_ms": random.randint(45, 1200),
                "plan": random.choice(["free", "pro", "enterprise"])
            }
        }
 
        self.client.post(
            "/api/v1/events/ingest",
            json=payload,
            headers=self.headers,
            name="POST /api/v1/events/ingest"
        )

What this test uncovers

This kind of stress testing is ideal for Fly.io because it measures:

  • edge write performance from global locations
  • request validation overhead
  • queueing or async processing behavior
  • regional spikes and burst handling
  • CPU and memory pressure under high event throughput

If ingestion latency rises sharply under moderate concurrency, the bottleneck may be synchronous writes, limited worker capacity, or downstream queue/database contention rather than Fly.io itself.

Scenario 3: File upload preparation and signed URL flow

A very common pattern on Fly.io is using the app as an API gateway for upload flows. The app generates a signed upload URL, and the client then uploads to object storage. You should load test the app-controlled part of that workflow.

python
from locust import HttpUser, task, between
import uuid
import random
 
class FlyUploadWorkflowUser(HttpUser):
    wait_time = between(1, 3)
 
    def on_start(self):
        login_response = self.client.post(
            "/api/v1/auth/login",
            json={
                "email": "uploader@example.com",
                "password": "UploadFlowPass456!"
            },
            name="POST /api/v1/auth/login"
        )
 
        token = None
        if login_response.status_code == 200:
            token = login_response.json().get("access_token")
        self.headers = {
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json"
        }
 
    @task(2)
    def create_presigned_upload(self):
        filename = f"report-{uuid.uuid4()}.csv"
        payload = {
            "filename": filename,
            "content_type": "text/csv",
            "size_bytes": random.randint(50_000, 5_000_000),
            "folder": "customer-exports"
        }
 
        with self.client.post(
            "/api/v1/uploads/presign",
            json=payload,
            headers=self.headers,
            name="POST /api/v1/uploads/presign",
            catch_response=True
        ) as response:
            if response.status_code != 200:
                response.failure(f"Unexpected status code: {response.status_code}")
                return
 
            data = response.json()
            if "upload_url" not in data or "file_id" not in data:
                response.failure("Missing upload_url or file_id in response")
 
    @task(1)
    def list_recent_uploads(self):
        self.client.get(
            "/api/v1/uploads?limit=20&status=pending,complete",
            headers=self.headers,
            name="GET /api/v1/uploads"
        )

Why this is realistic

This scenario reflects how many modern Fly.io apps actually work:

  • the application handles auth and upload authorization
  • object storage receives the actual file bytes
  • the app stores metadata and status in a database

This load test focuses on the Fly.io-hosted application layer, where performance issues often show up in:

  • auth checks
  • signed URL generation
  • metadata persistence
  • upload listing queries

If these endpoints are slow, users will perceive the upload flow as sluggish even if object storage is fast.

Analyzing Your Results

After running your Fly.io load test in LoadForge, the next step is understanding what the metrics actually mean.

Key metrics to monitor

For Fly.io performance testing, focus on:

  • response time percentiles: p50, p95, p99
  • requests per second
  • error rate
  • timeouts
  • throughput consistency during ramp-up
  • latency differences by region
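If you want to sanity-check percentile numbers against your own application logs, the math is straightforward. This hypothetical helper computes p50/p95/p99 from raw latency samples using only the standard library (the interpolation method here is an assumption; LoadForge's exact aggregation may differ):

```python
import statistics

def latency_percentiles(samples_ms):
    """Compute p50/p95/p99 from a list of latency samples in milliseconds."""
    # quantiles(n=100) returns the 99 cut points between percentile buckets;
    # index i holds the (i + 1)-th percentile.
    qs = statistics.quantiles(samples_ms, n=100, method="inclusive")
    return {"p50": qs[49], "p95": qs[94], "p99": qs[98]}
```

A long tail shows up here as a wide gap between p50 and p99 even when the mean looks healthy.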

Interpreting latency patterns

A few common result patterns are especially important for Fly.io apps:

Fast median, slow p95 or p99

This usually means:

  • some requests hit warm instances while others hit cold or overloaded ones
  • certain regions are slower than others
  • backend dependencies are inconsistent

Errors only during ramp-up

This often suggests:

  • autoscaling lag
  • startup delays
  • insufficient instance capacity
  • connection pool exhaustion

Read endpoints are fast, write endpoints degrade

This may indicate:

  • database write contention
  • queue bottlenecks
  • synchronous processing in request handlers
  • regional replication delays

Regional differences

If LoadForge is running traffic from multiple global test locations, compare:

  • North America vs Europe vs Asia latency
  • authenticated vs anonymous endpoint behavior
  • static vs dynamic response times

This is one of the biggest advantages of LoadForge for Fly.io load testing. Because Fly.io is globally distributed, you need distributed testing to validate the architecture. A single-region test won’t tell you whether your edge deployment is truly helping global users.

Correlate with Fly.io observability

As you review LoadForge’s real-time reporting, compare the results with Fly.io metrics and logs:

  • instance CPU and memory usage
  • request concurrency
  • app restarts
  • scaling events
  • backend database metrics
  • per-region instance distribution

This helps you determine whether the bottleneck is:

  • edge routing
  • application code
  • VM sizing
  • autoscaling configuration
  • downstream services

Performance Optimization Tips

Once your load testing reveals bottlenecks, these are the most common ways to improve Fly.io performance.

Place stateful dependencies closer to users

If your app is globally distributed but your database is in one region, dynamic endpoints may still be slow. Consider:

  • regional read replicas
  • caching hot data
  • moving latency-sensitive services closer to users

Tune connection pooling

Many performance issues on Fly.io come from database or upstream connection bottlenecks. Make sure your app has:

  • appropriate DB pool sizes
  • keep-alive enabled where relevant
  • async or nonblocking request handling if supported by your stack
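To make the failure mode concrete, here is a minimal, framework-agnostic sketch of a bounded connection pool. When the pool is exhausted, callers block and eventually time out, which is exactly the stall pattern a load test exposes. The `factory` and sizes are placeholders; in production use your driver's or ORM's real pool:

```python
import queue

class BoundedPool:
    """Toy bounded pool: shows why undersized pools stall under load."""

    def __init__(self, factory, size=10):
        self._pool = queue.Queue(maxsize=size)
        for _ in range(size):
            self._pool.put(factory())  # pre-open all connections

    def acquire(self, timeout=5.0):
        # Blocks while every connection is checked out; raises queue.Empty
        # on timeout -- the same stall a saturated DB pool causes in prod.
        return self._pool.get(timeout=timeout)

    def release(self, conn):
        self._pool.put(conn)
```

If p95 latency under load hovers near your pool-acquire timeout, the pool, not the database itself, is often the bottleneck.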

Reduce cold-start impact

If load tests show ramp-up latency spikes:

  • keep a minimum number of instances warm
  • reduce startup time
  • preload dependencies at boot
  • avoid expensive initialization on first request

Separate ingestion from processing

For write-heavy APIs like event ingestion:

  • accept requests quickly
  • enqueue work asynchronously
  • process downstream jobs outside the request path
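A minimal sketch of that pattern using only the standard library (in a real Fly.io app the queue would typically be Redis, NATS, or a database-backed job table, and the handler would live in your web framework):

```python
import queue
import threading

events = queue.Queue(maxsize=10_000)  # bounded: back-pressure instead of OOM
processed = []

def handle_ingest(payload):
    """Request path: validate, enqueue, acknowledge -- no downstream I/O."""
    if "event_id" not in payload:
        return {"status": "rejected"}, 422
    events.put_nowait(payload)        # raises queue.Full when overloaded
    return {"status": "accepted"}, 202

def worker():
    """Runs outside the request path and does the slow downstream work."""
    while True:
        item = events.get()
        if item is None:              # sentinel: shut down
            break
        processed.append(item)        # stand-in for a DB or queue write
        events.task_done()
```

Under this design, ingestion latency measured at the edge reflects validation and enqueue cost only, so load test results stay stable even when downstream processing slows.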

This improves perceived edge performance and makes stress testing results much more stable.

Cache aggressively where safe

For endpoints like:

  • project summaries
  • metrics dashboards
  • public pages
  • configuration lookups

Use caching to reduce repeated database reads and improve Fly.io edge responsiveness.
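Even a small in-process TTL cache can eliminate most repeated reads. A minimal read-through sketch (per-instance only; on Fly.io each Machine keeps its own copy, so use short TTLs or a shared cache like Redis where staleness matters):

```python
import time

class TTLCache:
    """Tiny read-through cache: serves a cached value until it expires."""

    def __init__(self, ttl_seconds=30.0):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (value, stored_at)

    def get(self, key, loader):
        entry = self._store.get(key)
        now = time.monotonic()
        if entry is not None and now - entry[1] < self.ttl:
            return entry[0]       # fresh hit: skip the database
        value = loader()          # miss or expired: do the real read
        self._store[key] = (value, now)
        return value
```

Re-running the same load test before and after adding a cache like this is a quick way to quantify how much of your latency was repeated database reads.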

Test from multiple regions regularly

Because Fly.io is built for geographic distribution, performance optimization should always include global validation. LoadForge’s cloud-based infrastructure and global test locations make it easy to repeat the same test from different parts of the world and compare results over time.

Common Pitfalls to Avoid

Load testing Fly.io applications is straightforward, but there are several mistakes that can lead to misleading results.

Testing only one region

This is the biggest mistake. A Fly.io app may perform well from Virginia and poorly from Singapore. Always include distributed load testing if your users are global.

Ignoring backend geography

Even if Fly.io routes traffic to the nearest region, your app may still depend on:

  • a single-region Postgres instance
  • centralized Redis
  • third-party APIs hosted elsewhere

If you ignore those dependencies, you may misinterpret edge performance.

Using unrealistic traffic patterns

A health-check-only test won’t tell you much about real application behavior. Include:

  • authentication
  • database-backed reads
  • writes
  • upload flows
  • burst traffic

Reusing one auth token for all users

That can hide auth bottlenecks and produce unrealistic caching behavior. In most cases, each Locust user should log in independently or use a realistic token pool.
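If logging every virtual user in is impractical (for example, the auth provider rate-limits the login endpoint), a pre-provisioned token pool keeps traffic realistic. A sketch, assuming you have issued a batch of tokens for dedicated test accounts ahead of time (the token format here is made up):

```python
import itertools
import threading

# Hypothetical pre-issued tokens for dedicated load-test accounts.
TOKENS = [f"lf_test_token_{i:03d}" for i in range(50)]

_token_iter = itertools.cycle(TOKENS)
_lock = threading.Lock()  # cheap insurance if users spawn concurrently

def next_auth_header():
    """Hand each new virtual user the next token from the pool.

    Call this from on_start() instead of sharing one global token.
    """
    with _lock:
        token = next(_token_iter)
    return {"Authorization": f"Bearer {token}"}
```

With 50 tokens cycling across users, per-user auth state and caching behave much closer to production than a single shared credential would.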

Forgetting warm-up effects

Fly.io apps may behave differently during the first few minutes of a test. Watch for:

  • startup delays
  • autoscaling transitions
  • connection pool initialization
  • cache warming

Load testing production without safeguards

If you test a live Fly.io production app:

  • use safe test accounts
  • avoid destructive endpoints
  • coordinate with your team
  • monitor scaling costs
  • set clear stop conditions

Focusing only on average response time

Average latency can look acceptable while p95 and p99 are terrible. For real user experience, percentiles matter far more than averages.

Conclusion

Fly.io gives developers a compelling platform for globally distributed applications, but edge deployment only delivers value if your app can handle real concurrency, regional traffic patterns, and backend dependency pressure. With the right load testing strategy, you can measure global latency, uncover scaling bottlenecks, validate edge performance, and improve reliability before users feel the impact.

Using LoadForge, you can build realistic Locust-based Fly.io load tests, run them from global locations, and analyze results with real-time reporting. Whether you’re testing a simple public app, an authenticated API, an event ingestion service, or an upload workflow, LoadForge makes it easier to understand how your Fly.io deployment performs under real-world load.

If you’re ready to validate your Fly.io architecture with practical performance testing and stress testing, try LoadForge and start building distributed tests that reflect how your users actually experience your application.

Try LoadForge free for 7 days

Set up your first load test in under 2 minutes. No commitment.