
Introduction
Datadog load testing integration gives engineering teams a much clearer view of what happens during a performance test. Instead of looking only at response times and request counts, you can correlate LoadForge test activity with Datadog infrastructure metrics, APM traces, logs, dashboards, monitors, and deployment events. That makes it much easier to answer questions like:
- Did CPU saturation cause the latency spike?
- Did a recent deployment increase error rates under load?
- Which service or endpoint became the bottleneck during stress testing?
- Did autoscaling react quickly enough when traffic increased?
For modern CI/CD and DevOps teams, this kind of visibility is essential. A load test without observability data tells you that something is slow. A Datadog-integrated load test helps tell you why.
In this guide, you’ll learn how to use LoadForge for load testing and performance testing while integrating with Datadog so you can correlate synthetic traffic with real application and infrastructure telemetry. We’ll cover basic and advanced Locust-based test scripts, realistic Datadog API usage patterns, and practical advice for analyzing results in both platforms.
Prerequisites
Before you begin, make sure you have the following:
- A LoadForge account
- A Datadog account
- A Datadog API key
- A Datadog application key
- A target application or API to test
- Datadog agents or integrations already sending metrics from your infrastructure, containers, or services
- Optional but recommended:
  - Datadog APM enabled
  - Datadog logs enabled
  - Datadog dashboards and monitors configured
  - CI/CD pipeline integration for automated performance testing
You should also know:
- The base URL of the system you want to load test
- Which endpoints represent realistic user traffic
- How your application authenticates users
- Which Datadog tags you want to use for correlation, such as:
- env:staging
- service:checkout-api
- team:platform
- test_run:release-2026-04-06
A good practice is to assign every load test a unique run identifier and send that identifier to both your application and Datadog. This makes it much easier to filter dashboards, logs, and traces during analysis.
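As a sketch of that convention (the helper names and default tag values here are illustrative, not a LoadForge requirement), the run identifier and its tag set can be defined once and reused for every event, metric, and request header:

```python
import os
import uuid

def get_test_run_id() -> str:
    """Reuse TEST_RUN_ID from the environment (e.g. exported by CI), or mint one."""
    return os.getenv("TEST_RUN_ID") or f"loadforge-{uuid.uuid4().hex[:12]}"

def run_tags(run_id: str, env: str = "staging", service: str = "checkout-api") -> list:
    """One tag list, reused for Datadog events, custom metrics, and log filters."""
    return [f"test_run:{run_id}", f"env:{env}", f"service:{service}", "source:loadforge"]

RUN_ID = get_test_run_id()
print(RUN_ID, run_tags(RUN_ID))
```

Because the same list feeds everything, a single `test_run:<id>` filter then works across dashboards, logs, and traces.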
Understanding Datadog Under Load
Datadog itself is not the primary system under test in most scenarios. Instead, Datadog acts as the observability layer that helps you interpret application behavior during load testing, stress testing, and performance testing. That said, there are a few common patterns where Datadog APIs are directly involved:
- Creating events to mark test start and finish
- Querying dashboards or metrics during or after a test
- Validating monitor state changes
- Sending custom metrics or annotations for test correlation
When your application is under load, Datadog helps surface bottlenecks such as:
Infrastructure Saturation
Look for:
- High CPU usage
- Memory pressure
- Disk I/O bottlenecks
- Network saturation
- Container throttling
- Pod restarts in Kubernetes
Application-Level Bottlenecks
Look for:
- Increased request latency by endpoint
- Rising error rates
- Slow database queries
- Queue backlog growth
- Connection pool exhaustion
- External dependency failures
Scaling and Deployment Issues
Datadog is especially useful in CI/CD and DevOps workflows because it lets you correlate performance changes with:
- New releases
- Autoscaling events
- Configuration changes
- Feature flag rollouts
- Background job spikes
In practice, LoadForge generates the traffic, while Datadog provides the telemetry context. LoadForge’s distributed testing and global test locations help simulate realistic traffic patterns, while Datadog shows how your services and infrastructure respond in real time.
Writing Your First Load Test
Let’s start with a practical baseline. This first script simulates users browsing an API-backed web application while also creating Datadog events at test start and stop. This is a realistic way to mark the timeline of a load test in Datadog so you can correlate spikes in metrics with the exact test window.
Basic application load test with Datadog event markers
```python
from locust import HttpUser, task, between, events
import os
import uuid
import requests

DATADOG_API_KEY = os.getenv("DATADOG_API_KEY")
DATADOG_APP_KEY = os.getenv("DATADOG_APP_KEY")
DATADOG_SITE = os.getenv("DATADOG_SITE", "datadoghq.com")
TEST_RUN_ID = os.getenv("TEST_RUN_ID", f"loadforge-{uuid.uuid4()}")


def send_datadog_event(title, text, tags=None, alert_type="info"):
    if not DATADOG_API_KEY:
        return
    url = f"https://api.{DATADOG_SITE}/api/v1/events"
    headers = {
        "Content-Type": "application/json",
        "DD-API-KEY": DATADOG_API_KEY,
    }
    # The application key is optional for posting events; only send it if set
    if DATADOG_APP_KEY:
        headers["DD-APPLICATION-KEY"] = DATADOG_APP_KEY
    payload = {
        "title": title,
        "text": text,
        "tags": tags or [],
        "alert_type": alert_type,
        "source_type_name": "loadforge"
    }
    requests.post(url, json=payload, headers=headers, timeout=10)


@events.test_start.add_listener
def on_test_start(environment, **kwargs):
    send_datadog_event(
        title="LoadForge test started",
        text=f"Load test started for host={environment.host}\nRun ID: {TEST_RUN_ID}",
        tags=[
            "source:loadforge",
            f"test_run:{TEST_RUN_ID}",
            "env:staging",
            "service:web-storefront"
        ]
    )


@events.test_stop.add_listener
def on_test_stop(environment, **kwargs):
    send_datadog_event(
        title="LoadForge test completed",
        text=f"Load test completed for host={environment.host}\nRun ID: {TEST_RUN_ID}",
        tags=[
            "source:loadforge",
            f"test_run:{TEST_RUN_ID}",
            "env:staging",
            "service:web-storefront"
        ]
    )


class StorefrontUser(HttpUser):
    wait_time = between(1, 3)

    def on_start(self):
        self.headers = {
            "User-Agent": "LoadForge-Locust/Datadog-Integration",
            "X-Test-Run-Id": TEST_RUN_ID
        }

    @task(3)
    def homepage(self):
        self.client.get("/", headers=self.headers, name="GET /")

    @task(2)
    def product_listing(self):
        self.client.get("/api/products?category=shoes&limit=24", headers=self.headers, name="GET /api/products")

    @task(1)
    def product_detail(self):
        self.client.get("/api/products/sku_12345", headers=self.headers, name="GET /api/products/:id")
```

What this script does
This script covers a simple but effective pattern:
- Simulates realistic user browsing behavior
- Adds a unique X-Test-Run-Id header to requests
- Sends Datadog events at test start and stop
- Tags those events for filtering in dashboards and event streams
This approach works well because you can:
- Filter logs by X-Test-Run-Id
- Search APM traces for the same test run
- Overlay Datadog events on dashboards
- Compare LoadForge response times with Datadog host and service metrics
For many teams, this is the fastest way to connect load testing and observability.
Advanced Load Testing Scenarios
Once the basics are working, you can build more advanced Datadog-aware load tests. Below are several realistic scenarios for CI/CD and DevOps teams.
Scenario 1: Authenticated API testing with deployment correlation
This script simulates authenticated users logging in, browsing account data, and creating orders. It also posts a Datadog event that includes deployment metadata, which is useful in CI/CD pipelines.
```python
from locust import HttpUser, task, between, events
import os
import uuid
import requests

DATADOG_API_KEY = os.getenv("DATADOG_API_KEY")
DATADOG_APP_KEY = os.getenv("DATADOG_APP_KEY")
DATADOG_SITE = os.getenv("DATADOG_SITE", "datadoghq.com")
TEST_RUN_ID = os.getenv("TEST_RUN_ID", f"release-test-{uuid.uuid4()}")
GIT_SHA = os.getenv("GIT_SHA", "unknown")
DEPLOY_ENV = os.getenv("DEPLOY_ENV", "staging")


def send_datadog_event(title, text, tags=None, alert_type="info"):
    if not DATADOG_API_KEY:
        return
    url = f"https://api.{DATADOG_SITE}/api/v1/events"
    headers = {
        "Content-Type": "application/json",
        "DD-API-KEY": DATADOG_API_KEY,
    }
    # The application key is optional for posting events; only send it if set
    if DATADOG_APP_KEY:
        headers["DD-APPLICATION-KEY"] = DATADOG_APP_KEY
    payload = {
        "title": title,
        "text": text,
        "tags": tags or [],
        "alert_type": alert_type,
        "source_type_name": "loadforge"
    }
    requests.post(url, json=payload, headers=headers, timeout=10)


@events.test_start.add_listener
def on_test_start(environment, **kwargs):
    send_datadog_event(
        title="CI performance validation started",
        text=f"Performance validation started\nRun ID: {TEST_RUN_ID}\nGit SHA: {GIT_SHA}\nEnv: {DEPLOY_ENV}",
        tags=[
            "source:loadforge",
            f"test_run:{TEST_RUN_ID}",
            f"env:{DEPLOY_ENV}",
            f"git_sha:{GIT_SHA}",
            "service:orders-api"
        ]
    )


class AuthenticatedApiUser(HttpUser):
    wait_time = between(1, 2)

    def on_start(self):
        login_payload = {
            "email": "perf-test-user@example.com",
            "password": "SuperSecurePassword123!"
        }
        response = self.client.post(
            "/api/v1/auth/login",
            json=login_payload,
            name="POST /api/v1/auth/login"
        )
        # Fail fast if login fails so this user doesn't run unauthenticated
        response.raise_for_status()
        token = response.json()["access_token"]
        self.headers = {
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "X-Test-Run-Id": TEST_RUN_ID,
            "X-Release-Version": GIT_SHA
        }

    @task(3)
    def get_account(self):
        self.client.get("/api/v1/account/profile", headers=self.headers, name="GET /api/v1/account/profile")

    @task(2)
    def get_orders(self):
        self.client.get("/api/v1/orders?limit=10&status=completed", headers=self.headers, name="GET /api/v1/orders")

    @task(1)
    def create_order(self):
        payload = {
            "customer_id": "cust_100245",
            "currency": "USD",
            "items": [
                {"sku": "sku_12345", "quantity": 1, "unit_price": 79.99},
                {"sku": "sku_98765", "quantity": 2, "unit_price": 24.50}
            ],
            "shipping_address": {
                "line1": "123 Market St",
                "city": "San Francisco",
                "state": "CA",
                "postal_code": "94105",
                "country": "US"
            },
            "payment_method_id": "pm_test_visa_4242"
        }
        self.client.post("/api/v1/orders", json=payload, headers=self.headers, name="POST /api/v1/orders")
```

This is a strong pattern for release validation in CI/CD because it lets you compare performance by deployment version. In Datadog, you can filter by git_sha or test_run to isolate a specific test.
Scenario 2: Querying Datadog monitors during a stress test
Sometimes you want your test to validate not only application behavior, but also operational readiness. For example, during a stress test you may want to confirm that Datadog monitors enter alert state when latency or error thresholds are breached.
This script simulates traffic against a checkout service while periodically checking the state of a Datadog monitor.
```python
from locust import HttpUser, task, between
import os
import requests

DATADOG_API_KEY = os.getenv("DATADOG_API_KEY")
DATADOG_APP_KEY = os.getenv("DATADOG_APP_KEY")
DATADOG_SITE = os.getenv("DATADOG_SITE", "datadoghq.com")
TEST_RUN_ID = os.getenv("TEST_RUN_ID", "monitor-validation-run")
MONITOR_ID = os.getenv("DATADOG_MONITOR_ID", "12345678")


class CheckoutUser(HttpUser):
    wait_time = between(1, 2)

    def on_start(self):
        self.headers = {
            "Content-Type": "application/json",
            "X-Test-Run-Id": TEST_RUN_ID
        }

    @task(5)
    def cart_summary(self):
        self.client.get("/api/v1/cart/summary?cart_id=cart_778899", headers=self.headers, name="GET /api/v1/cart/summary")

    @task(3)
    def shipping_rates(self):
        payload = {
            "cart_id": "cart_778899",
            "destination": {
                "postal_code": "10001",
                "country": "US"
            }
        }
        self.client.post("/api/v1/shipping/rates", json=payload, headers=self.headers, name="POST /api/v1/shipping/rates")

    @task(2)
    def checkout(self):
        payload = {
            "cart_id": "cart_778899",
            "customer_id": "cust_778899",
            "payment_token": "tok_visa_4242",
            "billing_zip": "10001"
        }
        self.client.post("/api/v1/checkout", json=payload, headers=self.headers, name="POST /api/v1/checkout")

    @task(1)
    def check_datadog_monitor_status(self):
        if not DATADOG_API_KEY or not DATADOG_APP_KEY:
            return
        url = f"https://api.{DATADOG_SITE}/api/v1/monitor/{MONITOR_ID}"
        headers = {
            "DD-API-KEY": DATADOG_API_KEY,
            "DD-APPLICATION-KEY": DATADOG_APP_KEY
        }
        with self.client.get(
            "/health",
            headers=self.headers,
            name="GET /health",
            catch_response=True
        ) as response:
            if response.status_code != 200:
                response.failure("Application health endpoint failed")
        monitor_response = requests.get(url, headers=headers, timeout=10)
        if monitor_response.status_code == 200:
            monitor_data = monitor_response.json()
            overall_state = monitor_data.get("overall_state")
            print(f"Datadog monitor {MONITOR_ID} state: {overall_state}")
```

This pattern is useful when testing alerting behavior during stress testing. It helps answer questions like:
- Did the latency alert trigger?
- Did the error-rate monitor react fast enough?
- Were alerts too noisy or too slow?
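Those questions can also be answered after the test rather than during it. As a hedged sketch of a post-test CI step (the helper names are illustrative; the endpoint matches the one used in the script above), you can fetch the monitor once and assert on its state:

```python
def monitor_is_alerting(monitor_json: dict) -> bool:
    """True if a Datadog monitor payload reports the Alert state."""
    return monitor_json.get("overall_state") == "Alert"

def fetch_monitor(monitor_id: str, api_key: str, app_key: str,
                  site: str = "datadoghq.com") -> dict:
    """Fetch one monitor from the Datadog v1 API and return its JSON payload."""
    import requests  # imported lazily so monitor_is_alerting stays dependency-free
    url = f"https://api.{site}/api/v1/monitor/{monitor_id}"
    resp = requests.get(
        url,
        headers={"DD-API-KEY": api_key, "DD-APPLICATION-KEY": app_key},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()
```

A CI job could call `fetch_monitor(...)` after a stress test and fail the build if `monitor_is_alerting(...)` returns the wrong answer for the scenario, turning alert behavior into a testable assertion.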
Scenario 3: Sending custom Datadog metrics during a load test
In some environments, you want to push test metadata directly into Datadog as custom metrics. This can be helpful when you want dashboards to display load test phase markers, user counts, or synthetic business transaction rates.
```python
from locust import HttpUser, task, between, events
import os
import time
import uuid
import requests

DATADOG_API_KEY = os.getenv("DATADOG_API_KEY")
DATADOG_SITE = os.getenv("DATADOG_SITE", "datadoghq.com")
TEST_RUN_ID = os.getenv("TEST_RUN_ID", f"custom-metrics-{uuid.uuid4()}")


def submit_datadog_metric(metric_name, value, tags=None):
    if not DATADOG_API_KEY:
        return
    url = f"https://api.{DATADOG_SITE}/api/v1/series"
    headers = {
        "Content-Type": "application/json",
        "DD-API-KEY": DATADOG_API_KEY
    }
    payload = {
        "series": [
            {
                "metric": metric_name,
                "points": [[int(time.time()), value]],
                "type": "gauge",
                "tags": tags or []
            }
        ]
    }
    requests.post(url, json=payload, headers=headers, timeout=10)


@events.test_start.add_listener
def on_test_start(environment, **kwargs):
    submit_datadog_metric(
        "loadforge.test.started",
        1,
        tags=[
            f"test_run:{TEST_RUN_ID}",
            "env:staging",
            "service:search-api"
        ]
    )


@events.test_stop.add_listener
def on_test_stop(environment, **kwargs):
    submit_datadog_metric(
        "loadforge.test.completed",
        1,
        tags=[
            f"test_run:{TEST_RUN_ID}",
            "env:staging",
            "service:search-api"
        ]
    )


class SearchApiUser(HttpUser):
    wait_time = between(1, 3)

    @task(4)
    def search_products(self):
        params = {
            "q": "running shoes",
            "page": 1,
            "per_page": 20,
            "sort": "relevance"
        }
        headers = {
            "X-Test-Run-Id": TEST_RUN_ID
        }
        self.client.get("/api/v1/search", params=params, headers=headers, name="GET /api/v1/search")

    @task(2)
    def search_filters(self):
        params = {
            "category": "footwear",
            "brand": "nike",
            "min_price": 50,
            "max_price": 200,
            "in_stock": "true"
        }
        headers = {
            "X-Test-Run-Id": TEST_RUN_ID
        }
        self.client.get("/api/v1/search/filters", params=params, headers=headers, name="GET /api/v1/search/filters")
```

This example is especially useful if you want to build Datadog dashboards that show:
- Test start and completion markers
- Per-run filtering by tags
- Correlation between synthetic load and backend performance
In larger organizations, these custom metrics can become part of a standardized performance testing workflow across services.
Analyzing Your Results
After your load test finishes, the real value comes from correlating LoadForge results with Datadog telemetry.
In LoadForge
Use LoadForge’s real-time reporting to review:
- Average and percentile response times
- Requests per second
- Error rates
- Endpoint-level performance
- Behavior over time during ramp-up and peak load
LoadForge’s cloud-based infrastructure and distributed testing are especially useful if you want to simulate traffic from multiple regions and compare how your application behaves globally.
In Datadog
Open dashboards, APM, logs, and infrastructure views for the same test window. Filter by:
- test_run:<your-run-id>
- env:staging or env:production-like
- service:<service-name>
- git_sha:<release-version>
Focus on:
Infrastructure Metrics
- CPU utilization
- Memory usage
- Container restarts
- Pod CPU throttling
- Disk queue depth
- Network throughput
Application Metrics
- Request latency by endpoint
- Error count and error rate
- Throughput by service
- Database call duration
- Cache hit ratio
- Queue depth
APM Traces
Datadog APM is often the fastest way to identify why an endpoint slowed down. Look for:
- Slow spans
- N+1 database queries
- Downstream service latency
- Retry storms
- Lock contention
Logs
If your application logs the X-Test-Run-Id header, you can isolate exactly which log lines were generated by the load test. This is extremely helpful for debugging intermittent failures.
Events and Monitors
Review:
- Load test start and completion events
- Deployment events
- Monitor transitions
- Autoscaling events
The most useful analysis often comes from asking: what changed at the same time latency increased?
Performance Optimization Tips
When using Datadog with LoadForge for load testing and performance testing, these optimization tips often produce quick wins:
Tag everything consistently
Use the same tags across:
- Datadog events
- Custom metrics
- Logs
- APM traces
- Load test headers
A shared test_run tag makes analysis much easier.
Add a test run header
Pass a header like X-Test-Run-Id in every request. Then make sure your application logs or traces it.
Test realistic user journeys
Avoid testing only a single endpoint unless that truly reflects production. Mix read-heavy and write-heavy operations.
Include deployment metadata
Add release identifiers such as git_sha or build_number so you can compare versions in Datadog.
Watch percentiles, not just averages
Averages can hide serious user-facing problems. Pay close attention to p95 and p99 latency during stress testing.
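As a quick illustration of why percentiles matter, a handful of slow outliers barely moves the mean but dominates the tail. This pure-stdlib sketch uses made-up latencies:

```python
from statistics import mean, quantiles

# Made-up latencies: mostly ~130 ms, with two slow outliers
latencies_ms = [120, 130, 125, 140, 135, 900, 128, 132, 138, 127,
                131, 129, 133, 126, 137, 950, 124, 136, 134, 141]

cuts = quantiles(latencies_ms, n=100, method="inclusive")  # 99 cut points
p50, p95, p99 = cuts[49], cuts[94], cuts[98]
print(f"avg={mean(latencies_ms):.1f}ms p50={p50:.1f}ms p95={p95:.1f}ms p99={p99:.1f}ms")
```

Here the average sits near 211 ms while p95 is above 900 ms: the "typical" number looks acceptable even though a meaningful slice of users waited nearly a second.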
Correlate with autoscaling
If your platform autoscales, verify:
- How quickly scaling starts
- Whether scaling actually reduces latency
- Whether scale-up causes its own instability
Use staged ramps
Instead of jumping from 0 to peak traffic instantly, increase load gradually. This makes bottlenecks easier to identify in Datadog graphs.
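The staged-ramp idea can be expressed as a simple schedule function. The stage boundaries below are placeholders; in Locust, this is the kind of logic a custom LoadTestShape.tick() would return as a (user_count, spawn_rate) tuple:

```python
from typing import Optional

def staged_user_count(elapsed_s: float,
                      stages=((60, 50), (180, 150), (360, 300))) -> Optional[int]:
    """Target user count for the current elapsed time, or None when the test ends.

    Each stage is (end_time_seconds, target_users): 50 users for the first
    minute, 150 until minute three, 300 until minute six, then stop.
    """
    for end_time, users in stages:
        if elapsed_s < end_time:
            return users
    return None

print(staged_user_count(30), staged_user_count(120), staged_user_count(400))
```

Because each stage holds load steady for a while, the corresponding plateaus in Datadog graphs make it obvious which load level first pushed latency or error rates out of bounds.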
Automate in CI/CD
Run LoadForge tests automatically after deployments or before releases. With CI/CD integration, your team can catch regressions before they reach production.
Common Pitfalls to Avoid
Datadog-integrated load testing is powerful, but there are several mistakes teams commonly make.
Testing without correlation tags
If you don’t tag your test traffic, your dashboards and logs will be much harder to interpret.
Ignoring background noise
Shared staging environments can include traffic from other teams, cron jobs, or monitoring systems. Use filters and test-run identifiers to isolate your results.
Overloading Datadog APIs unnecessarily
Datadog APIs are useful for events, monitor checks, and custom metrics, but don’t poll them excessively from every virtual user. Keep Datadog API interactions lightweight and centralized where possible.
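One way to keep Datadog polling lightweight is a small throttle wrapper, so a monitor check runs at most once per interval no matter how many virtual users call it. This is an illustrative sketch, not a Datadog or Locust API:

```python
import time

class Throttled:
    """Run a callable at most once per `interval_s` seconds; extra calls are skipped.

    Wrap a Datadog monitor poll in this so thousands of virtual users share
    one polling budget instead of each hitting the Datadog API.
    """
    def __init__(self, func, interval_s=30.0, clock=time.monotonic):
        self.func = func
        self.interval_s = interval_s
        self.clock = clock  # injectable for testing
        self._last = None

    def __call__(self, *args, **kwargs):
        now = self.clock()
        if self._last is not None and now - self._last < self.interval_s:
            return None  # skipped: called again too soon
        self._last = now
        return self.func(*args, **kwargs)
```

A module-level `check_monitor = Throttled(poll_datadog_monitor, interval_s=30)` can then be called freely from any task; only one real request goes out per interval.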
Using unrealistic authentication flows
Avoid fake or simplified login patterns that don’t match production. Use the same authentication methods your real clients use, such as bearer tokens or session-based login.
Failing to validate monitor behavior
A load test is also a chance to test operational readiness. If alerts don’t trigger properly during failure conditions, that is a production risk.
Looking only at the application layer
Many performance issues are caused by infrastructure constraints, database bottlenecks, or downstream dependencies. Datadog is valuable precisely because it helps you look beyond request timings.
Running tests from a single location only
If your users are global, use LoadForge’s global test locations to simulate realistic traffic distribution and observe regional differences in Datadog.
Conclusion
Datadog load testing integration with LoadForge gives DevOps and CI/CD teams a much more complete picture of system behavior under load. LoadForge generates realistic traffic with scalable, distributed cloud-based infrastructure, while Datadog provides the infrastructure metrics, traces, logs, dashboards, and monitor visibility needed to understand performance bottlenecks.
By combining realistic Locust scripts, test-run tags, Datadog events, and custom metrics, you can move from simple load testing to true observability-driven performance testing. That means faster root cause analysis, better release validation, and more confidence in your systems before they reach production.
If you want to correlate load test results with infrastructure and application metrics in a practical, scalable way, try LoadForge and build your next Datadog-integrated performance test today.
LoadForge Team
LoadForge is a load and performance testing platform built on Locust. Our team has been shipping load tests against production systems since 2018, and we write these guides from real customer engagements.