This guide shows how to performance test DigitalOcean's AI platform using Locust. It is well suited to measuring response times, validating reliability, and planning capacity for AI workloads on DigitalOcean.
## Use Cases
- Test DigitalOcean AI API response times under load
- Validate AI service reliability and uptime
- Plan capacity for AI applications
- Compare different Llama 3 model variants
- Monitor API rate limits and quotas
## Simple Implementation
```python
import random

from locust import HttpUser, task, between


class Llama3ChatUser(HttpUser):
    wait_time = between(1, 5)

    QUESTIONS = [
        "What is the capital of France?",
        "Translate 'Hello, how are you?' into Spanish.",
        "Who wrote 'Pride and Prejudice'?",
        "What's 13 multiplied by 17?",
        "Name three benefits of a vegan diet.",
        "Give me a quick summary of the plot of '1984'.",
        "Explain the concept of machine learning in simple terms.",
        "What are the main differences between Python and JavaScript?",
        "How does photosynthesis work?",
        "What are some tips for effective time management?"
    ]

    @task
    def chat_completion(self):
        question = random.choice(self.QUESTIONS)
        # Short preview of the question, used as the request name in LoadForge stats
        preview = (question[:20] + "...") if len(question) > 20 else question
        payload = {
            "model": "llama3.3-70b-instruct",
            "messages": [{"role": "user", "content": question}],
            "stream": False,
            "include_functions_info": False,
            "include_retrieval_info": False,
            "include_guardrails_info": False
        }
        headers = {
            "Content-Type": "application/json",
            "Authorization": f"Bearer {self.api_token}"
        }
        with self.client.post(
            "/api/v1/chat/completions",
            json=payload,
            headers=headers,
            name=f"chat: {preview}",
            catch_response=True
        ) as response:
            if response.status_code == 200:
                response.success()
            else:
                response.failure(f"Status {response.status_code}")

    def on_start(self):
        # Set your DigitalOcean AI API token here
        self.api_token = "your-digitalocean-ai-token"
```
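If you run the script outside LoadForge, for example with local Locust, you may prefer not to hardcode the credential. A minimal variation of `on_start` that reads the token from an environment variable instead (`DO_AI_TOKEN` is an illustrative name chosen here, not an official one):

```python
import os  # add alongside the existing imports


class Llama3ChatUser(HttpUser):
    # ... QUESTIONS and chat_completion unchanged ...

    def on_start(self):
        # "DO_AI_TOKEN" is an illustrative variable name; export it before the run
        self.api_token = os.environ["DO_AI_TOKEN"]
```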
## Setup Instructions
1. Get DigitalOcean AI Access:
   - Sign up for a DigitalOcean account
   - Enable AI services in your project
   - Generate an API token from the control panel
2. Configure the Script in LoadForge:
   - Copy the script into LoadForge's test editor
   - Replace `your-digitalocean-ai-token` with your actual API token
   - Set the target host URL to your DigitalOcean AI endpoint
3. Configure Load Test Settings:
   - Start with 1-5 virtual users to test connectivity
   - Set an appropriate ramp-up time to avoid rate limits
   - Monitor response times and error rates
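Optionally, you can sanity-check the script with open-source Locust on your own machine before uploading it to LoadForge. A headless smoke-test invocation might look like this (the host URL below is a placeholder; substitute your actual DigitalOcean AI endpoint):

```bash
# Placeholder host; substitute your actual DigitalOcean AI endpoint
locust -f locustfile.py \
  --host https://your-do-ai-endpoint.example.com \
  --users 5 --spawn-rate 1 --run-time 2m --headless
```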
## What This Tests
- Response Times: Measure latency for the Llama 3.3 70B model
- Throughput: Test concurrent request handling
- Rate Limits: Understand DigitalOcean AI quotas and limits
- Reliability: Check API stability under sustained load
- API Performance: Validate DigitalOcean AI service quality
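Locust records latency for every request automatically, but you can also fail requests that exceed a service-level threshold so slow responses show up as errors in the report. A sketch of the existing `catch_response` block with an illustrative 10-second cutoff added:

```python
with self.client.post(
    "/api/v1/chat/completions",
    json=payload,
    headers=headers,
    name=f"chat: {preview}",
    catch_response=True
) as response:
    if response.status_code != 200:
        response.failure(f"Status {response.status_code}")
    elif response.elapsed.total_seconds() > 10:  # illustrative SLA threshold
        response.failure(f"Too slow: {response.elapsed.total_seconds():.1f}s")
    else:
        response.success()
```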
## Expected Performance
Typical results for Llama 3.3 70B on DigitalOcean AI:
- Response Time: ~3-6 seconds per request
- Quality: High-quality responses with latest model improvements
- Throughput: Suitable for production workloads with proper scaling
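As a rough capacity-planning sanity check: with a 3-6 second response time plus the script's 1-5 second wait, each virtual user completes one request roughly every 4-11 seconds, or about 5-15 requests per minute, so a 20-user test generates on the order of 100-300 requests per minute.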
## Rate Limits & Pricing
- Request Limits: Vary by plan and model
- Token Limits: Based on input/output tokens
- Concurrent Requests: Limited per account tier
- Pricing: Pay-per-use model based on tokens consumed
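Because pricing is token-based, it can be worth logging token consumption during a test. The sketch below assumes the response body follows the OpenAI-compatible schema with a `usage` object; verify this against an actual DigitalOcean response before relying on it:

```python
if response.status_code == 200:
    response.success()
    try:
        # Assumes an OpenAI-style "usage" object in the response body
        usage = response.json().get("usage", {})
        print(
            f"prompt={usage.get('prompt_tokens')} "
            f"completion={usage.get('completion_tokens')} "
            f"total={usage.get('total_tokens')}"
        )
    except ValueError:
        pass  # body was not JSON; skip token accounting
else:
    response.failure(f"Status {response.status_code}")
```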
## Common Issues
- Authentication: Ensure API token has correct permissions
- Rate Limiting: Start with low user counts to avoid 429 errors
- Endpoint URLs: Verify the correct DigitalOcean AI endpoint
- Token Limits: Monitor usage to avoid exceeding quotas
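To make these issues easier to diagnose in test results, you can branch on the status code in the failure path so authentication problems and rate limiting are reported separately. A sketch extending the script's existing check:

```python
if response.status_code == 200:
    response.success()
elif response.status_code == 401:
    response.failure("Authentication failed: check the API token and its permissions")
elif response.status_code == 429:
    response.failure("Rate limited: reduce user count or slow the ramp-up")
else:
    response.failure(f"Status {response.status_code}")
```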
## Best Practices
- Gradual Ramp-up: Start with 1-5 users, increase gradually
- Monitor Costs: Track token usage to avoid unexpected charges
- Error Handling: Implement proper retry logic for production use
- Caching: Consider caching responses for repeated queries
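In LoadForge the ramp-up is configured in the test settings, but if you run the script with plain Locust you can encode a gradual ramp directly in the file using a custom `LoadTestShape`. A sketch with illustrative stage values:

```python
from locust import LoadTestShape


class GradualRamp(LoadTestShape):
    # Illustrative stages: ramp from 1 to 20 users over about 4 minutes
    stages = [
        {"duration": 60,  "users": 1,  "spawn_rate": 1},
        {"duration": 120, "users": 5,  "spawn_rate": 1},
        {"duration": 180, "users": 10, "spawn_rate": 2},
        {"duration": 240, "users": 20, "spawn_rate": 2},
    ]

    def tick(self):
        run_time = self.get_run_time()
        for stage in self.stages:
            if run_time < stage["duration"]:
                return stage["users"], stage["spawn_rate"]
        return None  # stop the test after the last stage
```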