Servers Under Pressure: Realistic Load-Testing Three Hetzner VPS - Kernova

When evaluating the real performance of a server deployment on Hetzner VPS, intuition is not enough. You need data. You need to simulate real traffic patterns and understand how your infrastructure behaves under pressure. For this reason, I used Locust to benchmark three different Hetzner WordPress VPS instances, each with different CPU and RAM configurations. The goal was simple: determine how much load each server can handle before performance degrades, latency increases, or failures start appearing. This blog post walks through the Locust setup, explains how load testing works, and compares the results across three Hetzner servers.

Locust: What It Is and Why It Matters

Locust is a modern load-testing framework built with Python, designed to help you simulate real user behavior and understand how your backend holds up under stress. Instead of guessing whether your VPS can handle traffic spikes, Locust gives you measurable metrics: response time percentiles, request throughput, failure rates, tail latency, and maximum observed delays.
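
To make the percentile metrics concrete, here is a minimal sketch of how a median, P95, and P99 can be derived from raw response times with the nearest-rank method. It is not Locust's internal implementation, and the sample values are hypothetical, chosen only to show how a few slow requests affect each metric.

```python
import math
from statistics import mean


def percentile(samples_ms, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of values at or below it."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_ms)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


# Hypothetical response times in milliseconds: 97 fast requests and 3 slow ones.
samples_ms = [200] * 97 + [4000, 9000, 60000]

summary = {
    "median": percentile(samples_ms, 50),
    "p95": percentile(samples_ms, 95),
    "p99": percentile(samples_ms, 99),
    "max": max(samples_ms),
    "mean": mean(samples_ms),
}
print(summary)
```

With these made-up samples the median and P95 both stay at 200 ms, while P99 is 9000 ms and the mean is pulled up to 924 ms. Three slow requests out of a hundred are invisible in the median but dominate the tail, which is why the results in this post report percentiles and maximum latency alongside the average.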

Locust is particularly valuable in business environments where user experience, conversion rates, and platform reliability must be quantifiable. If the server cannot handle concurrent traffic, the business takes the hit: slower pages mean lower rankings, abandoned sessions, and lost transactions. Locust also provides charts for each test run, which make the data easier to interpret.

Test Configuration and Scenario

The testing scenario was identical across all servers to ensure a fair comparison. The Locust configuration was as follows:

Peak concurrent users: 100
Ramp-up rate: 10 users per second
Test duration: 3 minutes
Target: Hetzner WordPress VPS at https://wp.corefortify.com/

Locust Test Configuration Summary (Users, Ramp-Up, Runtime, Target Host)

This configuration applies consistent pressure by hitting the WordPress backend hard enough to expose any CPU, PHP-FPM, or database bottlenecks.
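
The settings above can be sketched as a small Python structure to show what they imply for the test timeline. The `LoadProfile` class is hypothetical and not part of Locust. The command-line options it prints (`--headless`, `--users`, `--spawn-rate`, `--run-time`, `--host`) are standard Locust flags, but running headless is an assumption; the original tests may well have been started from the Locust web UI.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoadProfile:
    users: int         # peak concurrent users
    spawn_rate: float  # users started per second
    run_time_s: int    # total test duration in seconds
    host: str          # base URL of the system under test

    def ramp_seconds(self) -> float:
        """Time needed to reach peak concurrency."""
        return self.users / self.spawn_rate

    def steady_seconds(self) -> float:
        """Time spent at peak concurrency once the ramp-up has finished."""
        return max(self.run_time_s - self.ramp_seconds(), 0.0)

    def locust_args(self) -> list[str]:
        """Equivalent Locust CLI options (headless mode is an assumption)."""
        return [
            "--headless",
            "--users", str(self.users),
            "--spawn-rate", str(self.spawn_rate),
            "--run-time", f"{self.run_time_s}s",
            "--host", self.host,
        ]


# Values from the configuration list; the trailing slash of the target URL is dropped.
profile = LoadProfile(users=100, spawn_rate=10, run_time_s=180,
                      host="https://wp.corefortify.com")

print(profile.ramp_seconds(), profile.steady_seconds())
print("locust", *profile.locust_args())
```

At 10 users per second the ramp-up takes 10 seconds, so roughly 170 of the 180 seconds run at the full 100 users. A real run would also need a locustfile passed with `-f`, which is not shown in this post.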

Why 100 Users?

The goal was not to stress the server to unrealistic extremes, but to test a moderate, real-world marketing-campaign level load. One hundred concurrent users represent a scenario where content posts go viral, newsletters are sent, or seasonal traffic peaks occur. It is enough to test scalability, caching efficiency, PHP-FPM behavior, and resource exhaustion.
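
It helps to remember that concurrent users are not the same as requests per second. In a closed-loop test, where each simulated user waits for a response before sending the next request, throughput depends on how fast the server answers. The sketch below illustrates this relationship (a form of Little's law). The function name, the response times, and the zero default wait time are illustrative assumptions, since the wait time used in the tests is not stated.

```python
def expected_rps(users: int, avg_response_s: float, wait_s: float = 0.0) -> float:
    """Closed-loop estimate: each user completes one request per (response + wait) seconds."""
    cycle_s = avg_response_s + wait_s
    if cycle_s <= 0:
        raise ValueError("cycle time must be positive")
    return users / cycle_s


# Hypothetical average response times, chosen only to show the effect.
for avg_response_s in (0.5, 1.0, 2.5):
    rps = expected_rps(100, avg_response_s)
    print(f"{avg_response_s:.1f}s per request -> {rps:.0f} req/s")
```

The same 100 users produce 200 requests per second against a server that answers in 0.5 s, but only 40 against one that needs 2.5 s. A slower server therefore shows both higher latency and lower throughput under an identical user count.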

Servers Under Test

Three Hetzner VPS machines were benchmarked.

Server	Model	vCPU	RAM	Monthly Cost (USD)
Server1	CX23	2 vCPU	4 GB	$4.09
Server2	CPX22	2 vCPU	4 GB	$7.59
Server3	CPX32	4 vCPU	8 GB	$12.59

The test aims to determine how hardware differences impact performance under identical load conditions.

Test 1: Server 1 (CX23 – 2 vCPU, 4 GB RAM)

Results Highlights

Total Requests: 6060
Failures: 7
Median Response: 2500 ms
95th Percentile: 2900 ms
99th Percentile: 5800 ms
Max Latency: 125 seconds
Average Response: 2794 ms
Sustained RPS: 33.7
Failure Rate: 0.11%

Server 1 Locust Results: Response Times, Throughput, and Failure Rate
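
As a sanity check, the failure rate and the average throughput can be recomputed from the totals. This is a minimal sketch that assumes all 6060 requests were counted over the full 3-minute run; the helper names are illustrative.

```python
def failure_rate_pct(failures: int, total_requests: int) -> float:
    """Share of failed requests, in percent."""
    return 100.0 * failures / total_requests


def average_rps(total_requests: int, duration_s: float) -> float:
    """Requests per second averaged over the whole run."""
    return total_requests / duration_s


# Totals reported for Server 1.
total_requests, failures = 6060, 7
duration_s = 3 * 60  # 3-minute test

print(f"failure rate: {failure_rate_pct(failures, total_requests):.2f}%")
print(f"average throughput: {average_rps(total_requests, duration_s):.1f} req/s")
```

Both values land close to the reported 0.11% failure rate and 33.7 RPS; small differences in the last digit come down to rounding.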

Interpretation

Struggles early once concurrency increases.
Median response already slow (2.5s).
Tail latency jumps to nearly 6s (P99).
Extreme spikes: max delay reaches 125 seconds.
CPU/PHP-FPM likely saturated under load.
Stable only up to 25–35 users before performance collapses.

Test 2: Server 2 (CPX22 – 2 vCPU, 4 GB RAM, Better CPU Generation)

CPX-series machines use a faster CPU generation compared to CX-series, even when vCPU count and RAM remain identical.

Results Highlights

Total Requests: 8428
Failures: 0
Median Response: 1800 ms
95th Percentile: 2000 ms
99th Percentile: 6500 ms
Max Latency: 106 seconds
Average Response: 2053 ms
Sustained RPS: 55
Failure Rate: 0%

Server 2 Locust Results: Response Times, Throughput, and Failure Rate

Interpretation

Better CPU generation improves response times.
Zero failures during testing.
Tail latency still spikes to 6.5s (P99) at full load, higher than the CX23's 5.8s.
Max delay remains high at 106 seconds.
Handles more throughput than CX23 but still overloads near 100 users.
Realistic limit: 55–60 users.

Test 3: Server 3 (CPX32 – 4 vCPU, 8 GB RAM)

With twice the vCPUs and twice the RAM of Server 2, Server 3 is the strongest configuration tested. The extra resources leave room for more PHP-FPM workers, more memory for caching, and shorter request queues under load.

Results Highlights

Total Requests: 17160
Failures: 0
Median Response: 890 ms
95th Percentile: 1100 ms
99th Percentile: 4300 ms
Max Latency: 52 seconds
Average Response: 991 ms
Sustained RPS: 99
Failure Rate: 0%

Server 3 Locust Results: Response Times, Throughput, and Failure Rate

Interpretation

Sub-second median response (0.89s).
P95 remains close to 1s under load.
No failures even at high concurrency.
Tail latency increases only when pushed to full 100-user load.
Double CPU and RAM significantly improve stability.
Realistic limit: 85–100 users.

Cross-Server Performance Comparison

Metric	Server 1 CX23	Server 2 CPX22	Server 3 CPX32
Average Response	~2.8s	~2.1s	~1.0s
P95	2.9s	2.0s	1.1s
P99	5.8s	6.5s	4.3s
Failures	0.11%	0%	0%
Real User Capacity	30–40	45–55	80–120
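
To put the comparison in relative terms, the sketch below expresses each server's total requests, median latency, and monthly price as a multiple of the CX23 baseline. All figures come from the results and pricing listed earlier in this post; the dictionary layout and function name are illustrative choices.

```python
servers = {
    "CX23":  {"requests": 6060,  "median_ms": 2500, "usd_month": 4.09},
    "CPX22": {"requests": 8428,  "median_ms": 1800, "usd_month": 7.59},
    "CPX32": {"requests": 17160, "median_ms": 890,  "usd_month": 12.59},
}


def relative_to(baseline: dict, server: dict) -> dict:
    """Express one server's results as multiples of the baseline server."""
    return {
        "throughput_x": server["requests"] / baseline["requests"],
        "latency_speedup_x": baseline["median_ms"] / server["median_ms"],
        "cost_x": server["usd_month"] / baseline["usd_month"],
    }


baseline = servers["CX23"]
for name, data in servers.items():
    ratios = relative_to(baseline, data)
    print(f"{name}: {ratios['throughput_x']:.2f}x requests, "
          f"{ratios['latency_speedup_x']:.2f}x faster median, "
          f"{ratios['cost_x']:.2f}x price")
```

On these numbers the CPX32 serves about 2.8 times the requests of the CX23 at roughly 3.1 times the price, while the CPX22 serves about 1.4 times the requests at roughly 1.9 times the price. The ratios ignore tail latency and failures, so they complement the table rather than replace it.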

Final Conclusion

The key takeaway is straightforward: if your WordPress site or application expects regular traffic peaks, launches, or marketing spikes, the CPX32 is the minimum configuration that provides reliable, real-world scalability. Using Locust to validate these limits removes guesswork and ensures your infrastructure decisions are based on measurable evidence, not assumptions.

Ali Alrahbe
