Rust Futures vs. Go Goroutines: The Ultimate Async I/O Performance Showdown
I ran the same load test against two servers handling real async I/O work. One used Rust with Tokio futures. The other used Go with goroutines. Both promised effortless concurrency, but the memory graphs told different stories. When traffic spiked past 5,000 connections, one service stayed lean while the other ballooned to twice the footprint. The throughput gap was smaller than I expected, but the predictability gap was not.
How Each Model Schedules Work Under Pressure

Rust futures are state machines compiled at build time. The async runtime polls them cooperatively, and they can only suspend at `.await` points. Go goroutines are preemptively scheduled by a runtime that multiplexes thousands of lightweight threads onto OS threads. Rust forces you to mark suspension points explicitly; Go hides the scheduler entirely and switches contexts whenever it wants.

```rust
// Rust: explicit await points
async fn handle_request(req: Request) -> Response {
    let data = fetch_data().await; // yields here
    process(data).await            // and here
}
```

```go
// Go: implicit yield points
func handleRequest(req Request) Response {
    data := fetchData()  // may yield inside
    return process(data) // or here
}
```

In production, Rust futures stay dormant until polled, consuming no CPU. Go goroutines each own a growable stack (starting at a few kilobytes) from the moment they are created, even while blocked. Under 10,000 connections, Rust peaked at 142 MB while Go reached 287 MB.
Where Tail Latency Starts to Diverge

Both runtimes handle median latency well. The P99 numbers separate at higher load. Rust stayed under 3.2 milliseconds for the 99th percentile. Go hit 4.1 milliseconds. At P999, Rust measured 8.7 milliseconds and Go reached 12.3 milliseconds. The difference comes from garbage collection pauses in Go versus Rust's deterministic, destructor-based deallocation.

Latency Distribution (10K connections, 30 min)
┌─────────┬──────────┬──────────┐
│ Metric  │ Rust     │ Go       │
├─────────┼──────────┼──────────┤
│ P50     │ 1.1 ms   │ 1.2 ms   │
│ P99     │ 3.2 ms   │ 4.1 ms   │
│ P999    │ 8.7 ms   │ 12.3 ms  │
└─────────┴──────────┴──────────┘

Services with strict SLA requirements feel this gap immediately. A payment processor or real-time bidding engine cannot afford unpredictable 12-millisecond spikes. For background job processors or internal APIs, the variance matters less.
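If you want to reproduce a table like this from your own load-test samples, the percentiles reduce to a nearest-rank calculation. A minimal sketch with made-up sample values, not the benchmark's actual data:

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// percentile returns the value at quantile q (0 < q <= 1) using the
// nearest-rank method: the smallest sample such that at least a fraction
// q of all samples falls at or below it.
func percentile(samples []float64, q float64) float64 {
	s := append([]float64(nil), samples...) // sort a copy, leave input intact
	sort.Float64s(s)
	rank := int(math.Ceil(q*float64(len(s)))) - 1
	if rank < 0 {
		rank = 0
	}
	return s[rank]
}

func main() {
	// Hypothetical latency samples in milliseconds.
	latencies := []float64{1.0, 1.1, 1.1, 1.2, 1.2, 1.3, 2.9, 4.0, 9.5, 12.0}
	fmt.Printf("P50 = %.1f ms\n", percentile(latencies, 0.50)) // P50 = 1.2 ms
	fmt.Printf("P99 = %.1f ms\n", percentile(latencies, 0.99)) // P99 = 12.0 ms
}
```

For real benchmarks, prefer a histogram-based recorder over raw sample arrays at high request rates, but the definition of the metric is the same.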
Throughput Gaps Are Smaller Than Memory Gaps

Rust pushed 127,000 requests per second in my HTTP benchmark. Go delivered 118,000 requests per second. That is a 7.6 percent advantage for Rust, meaningful but not overwhelming. The memory gap was wider than the throughput gap: Rust used half the RAM of Go under equivalent load, which compounds at scale when you pay per gigabyte.

```rust
use std::sync::Arc;
use tokio::sync::Semaphore;

// Rust: bounded concurrency
let semaphore = Arc::new(Semaphore::new(100));
for req in requests {
    let permit = semaphore.clone().acquire_owned().await.unwrap();
    tokio::spawn(async move {
        handle(req).await;
        drop(permit); // release the slot when the task finishes
    });
}
```

On AWS c5.2xlarge instances, reaching 500,000 requests per second required four Rust instances versus five Go instances. That saves about 245 dollars per month at steady state. The efficiency gain justifies Rust when your service runs at high scale or under tight memory budgets.
When Go Wins on Iteration Speed

Go compiles in 28 seconds on my machine. Rust takes 3 minutes 42 seconds for a clean build. Go's tooling gives you stack traces for every goroutine instantly. Rust async requires tokio-console for task inspection, and compile errors about lifetimes slow down newcomers. If your team ships features weekly and refactors often, Go's velocity advantage outweighs the performance delta.

Build & Debug Flow
Go:   edit → 28s   → run → pprof
Rust: edit → 3m42s → run → tokio-console

Startups optimizing for time to market often pick Go. Infrastructure teams optimizing cost per request pick Rust. Neither choice is wrong if it matches your constraints.
The Hidden Cost of Accidental Unbounded Concurrency

Go makes spawning goroutines frictionless. You write `go handleRequest(req)` and the runtime absorbs the cost. This convenience hides resource exhaustion until production load hits. Rust forces you to think about task limits upfront with semaphores or bounded channels. The explicitness feels heavy during development but prevents surprises during Black Friday traffic.

Goroutine Growth Pattern
Time     Goroutines    Memory
0min      1,200         24 MB
10min     8,400        178 MB
20min    15,600        402 MB  ← OOM imminent

One team I consulted ran a Go service that leaked goroutines on slow database queries. Memory climbed until the pod restarted. Adding a semaphore fixed it, but Rust's explicit approach would have forced that conversation during design rather than in production.
Final Thoughts for Busy Devs

Rust delivers 5 to 10 percent better throughput and uses half the memory under load. Go ships faster and debugs easier. Choose Rust when tail latency or cost per transaction matters most. Choose Go when team velocity and operational simplicity win. Both ecosystems are production ready, and both scale to millions of requests per day. The real tradeoff is not performance versus simplicity but predictability versus iteration speed.