Java vs Rust: I Rewrote Our App and Saved My Company $2M/Year

By JOHNWICK

I didn’t bet on a language. I bet on shape: one hop, strict backpressure, lean memory. Rust helped enforce that shape. Java (with virtual threads) stayed where our team moves fastest. The win came from how the work flows, not what the syntax looks like.

I’m going to show you the map, the code, and the numbers. Then you can steal the shape for your stack.

What Was Breaking (And Why It Hurt)

Traffic grew faster than our discipline. Latency spikes arrived in waves. Autoscaling kept adding heat. The cost curve stopped listening to reason. CPU was fine. GC was fine. The real problem lived between services: hops, queues, retries, and JSON churn. Every small wait multiplied across the mesh until a simple “confirm payment” felt like dragging a ship through wet sand. I realized we were paying for movement, not for work.


The Cut That Changed Everything

I carved the hot path out of the mesh:

  • Auth → risk → charge → ledger became a single execution flow behind an admission gate.
  • The hot path moved to Rust for very compact memory and fewer accidental allocations.
  • The rest stayed in Java, where we ship fast and stay safe.

Was Rust necessary? For this path, yes. For the platform, no. The cheaper shape would have paid off even in Java — Rust just made it easier to hold the line on memory and copies.

Before And After (Hand-Drawn Map)

BEFORE

+--------+   +-----------+   +-----------+   +-----------+   +-------+
| Client |-->|  Gateway  |-->| Spring #1 |-->| Spring #2 |-->| Kafka |
+--------+   +-----------+   +-----------+   +-----------+   +-------+
                                  \            \                \
                                   \------------> Spring #3 -----> RDS

Notes: Many hops. Bursty queues. Tail latency drifts. Costs drift with it.


AFTER

+--------+   +-----------+   +--------------------+      +-------+
| Client |-->|  Gateway  |-->|   Rust Hot Path    |----->|  RDS  |
+--------+   +-----------+   | (auth+risk+charge) |----->| Kafka |
                             +--------------------+      +-------+
                                    \
                                     \----> Java services (reports, refunds, admin)

Notes: One admission gate. Fewer copies. Fewer hops. Boring tails.

The Receipts (Rounded, Anonymized, Honest)

| Item                        | Before (Mesh) | After (Hybrid) |       Change |
|-----------------------------|--------------:|---------------:|-------------:|
| Nodes (m5.large equiv)      |           220 |             96 |         -124 |
| Hot-path pods               |           120 |             36 |          -84 |
| Avg RSS per hot pod         |      ~1.3 GiB |      ~0.12 GiB |    -1.18 GiB |
| Cross-AZ calls / request    |           5–8 |            2–3 |     -3 to -5 |
| p99 “confirm” at peak load  |         3.2 s |         380 ms |      -2.82 s |
| Cost / 1M requests (est.)   |        $1,780 |           $210 |      -$1,570 |
| Annualized delta (est.)     |               |                |  ≈ $2M saved |

Numbers are rounded. The method is simple: node hours × unit price, plus egress and IOPS. Map every hop to dollars. Remove hops. Count again.
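If you want that method in code, here is a minimal sketch; every rate and volume below is a placeholder, so plug in the numbers from your own invoice:

class CostModel {
    // All inputs are illustrative, not from our deployment.
    static double costPerMillion(double nodeHours, double nodeRateUsd,
                                 double egressGb, double egressRateUsd,
                                 double iopsUsd, double requestsMillions) {
        double monthly = nodeHours * nodeRateUsd    // compute
                       + egressGb * egressRateUsd   // movement between hops
                       + iopsUsd;                   // storage I/O line item
        return monthly / requestsMillions;
    }

    public static void main(String[] args) {
        // Example: 96 nodes for a 30-day month at a made-up $0.096/node-hour.
        System.out.printf("$%.2f per 1M requests%n",
            costPerMillion(96 * 24 * 30, 0.096, 1_200, 0.09, 150, 900));
    }
}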

Code That Holds The Line (Java 21 Version)

Virtual threads make Java honest and fast when the shape is right. This is the handler that forced discipline: hard admission, structured concurrency, single hop for critical work.

package app.pay;

import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.springframework.web.bind.annotation.*;
import org.springframework.http.ResponseEntity;
import javax.sql.DataSource;
import java.util.concurrent.Semaphore;
import java.util.concurrent.StructuredTaskScope;

@RestController
@RequestMapping("/v1")
@RequiredArgsConstructor
@Slf4j
class PayController {

  private final DataSource ds;
  private final Semaphore gate = new Semaphore(5_000); // admission line

  // Request/response shapes; UserRepo, RiskClient, Ledger, and Events are the
  // app's own helpers (not shown).
  record PayReq(String userId, long amount) {}
  record PayResp(String txId) {}

  @PostMapping("/pay")
  ResponseEntity<?> pay(@RequestBody PayReq req) throws Exception {
    if (!gate.tryAcquire()) return ResponseEntity.status(429).build(); // shed load at the door
    // StructuredTaskScope is a preview API in Java 21 (--enable-preview).
    try (var scope = new StructuredTaskScope.ShutdownOnFailure();
         var conn = ds.getConnection()) {
      var userF = scope.fork(() -> UserRepo.load(conn, req.userId()));
      var riskF = scope.fork(() -> RiskClient.check(req));
      scope.join().throwIfFailed();
      if (!riskF.get().approved()) return ResponseEntity.status(402).build();
      var txId = Ledger.insert(conn, req, userF.get());
      // Forking after join() leaves the scope unjoined at close(), which throws;
      // publish fire-and-forget on a plain virtual thread instead.
      Thread.startVirtualThread(() -> Events.publish(txId));
      return ResponseEntity.ok(new PayResp(txId));
    } finally {
      gate.release();
    }
  }
}

Why it worked: the gate flattened spikes, virtual threads removed pool thrash, and the flow deleted glue that never should have been split.
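For reference, putting a Spring Boot service on virtual threads looks roughly like this on Boot 3.x with embedded Tomcat (on Boot 3.2+, setting the spring.threads.virtual.enabled=true property does the same thing):

import java.util.concurrent.Executors;
import org.springframework.boot.web.embedded.tomcat.TomcatProtocolHandlerCustomizer;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
class VirtualThreadConfig {
    // Handle each request on its own virtual thread instead of a fixed worker pool.
    @Bean
    TomcatProtocolHandlerCustomizer<?> virtualThreadExecutor() {
        return handler -> handler.setExecutor(Executors.newVirtualThreadPerTaskExecutor());
    }
}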

Code That Enforces The Shape (Rust + Axum)

Same shape. Different tool. Very compact memory. Fewer accidental copies.

Admission at the door.

use axum::{routing::post, Router, Json, extract::State};
use serde::{Deserialize, Serialize};
use std::sync::Arc;
use tokio::{sync::Semaphore, join};

#[derive(Deserialize)]
struct PayReq { user_id: String, amount: i64 }

#[derive(Serialize)]
struct PayResp { tx_id: String }

#[tokio::main]
async fn main() {
    let gate = Arc::new(Semaphore::new(5_000)); // admission line
    let app = Router::new().route("/v1/pay", post(pay)).with_state(gate);
    // axum 0.6-style bootstrap (axum 0.7+ swaps this for axum::serve).
    axum::Server::bind(&"0.0.0.0:8080".parse().unwrap())
        .serve(app.into_make_service()).await.unwrap();
}

// load_user, check_risk, insert_tx, and publish_event are the app's own
// helpers (not shown); the shape is the point.
async fn pay(State(g): State<Arc<Semaphore>>, Json(req): Json<PayReq>)
    -> Result<Json<PayResp>, axum::http::StatusCode> {
    // Shed load at the door; dropping the permit releases it on every exit path.
    let _permit = g.try_acquire_owned()
        .map_err(|_| axum::http::StatusCode::TOO_MANY_REQUESTS)?;
    let (user, risk) = join!(load_user(&req.user_id), check_risk(&req));
    if !risk.approved { return Err(axum::http::StatusCode::PAYMENT_REQUIRED); }
    let tx = insert_tx(&req, &user).await
        .map_err(|_| axum::http::StatusCode::BAD_GATEWAY)?;
    // Clone before spawning: `async move` takes ownership of what it captures.
    let tx_for_event = tx.clone();
    tokio::spawn(async move { let _ = publish_event(tx_for_event).await; });
    Ok(Json(PayResp { tx_id: tx }))
}

Why it worked: the language nudged us toward compact models, honest backpressure, and shallow async that never ballooned into complexity.

What Actually Drove The Savings

Backpressure Beat Autoscaling. We stopped pretending infinite work could fit in a finite minute. The gate said “not now” instead of letting retries swarm.
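The flip side is that callers have to treat 429 as an answer, not an insult. Here is a minimal sketch of a well-behaved caller (the URL and limits are hypothetical) that retries with bounded, jittered backoff instead of swarming:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

class PoliteCaller {
    static HttpResponse<String> payWithBackoff(HttpClient client, String body) throws Exception {
        var req = HttpRequest.newBuilder(URI.create("https://pay.internal/v1/pay")) // hypothetical URL
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(body))
            .build();
        long delayMs = 50;
        for (int attempt = 0; attempt < 4; attempt++) {                // bounded, not infinite
            var res = client.send(req, HttpResponse.BodyHandlers.ofString());
            if (res.statusCode() != 429) return res;                   // only retry on "not now"
            Thread.sleep(delayMs + (long) (Math.random() * delayMs));  // jittered backoff
            delayMs *= 2;
        }
        throw new IllegalStateException("gate stayed closed; surface the failure, don't swarm");
    }
}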

One Hop, Not Many. Auth, risk, and charge belonged together. Pulling them into one flow removed egress, reduced copies, and silenced half the alerts.

Memory Density. The hot path held its resident set very low and steady. That let us pack more work per node without noisy neighbors.

Team Speed Still Matters. We kept everything else in Java. Features stayed fast. The bill stayed low. The rewrite was not a flag planted in a language war; it was a line drawn around a path that prints revenue.

A Simple Benchmark Story People Can Trust

I won’t bury you in graphs. Here’s the picture that moved the room:

| Load Scenario                  | Before p99 | After p99 | Errors | Notes                |
|--------------------------------|-----------:|----------:|-------:|----------------------|
| Steady 1k rps                  |     620 ms |    170 ms |  <0.1% | Flat pods, calm CPU  |
| Burst to 3k rps (short surge)  |      3.2 s |    420 ms |   1.1% | Gate returns 429s    |
| Sustained 2k rps               |      1.9 s |    360 ms |   0.4% | No scale thrash      |
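If you want to reproduce the percentile columns from raw latencies, nearest-rank is enough at these sample sizes; a minimal sketch, in case your metrics stack doesn't already do it for you:

import java.util.Arrays;

class Percentile {
    // Nearest-rank p99 over a batch of recorded request latencies.
    static long p99(long[] latenciesMs) {
        long[] sorted = latenciesMs.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(0.99 * sorted.length); // 1-based nearest rank
        return sorted[Math.max(0, rank - 1)];
    }
}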

The key is not the tool. It is the shape. Add a hard gate. Cut hops. Measure p99 and cost per million requests. The rest follows.

The Human Part I Don’t Want To Forget

I remember the knot in my stomach when I decided to ship. I remember opening the graph and seeing the line bend the right way. I remember sleeping through the night again after months of alarms that rang like a dare. Most of all, I remember the lift in the team’s voice. We stopped fighting the platform. We started designing the path.

What I Would Tell Past Me

Keep Java where you fly. Move the hottest lane into a shape that refuses waste. If you need very compact memory and strict control, use Rust. If you want to stay on JVM, use virtual threads and hold the same line. Either way, admit less work than you can finish and do the work in as few hops as possible.

That decision saved real money. That decision made the graphs boring again. Boring graphs are beautiful.

If You Want To Try This Tomorrow Morning

Start with the gate. Place it at the first touchpoint of your hottest endpoint. Watch p99 calm down. Then fold hops together until the path feels like a straight road instead of a maze.
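If your first touchpoint is a Spring service, the gate can live in a filter before any handler runs. A minimal sketch; the path and permit count are placeholders to tune for your endpoint:

import jakarta.servlet.FilterChain;
import jakarta.servlet.ServletException;
import jakarta.servlet.http.HttpServletRequest;
import jakarta.servlet.http.HttpServletResponse;
import org.springframework.stereotype.Component;
import org.springframework.web.filter.OncePerRequestFilter;
import java.io.IOException;
import java.util.concurrent.Semaphore;

@Component
class AdmissionGateFilter extends OncePerRequestFilter {
    private final Semaphore gate = new Semaphore(5_000); // placeholder; size to what you can finish

    @Override
    protected void doFilterInternal(HttpServletRequest req, HttpServletResponse res, FilterChain chain)
            throws ServletException, IOException {
        if (!req.getRequestURI().startsWith("/v1/pay")) { chain.doFilter(req, res); return; }
        if (!gate.tryAcquire()) { res.setStatus(429); return; } // "not now", honestly
        try { chain.doFilter(req, res); } finally { gate.release(); }
    }
}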

When your numbers move, tell me what changed. Share your p99. Share your resident memory. Share your cost per million requests. I want to see your curve bend too.

Notes On Data

  • Figures are rounded and anonymized.
  • The method is consistent: node hours, transfer, storage I/O, and operational overhead.
  • Replicate the method with your inputs; expect different absolute numbers, but watch for the same curve.


Final Word

This wasn’t a gamble on novelty. It was a promise to stop wasting cycles on movement and spend them on work. The language helped. The shape won. I rewrote our app where it mattered, and a heavy bill became a light one. If your path earns your revenue, guard it like your future depends on it — because it does.