8 WASM + Rust Techniques for Native-Speed UIs
You click. The UI answers instantly. No jank, no “thinking” spinner. That feeling isn’t luck — it’s a set of choices. If you’re shipping Rust to the browser with WebAssembly, these are the eight techniques that repeatedly turn prototypes into snappy, production-grade UIs.
1) Zero-copy bridges: share views, not bytes
Calling Rust from JS is cheap; moving data isn’t. Pass typed views over raw buffers instead of cloning.
Pattern:
- Store large data inside the WASM linear memory.
- In JS, create a Uint8Array view into that memory using its pointer and length.
- In Rust, expose getters that return pointers/lengths; avoid copying the data into a new array on the JS side.
// Cargo.toml: wasm-bindgen = "0.2", js-sys = "0.3"
use wasm_bindgen::prelude::*;

#[wasm_bindgen]
pub struct Frame { ptr: *const u8, len: usize }

#[wasm_bindgen]
impl Frame {
    #[wasm_bindgen(getter)]
    pub fn ptr(&self) -> *const u8 { self.ptr }
    #[wasm_bindgen(getter)]
    pub fn len(&self) -> usize { self.len }
}

// JS side
// Re-create the view after any call that can grow WASM memory; growth detaches the old buffer.
const mem = new Uint8Array(wasm.memory.buffer);
const view = mem.subarray(frame.ptr, frame.ptr + frame.len); // zero-copy view
Why it works: You keep the hot path on a single buffer and avoid GC pressure from temporary copies.
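The pointer-and-length pattern above can be tried in plain Rust, without a browser. The following is a minimal sketch under stated assumptions: this `Frame` owns a `Vec<u8>` (the article's version stores a raw pointer), and `view` is a hypothetical helper that plays the role of `subarray` on the JS side.

```rust
/// Owns the pixel bytes. In a wasm build this Vec would live in linear memory.
pub struct Frame {
    data: Vec<u8>,
}

impl Frame {
    pub fn new(len: usize) -> Self {
        Frame { data: vec![0; len] }
    }

    /// Address of the first byte; this is what JS would receive as a number.
    pub fn ptr(&self) -> *const u8 {
        self.data.as_ptr()
    }

    pub fn len(&self) -> usize {
        self.data.len()
    }

    pub fn fill(&mut self, value: u8) {
        for b in self.data.iter_mut() {
            *b = value;
        }
    }
}

/// Hypothetical helper: rebuilds a borrowed view from (ptr, len) without copying.
///
/// Safety: `ptr` must point to `len` initialized bytes that stay alive and
/// unmoved for as long as the returned slice is used.
pub unsafe fn view<'a>(ptr: *const u8, len: usize) -> &'a [u8] {
    unsafe { std::slice::from_raw_parts(ptr, len) }
}
```

The view starts at the same address as the owned buffer, which is the whole point: nothing is cloned. The same caveat applies as in JS: once the owner reallocates, the old pointer is stale.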
2) SIMD for hot loops: enable simd128
Rust’s std::arch::wasm32 lets you write vector ops directly, or you can rely on auto-vectorization (often good, not magical).
Recipe:
- Build for wasm32-unknown-unknown with RUSTFLAGS="-C target-feature=+simd128".
- Use wide operations for pixel transforms, physics, and DSP-ish tasks.
use core::arch::wasm32::{i8x16_add, v128};

#[target_feature(enable = "simd128")]
pub unsafe fn add_rgba(a: v128, b: v128) -> v128 {
    // Lane-wise wrapping add of 16 bytes (four RGBA pixels at once).
    i8x16_add(a, b)
}
Tip: Keep arrays aligned and use SoA (structure-of-arrays) for large numeric loops to help the compiler vectorize.
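To make the SoA tip concrete, here is a small sketch in portable Rust. The `PixelsSoA` type and its methods are illustrative names, not from any library. Each channel lives in its own contiguous plane, so the inner loop is a simple byte-wise pass that a compiler can often (not always) auto-vectorize.

```rust
/// Structure-of-arrays pixel storage: one contiguous plane per channel.
pub struct PixelsSoA {
    pub r: Vec<u8>,
    pub g: Vec<u8>,
    pub b: Vec<u8>,
    pub a: Vec<u8>,
}

impl PixelsSoA {
    /// Splits interleaved RGBA bytes into four planes.
    /// Trailing bytes that do not form a full pixel are ignored.
    pub fn from_rgba(interleaved: &[u8]) -> Self {
        let n = interleaved.len() / 4;
        let mut px = PixelsSoA {
            r: Vec::with_capacity(n),
            g: Vec::with_capacity(n),
            b: Vec::with_capacity(n),
            a: Vec::with_capacity(n),
        };
        for chunk in interleaved.chunks_exact(4) {
            px.r.push(chunk[0]);
            px.g.push(chunk[1]);
            px.b.push(chunk[2]);
            px.a.push(chunk[3]);
        }
        px
    }

    /// Brightens the color planes with saturating adds; alpha is left alone.
    pub fn brighten(&mut self, amount: u8) {
        for plane in [&mut self.r, &mut self.g, &mut self.b] {
            for v in plane.iter_mut() {
                *v = v.saturating_add(amount);
            }
        }
    }
}
```

With interleaved RGBA the same operation has to skip every fourth byte, which is harder to vectorize cleanly. Whether the split pays for itself depends on how often you convert back for display.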
3) Frame-paced interop: batch events per requestAnimationFrame
Crossing the WASM/JS boundary per mousemove is death by a thousand cuts. Accumulate events in JS, consume them once per frame in Rust.
// JS
const queue = [];
canvas.addEventListener('pointermove', e => queue.push([e.clientX, e.clientY]));

function tick(ts) {
    wasm.consume_events(queue); // one boundary crossing
    queue.length = 0;
    wasm.update_and_draw(ts);
    requestAnimationFrame(tick);
}
requestAnimationFrame(tick);
Why it works: You maintain a ≤16.7 ms budget (60 FPS) and keep work coherent with the browser’s scheduler.
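On the Rust side, the per-frame consumer can be as simple as a flat coordinate buffer that is drained once per tick. This is a sketch with assumed names (`EventQueue`, `drain_latest`); the article does not show what `consume_events` does internally, and collapsing to the latest position is just one possible policy.

```rust
/// Pointer events accumulated between frames.
#[derive(Default)]
pub struct EventQueue {
    // Flat [x0, y0, x1, y1, ...] layout, so a single slice can cross the boundary.
    coords: Vec<f32>,
}

impl EventQueue {
    /// Records one pointer position.
    pub fn push(&mut self, x: f32, y: f32) {
        self.coords.push(x);
        self.coords.push(y);
    }

    /// Number of positions waiting for the next frame.
    pub fn pending(&self) -> usize {
        self.coords.len() / 2
    }

    /// Called once per frame: returns the most recent position (if any)
    /// and clears the queue while keeping its capacity for the next frame.
    pub fn drain_latest(&mut self) -> Option<(f32, f32)> {
        let n = self.coords.len();
        let latest = if n >= 2 {
            Some((self.coords[n - 2], self.coords[n - 1]))
        } else {
            None
        };
        self.coords.clear();
        latest
    }
}
```

For hover and drag feedback only the last position in a frame usually matters. A drawing tool would instead walk every queued point before clearing.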
4) OffscreenCanvas + Worker: move paint off the main thread
When UI logic must stay responsive, render in a Worker with OffscreenCanvas. Rust runs inside the worker; the main thread stays free for input.
// main thread
const worker = new Worker('renderer.js', { type: 'module' });
const off = canvas.transferControlToOffscreen();
worker.postMessage({ canvas: off }, [off]);

// renderer.js (worker)
import init, { draw_frame } from './pkg/app.js';
onmessage = async ({ data }) => {
    const { canvas } = data;
    await init();
    const ctx = canvas.getContext('2d');
    function loop(t) { draw_frame(ctx, t); requestAnimationFrame(loop); }
    requestAnimationFrame(loop);
};
Note: OffscreenCanvas itself does not need cross-origin isolation; SharedArrayBuffer and WASM threads do. Even without those, this is an excellent way to decouple rendering from input.
5) WebGPU for big visuals; fall back to WebGL2
For heavy scenes, skip the CPU. Use wgpu (Rust) targeting the WebGPU backend, which maps to the browser’s GPU API.
Minimal sketch (Rust):
// Pseudocode-ish: the wgpu setup is verbose; keep it once, reuse everywhere.
// The surface-creation call differs across wgpu releases; check the version you pin.
pub async fn init_gpu(canvas: web_sys::HtmlCanvasElement) -> (wgpu::Device, wgpu::Queue) {
    let instance = wgpu::Instance::default();
    let surface = instance.create_surface_from_canvas(&canvas).unwrap();
    // request adapter/device, configure surface...
    // create pipelines/buffers; draw in your RAF loop
    // return device/queue for later commands
    unimplemented!()
}
Why it works: Big charts, image filters, and particle systems move to GPU pipelines; the CPU remains free for UI state.
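6) Allocators for UI churn: prefer arenas/bump for transient state
Frequent short-lived allocations (menus, tooltips, per-frame scratch) punish performance. Use an arena or bump allocator for per-frame scratch, then reset.
use bumpalo::Bump;
use std::cell::RefCell;

// Bump::reset needs &mut self, so the thread-local holds a RefCell.
thread_local! { static SCRATCH: RefCell<Bump> = RefCell::new(Bump::new()); }

pub fn layout_frame() {
    SCRATCH.with(|scratch| {
        let mut bump = scratch.borrow_mut();
        {
            // allocate short-lived structures here
            let _v = bump.alloc([0u8; 64]);
            // ...
        } // every borrow from the arena must end before reset
        bump.reset(); // free all at once at end of frame
    });
}
Payoff: Predictable perf and fewer calls into the default allocator. The scratch stays in linear memory, so it adds nothing for the JS garbage collector to trace.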
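The reset-per-frame idea does not depend on bumpalo. Below is a deliberately simplified std-only sketch (the `FrameScratch` name and API are made up for illustration). Unlike a real bump allocator it hands out one slice at a time and only stores `f32`, but it shows the key property: `reset` drops the contents while keeping the capacity, so steady-state frames never touch the system allocator.

```rust
/// Per-frame scratch space backed by a single Vec.
pub struct FrameScratch {
    buf: Vec<f32>,
}

impl FrameScratch {
    /// Reserves room up front so typical frames never reallocate.
    pub fn with_capacity(cap: usize) -> Self {
        FrameScratch { buf: Vec::with_capacity(cap) }
    }

    /// Hands out `n` zeroed floats from the end of the buffer.
    pub fn alloc(&mut self, n: usize) -> &mut [f32] {
        let start = self.buf.len();
        self.buf.resize(start + n, 0.0);
        &mut self.buf[start..]
    }

    /// Forgets everything allocated this frame; capacity is retained.
    pub fn reset(&mut self) {
        self.buf.clear();
    }

    pub fn used(&self) -> usize {
        self.buf.len()
    }

    pub fn capacity(&self) -> usize {
        self.buf.capacity()
    }
}
```

A typical frame would call `alloc` for layout temporaries, use them, then call `reset` at the end of the tick. If a frame overshoots the reserved capacity the Vec grows once and later frames reuse the larger buffer.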
7) Compute-then-patch: send tiny DOM diffs, not full trees
The DOM is slow because it’s stateful: every write can invalidate style and layout. Compute minimal patches in Rust, then apply them in one go, either via web-sys or with a small JS function like the one here.
#[derive(Clone)]
pub enum Patch { SetText(u32, String), SetAttr(u32, String, String) /* ... */ }

pub fn diff(old: &Ui, new: &Ui, out: &mut Vec<Patch>) {
    // add only what changed
}

// JS apply (assumes each patch arrives as a plain { kind, id, value } object)
function apply(patches) {
    for (const p of patches) {
        switch (p.kind) {
            case 'SetText': nodes[p.id].textContent = p.value; break;
            // ...
        }
    }
}
Mindset: Treat the DOM like a device driver; update in bulk, not piecemeal.
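The article leaves the body of `diff` open. One way to fill it in, as a sketch only: assume `Ui` is a flat list of nodes whose index doubles as the node id, and that both trees have the same shape. The `Node` and `Ui` definitions here are hypothetical; a real UI tree would also need insert and remove patches.

```rust
#[derive(Clone, Debug, PartialEq)]
pub enum Patch {
    SetText(u32, String),
    SetAttr(u32, String, String),
}

/// Hypothetical node: text content plus a small attribute list.
#[derive(Clone, Default)]
pub struct Node {
    pub text: String,
    pub attrs: Vec<(String, String)>,
}

/// Hypothetical UI model: a flat list, where the index is the node id.
pub struct Ui {
    pub nodes: Vec<Node>,
}

/// Appends only the changes needed to turn `old` into `new`.
/// Assumes both have the same node count; extra nodes on either side are ignored.
pub fn diff(old: &Ui, new: &Ui, out: &mut Vec<Patch>) {
    for (id, (o, n)) in old.nodes.iter().zip(&new.nodes).enumerate() {
        let id = id as u32;
        if o.text != n.text {
            out.push(Patch::SetText(id, n.text.clone()));
        }
        for (key, value) in &n.attrs {
            // Emit a patch only if this exact key/value pair was not already set.
            let unchanged = o.attrs.iter().any(|(k, v)| k == key && v == value);
            if !unchanged {
                out.push(Patch::SetAttr(id, key.clone(), value.clone()));
            }
        }
    }
}
```

Reusing the same `out` vector across frames (clearing it after the patches are applied) keeps the diff itself allocation-light, in the spirit of Technique 6.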
8) Build flags & guardrails: stability is a feature
You’ll rarely “optimize” your way past bad builds. Set sane defaults.
- panic = "abort" (no unwinding in hot paths)
- codegen-units = 1, lto = "fat"; opt-level = "z" or "s" for size-critical builds, 3 for speed
- Enable wasm-opt -O3 --enable-simd post-build if it’s in your toolchain
- Use f16 textures (e.g. rgba16float) with WebGPU where visuals allow it; this halves bandwidth relative to f32
- Track max_frame_time and update_time and fail CI if frame pacing regresses
Cargo.toml (release profile):
[profile.release]
opt-level = 3
lto = "fat"
codegen-units = 1
panic = "abort"
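The frame-pacing guardrail from the list above needs very little code. Here is one possible shape, with assumed names (`FrameStats`, `within_budget`); how the samples reach CI (a headless browser run, a benchmark harness) is left open, since the article does not specify it.

```rust
/// Tracks the worst frame seen during a run.
pub struct FrameStats {
    max_frame_ms: f64,
    frames: u32,
}

impl FrameStats {
    pub fn new() -> Self {
        FrameStats { max_frame_ms: 0.0, frames: 0 }
    }

    /// Records the duration of one frame in milliseconds.
    pub fn record(&mut self, frame_ms: f64) {
        self.max_frame_ms = self.max_frame_ms.max(frame_ms);
        self.frames += 1;
    }

    /// Worst-case frame time so far.
    pub fn max_frame_time(&self) -> f64 {
        self.max_frame_ms
    }

    /// True if at least one frame was recorded and none exceeded the budget.
    pub fn within_budget(&self, budget_ms: f64) -> bool {
        self.frames > 0 && self.max_frame_ms <= budget_ms
    }
}
```

A CI job could replay a recorded interaction, feed each frame duration into `record`, and fail when `within_budget(16.7)` is false. Gating on the maximum rather than the mean is what catches a single long hitch.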
A tiny example: high-FPS image processing tool
Goal: drag a slider, watch a 4K image sharpen in real time.
- Pixels live in a single Vec<u8> in WASM.
- A SharedArrayBuffer (when allowed) carries control messages; otherwise, frame-paced queues.
- The sharpen kernel is SIMD’d (Technique 2).
- Rendering happens in a Worker on an OffscreenCanvas (Technique 4).
- UI buttons live on the main thread; we send one patch batch per frame (Technique 7).
- Build with lto, panic=abort (Technique 8).
Result: 60 FPS on a mid-range laptop, zero main-thread jank, no “fuzzy” UI after resize.
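For reference, the scalar version of a sharpen kernel is short. This sketch assumes a single-channel (grayscale) image and the common 3x3 kernel with 5 at the center and -1 on the four direct neighbors; the article does not say which kernel or pixel format the tool uses, so treat both as placeholders. It is the baseline one would then vectorize per Technique 2.

```rust
/// 3x3 sharpen on a row-major, single-channel image.
/// Border pixels are copied unchanged to keep the sketch short.
pub fn sharpen(src: &[u8], width: usize, height: usize) -> Vec<u8> {
    assert_eq!(src.len(), width * height);
    let mut dst = src.to_vec();
    if width < 3 || height < 3 {
        return dst;
    }
    for y in 1..height - 1 {
        for x in 1..width - 1 {
            let i = y * width + x;
            let center = src[i] as i32;
            // Sum of the left, right, upper and lower neighbors.
            let neighbors = src[i - 1] as i32
                + src[i + 1] as i32
                + src[i - width] as i32
                + src[i + width] as i32;
            // Kernel weights sum to 1, so flat regions are left untouched.
            dst[i] = (5 * center - neighbors).clamp(0, 255) as u8;
        }
    }
    dst
}
```

Reading from `src` and writing to a separate `dst` matters: sharpening in place would feed already-modified pixels back into later neighbors.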
Code nuggets you’ll reuse
Calling requestAnimationFrame from Rust once per frame
use wasm_bindgen::{prelude::*, JsCast};
use web_sys::window;

pub fn raf(f: &Closure<dyn FnMut(f64)>) {
    window().unwrap().request_animation_frame(f.as_ref().unchecked_ref()).unwrap();
}
Storing a persistent JS callback (avoid alloc every frame)
thread_local! {
    static RAF: std::cell::RefCell<Option<Closure<dyn FnMut(f64)>>> = Default::default();
}
Safe-ish pointer export
#[wasm_bindgen]
pub struct Buffer { data: Vec<u8> }

#[wasm_bindgen]
impl Buffer {
    pub fn ptr(&self) -> *const u8 { self.data.as_ptr() }
    pub fn len(&self) -> usize { self.data.len() }
}
(The wrapper type keeps ownership in Rust and lifetimes clear. The pointer is valid only until the Vec reallocates or is dropped, so re-read it after any mutation.)
Performance heuristics (pin this)
- Move less: it’s almost always the copies.
- Frame-gate work: everything funnels through requestAnimationFrame.
- Measure: log worst-case frame time, not just averages.
- Prefer SoA + SIMD: predictable, cache-friendly, vectorizable.
- One bulk DOM patch: treat JS like an I/O boundary.
- Workers for paint: main thread = input & accessibility.
A quick story
We rebuilt a dashboard heatmap that jittered on modest laptops. The fix wasn’t a single trick; it was a sequence: moved decoding and colorize to Rust with SIMD; rendered via WebGPU; passed the image buffer by view; and applied UI updates once per frame. Same design, same dataset, but the new build stayed at 60 FPS while cutting CPU usage by more than half. Users noticed. They didn’t know why. They just stopped waiting.
Conclusion
Rust + WASM can absolutely feel native, but only if you treat the browser like a realtime system: control copies, align with frames, keep the GPU busy, and treat the DOM as a slow device. Start with one or two techniques above (zero-copy views and frame-paced interop are the easiest wins), then layer Workers, SIMD, and WebGPU as your features demand.
Have a stuttery interaction or chart that just won’t hit 60 FPS? Share the hot path and I’ll suggest a plan.