Headless vs Headful: When Browser Use's Agent Goes Silent

You launch your agent. The logs show the LLM call succeeded. The agent reports "Step 1 completed." But nothing actually happened — the page never loaded, the element was never clicked. No error. No crash. Just silence. This is the most common Browser Use production failure mode, and nine times out of ten, it's headless Chrome.

Headless Chrome runs without a display server or window manager, and it typically renders through a software path instead of the GPU. For most static pages this is fine: the DOM still loads and JavaScript still executes. But many modern JS-heavy sites (React or Next.js apps, and especially anything behind bot protection such as Cloudflare) depend on signals that differ without a real desktop: WebGL and canvas output used for fingerprinting, window and screen geometry, and visibility checks such as Intersection Observer. In headless mode these APIs can return null or produce subtly different values that trip the site's bot-detection heuristics. The page loads, but it loads a degraded version that doesn't match what the Browser Use agent's LLM expects to see, so the agent waits for an element that never renders.
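To make the "subtly different values" concrete, here is a minimal sketch of the kind of check a detection script might run. It assumes you have already collected a few values from the page by whatever means your stack evaluates JavaScript; the dictionary keys (`user_agent`, `webgl_renderer`, `outer_width`, `outer_height`) and the function name are hypothetical, and the three signals are commonly cited examples, not a complete or authoritative list.

```python
from typing import List, Mapping


def headless_signals(probe: Mapping[str, object]) -> List[str]:
    """Return human-readable reasons a probe result looks headless.

    `probe` is a hypothetical dict of values read from the page, e.g.
    navigator.userAgent, the unmasked WebGL renderer string, and
    window.outerWidth / window.outerHeight.
    """
    signals: List[str] = []

    # Headless Chrome has traditionally advertised itself in the user agent.
    if "HeadlessChrome" in str(probe.get("user_agent", "")):
        signals.append("user agent contains 'HeadlessChrome'")

    # A missing key is treated the same as a null WebGL context.
    renderer = probe.get("webgl_renderer")
    if renderer is None:
        signals.append("no WebGL context")
    elif "swiftshader" in str(renderer).lower():
        signals.append("software WebGL renderer (SwiftShader)")

    # Without a real window, the outer window size may be reported as zero.
    if probe.get("outer_width") == 0 or probe.get("outer_height") == 0:
        signals.append("window has zero outer dimensions")

    return signals
```

Running the same probe once in headless mode and once on a virtual display is a quick way to see which of these values actually change for your setup.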

The fix is a virtual display. Xvfb (X Virtual Framebuffer) creates a fake screen in memory that Chrome renders to exactly as it would on a real monitor. Pair it with a lightweight window manager like Fluxbox, and Chrome has everything it expects: a display server, window decorations, and correct screen geometry. The agent now runs in "headful" mode with no physical monitor attached: Chrome behaves as if it were on a real desktop, so the JS APIs above return the values a desktop browser would. The cost is roughly 200 MB more RAM per agent than pure headless mode. For a single-agent deployment this is negligible. At 10+ concurrent agents it adds up (about 2 GB at 10 agents), but it's still cheaper than debugging silent failures in production.
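The RAM cost is easy to budget for. The sketch below multiplies the rough 200 MB per-agent figure quoted above by the number of concurrent agents; the function name is made up for illustration, and the per-agent overhead is an estimate you should replace with your own measurements.

```python
def extra_ram_mb(agents: int, overhead_mb_per_agent: int = 200) -> int:
    """Estimate the extra RAM (in MB) headful mode needs compared to headless."""
    if agents < 0:
        raise ValueError("agents must be non-negative")
    return agents * overhead_mb_per_agent


# Print the estimated overhead for a few fleet sizes.
for n in (1, 10, 50):
    print(f"{n:>2} agents: ~{extra_ram_mb(n)} MB extra")
```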

The trade-off between headless and headful is ultimately about failure modes. Headless fails silently: your agent reports success while accomplishing nothing. Headful fails visibly: when a run goes wrong, you can attach a VNC server (x11vnc, for example) to the virtual display and see exactly what the browser was showing. For production pipelines where each task costs real money (LLM API calls, proxy bandwidth, engineering time), the extra 200 MB of RAM is the cheapest insurance you can buy.

# Start Xvfb on virtual display :99 (1920x1080, 24-bit color)
Xvfb :99 -screen 0 1920x1080x24 &

# Optional: start a lightweight window manager for proper window behavior
fluxbox -display :99 &

# Tell Chrome to use the virtual display
export DISPLAY=:99

# Now launch your Browser Use agent — it runs headful on the virtual screen
python agent.py
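If `agent.py` starts before Xvfb is up, or `DISPLAY` was never exported, Chrome has no screen to draw on. A small pre-flight check at the top of the script can catch that early. This is a sketch that assumes Linux, where a local X server normally exposes a Unix socket at `/tmp/.X11-unix/X<n>`; the helper names are hypothetical and nothing here is part of Browser Use.

```python
import os
import re
from typing import Optional


def x11_socket_path(display: str) -> str:
    """Map a local DISPLAY value such as ':99' or ':99.0' to its Unix socket path."""
    match = re.fullmatch(r"(?:unix)?:(\d+)(?:\.\d+)?", display)
    if match is None:
        raise ValueError(f"not a local X display: {display!r}")
    return f"/tmp/.X11-unix/X{match.group(1)}"


def display_is_up(display: Optional[str] = None) -> bool:
    """Return True if the display (default: $DISPLAY) is local and its socket exists."""
    if display is None:
        display = os.environ.get("DISPLAY", "")
    if not display:
        return False
    try:
        return os.path.exists(x11_socket_path(display))
    except ValueError:
        # Remote or malformed displays (e.g. 'localhost:10.0') are not checked here.
        return False


print("virtual display ready:", display_is_up())
```

Calling `display_is_up()` before launching the browser and exiting with a clear message when it returns `False` turns one more silent failure into a visible one.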
