feat(host): X11 capture backend + shared pipeline extraction

Extract the display-agnostic encode/mux tail out of wayland.rs into a new
host/pipeline.rs: CaptureHandle + lifecycle, audio routing setup, the gst
arg builder, the spawn, and Serve::bind now live there. Backends supply
only their video-source element args plus a post-spawn hook (Wayland uses
it to close its leaked pipewire fd; X11 passes a no-op). capture.rs
collapses to a thin dispatcher; its CaptureHandle enum is gone.
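
A rough sketch of that split, for orientation only. VideoSource,
pipeline_args and run_post_spawn are hypothetical names, not the
identifiers used in host/pipeline.rs; only the ximagesrc properties come
from this commit.

```rust
/// What a backend hands to the shared tail (hypothetical shape).
struct VideoSource {
    /// gst-launch tokens for the source element, e.g. `ximagesrc ...`.
    args: Vec<String>,
    /// Runs once, right after the gst process has been spawned.
    post_spawn: Box<dyn FnOnce()>,
}

/// X11: ximagesrc with the properties named in this commit, no-op hook.
fn x11_source() -> VideoSource {
    let args = ["ximagesrc", "use-damage=false", "show-pointer=true"];
    VideoSource {
        args: args.iter().map(|s| s.to_string()).collect(),
        post_spawn: Box::new(|| {}),
    }
}

/// Join the backend's source tokens to the shared tail with a `!` link.
fn pipeline_args(source: &VideoSource, shared_tail: &[&str]) -> Vec<String> {
    let mut out = source.args.clone();
    out.push("!".to_string());
    out.extend(shared_tail.iter().map(|s| s.to_string()));
    out
}

/// Consume the source and run its hook; call this after the spawn succeeds.
fn run_post_spawn(source: VideoSource) {
    (source.post_spawn)();
}
```

A Wayland backend would put its fd cleanup in post_spawn; the shared
code never needs to know which backend it is driving.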

Add host/x11.rs: ximagesrc (use-damage=false show-pointer=true), whole
root window by default or a single window via --window (xwininfo
click-picker → xid). x11rb reads geometry for an info log, justifying the
previously-vestigial dep. No portal, no fd dance — capture starts
silently when the first viewer connects (the ticket is the access
control). Viewer is display-agnostic and unchanged.

Wire --no-hwencode for real (was a no-op): the shared tail now selects
x264enc(tune=zerolatency,ultrafast)/I420 vs vah264enc/NV12 and switches
the videoconvert target format to match. Applies to both backends.
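
Roughly, the selection looks like the sketch below. encoder_tail is a
hypothetical name and the exact token layout is an assumption; the
element names, formats and x264enc settings are the ones listed above.

```rust
/// Tokens for the convert-and-encode part of the shared tail.
/// `no_hwencode == true` picks software x264 with I420 input;
/// otherwise VAAPI H.264 with NV12 input.
fn encoder_tail(no_hwencode: bool) -> Vec<&'static str> {
    if no_hwencode {
        vec![
            "videoconvert", "!",
            "video/x-raw,format=I420", "!",
            "x264enc", "tune=zerolatency", "speed-preset=ultrafast",
        ]
    } else {
        vec![
            "videoconvert", "!",
            "video/x-raw,format=NV12", "!",
            "vah264enc",
        ]
    }
}
```

Keeping the caps filter and the encoder in one branch means the
videoconvert target cannot drift out of sync with the encoder choice.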

deps.rs: check_host_binaries now takes &HostOpts and checks shared
elements for both backends, encoder by --no-hwencode, source per backend
(pipewiresrc/ximagesrc), and xwininfo only when X11 + --window. Install
hints added for x264enc, ximagesrc, xwininfo.
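
As an illustration of the selection rules (not the actual deps.rs code):
the HostOpts fields, the Backend enum and the shared element list below
are assumptions made for the sketch.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum Backend {
    Wayland,
    X11,
}

/// Hypothetical subset of the host options that the check looks at.
struct HostOpts {
    backend: Backend,
    no_hwencode: bool,
    window: bool,
}

/// GStreamer elements the host needs for the given options.
fn required_elements(opts: &HostOpts) -> Vec<&'static str> {
    // Shared tail elements (illustrative list, same for both backends).
    let mut out = vec!["videoconvert", "videorate", "h264parse", "mpegtsmux"];
    // Encoder depends on --no-hwencode.
    out.push(if opts.no_hwencode { "x264enc" } else { "vah264enc" });
    // Source element depends on the backend.
    out.push(match opts.backend {
        Backend::Wayland => "pipewiresrc",
        Backend::X11 => "ximagesrc",
    });
    out
}

/// Extra helper binaries: xwininfo only for X11 with --window.
fn required_binaries(opts: &HostOpts) -> Vec<&'static str> {
    if opts.backend == Backend::X11 && opts.window {
        vec!["xwininfo"]
    } else {
        Vec::new()
    }
}
```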

Verified: warning-free build; smoke test still passes (tail unchanged);
ximagesrc + both encoder tails produce mpv-decodable H.264 against an
Xwayland root. Interactive cross-machine end-to-end pending.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-23 20:39:16 -04:00
parent 0c9d8eb9f9
commit cd127a9704
7 changed files with 474 additions and 247 deletions
+34 -9
@@ -16,8 +16,10 @@ without spinning up a Discord call or fighting with NAT.
Working:
- Wayland capture via the screencast portal (KDE Plasma 6 confirmed; other
Wayland compositors with the portal should work but are untested)
+- X11 capture via `ximagesrc` (whole screen, or a single window with
+`--window`); selected automatically, or forced with `--display-server x11`
- VAAPI H.264 encode in GStreamer (RDNA3 confirmed; other VAAPI-capable
-GPUs should work)
+GPUs should work), with a software x264 fallback via `--no-hwencode`
- Audio capture of the default sink's monitor, with optional per-app
routing (`--app <name>`) and microphone mixing (`--mic`)
- `--repair` cleanup of orphaned PipeWire state left by a crashed host
@@ -30,8 +32,10 @@ Working:
`~/.config/pixelpass/config.toml` and used to auto-size the default
viewer cap
-Not yet working:
-- X11 capture (stubbed, returns an error — Phase 2 follow-up)
+Not yet built (deferred, not blocking):
+- Per-monitor selection on a multi-monitor X11 host — `ximagesrc` grabs the
+whole root canvas; single-monitor cropping needs xrandr region coords
+- `use-damage=true` CPU optimization for the X11 capture path
## Quick start
@@ -68,7 +72,7 @@ pixelpass <ticket>
## Requirements
-- Linux (Wayland session for now; X11 stubbed)
+- Linux (Wayland or X11; the backend is autodetected)
- A VAAPI-capable GPU and the right driver:
- AMD: `libva-mesa-driver`
- Intel: `intel-media-driver` (modern iGPUs) or `intel-vaapi-driver` (older)
@@ -117,10 +121,10 @@ cargo build --release
```
Host Viewer
──── ──────
-Wayland portal (ashpd) ──> PipeWire fd
-gst-launch: pipewiresrc -> videorate -> vah264enc ->
+Wayland portal (ashpd) ──> PipeWire fd ─┐ (X11: ximagesrc, no portal)
+gst-launch: <source> -> videorate -> vah264enc/x264enc ->
h264parse -> mpegtsmux
(audio: pulsesrc <sink>.monitor ->
avenc_aac -> aacparse ─┘)
@@ -142,7 +146,7 @@ cargo build --release
The viewer's player connects to a localhost HTTP server, which is
just one end of the iroh tunnel. The host's HTTP server sits on the
other end and streams GStreamer's stdout (an MPEG-TS containing
-hardware-encoded H.264 + AAC) through with no demux or remux.
+H.264 + AAC) through with no demux or remux.
iroh handles NAT traversal: direct UDP if hole-punching succeeds,
relay path otherwise. Both have been verified end-to-end.
@@ -201,6 +205,27 @@ If a host crashes mid-session it can leave orphaned `pixelpass_capture_*`
null-sinks and their paired loopbacks loaded in PipeWire. Run
`pixelpass --repair` to unload them and exit.
## Display server
The capture backend is autodetected from the environment
(`WAYLAND_DISPLAY` → Wayland, else `DISPLAY` → X11, else
`XDG_SESSION_TYPE`). Override it with `--display-server wayland|x11` — for
example to force the X11 path while running inside a Wayland session (an
Xwayland or Xephyr `DISPLAY`).
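
The detection order above can be sketched in Rust as follows. This is an illustration, not the project's code: `Backend`, `detect`, and `resolve` are hypothetical names, and treating an empty variable as unset is an assumption.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum Backend {
    Wayland,
    X11,
}

/// Pure detection: WAYLAND_DISPLAY first, then DISPLAY, then XDG_SESSION_TYPE.
fn detect(
    wayland_display: Option<&str>,
    display: Option<&str>,
    session_type: Option<&str>,
) -> Option<Backend> {
    // Assumption: an empty value counts as "not set".
    let is_set = |v: Option<&str>| v.map_or(false, |s| !s.is_empty());
    if is_set(wayland_display) {
        return Some(Backend::Wayland);
    }
    if is_set(display) {
        return Some(Backend::X11);
    }
    match session_type {
        Some("wayland") => Some(Backend::Wayland),
        Some("x11") => Some(Backend::X11),
        _ => None,
    }
}

/// An explicit `--display-server` value wins over detection.
fn resolve(forced: Option<Backend>) -> Option<Backend> {
    let var = |key: &str| std::env::var(key).ok();
    forced.or_else(|| {
        detect(
            var("WAYLAND_DISPLAY").as_deref(),
            var("DISPLAY").as_deref(),
            var("XDG_SESSION_TYPE").as_deref(),
        )
    })
}
```

Inside a Wayland session both `WAYLAND_DISPLAY` and `DISPLAY` are usually set, so `detect` returns `Wayland`; that is the case where `--display-server x11` is needed to reach the X11 path.
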
- **Wayland** goes through the screencast portal: a "Share Screen?" dialog
appears when the first viewer connects, and you pick the monitor (or
window, with `--window`) there.
- **X11** uses `ximagesrc` and starts silently when the first viewer
connects — the ticket is the access control, there's no portal gate.
`--window` runs an `xwininfo` picker (click the window you want to
share); without it the whole root window is captured.
Encoding is hardware VAAPI (`vah264enc`) by default. `--no-hwencode`
switches to software x264 (`x264enc tune=zerolatency`) for hosts without a
working VAAPI H.264 entrypoint — higher CPU, no GPU needed. This applies
to both backends.
## Multi-viewer
One gst capture pipeline fans out to N concurrent viewers via a