🔭 Experimental

Vision Lab

Point your camera, ask a question. A 1.6 GB vision-language model (Moondream2, with SmolVLM-500M fallback) runs entirely on your device via WebGPU. Frames never leave the browser.
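As a rough illustration of how a page like this might check for WebGPU and fall back from the larger model to the smaller one, here is a minimal sketch. The model identifiers, the `hasWebGpu` and `loadWithFallback` helpers, and the rule "fall back when the first load fails" are all assumptions for illustration; the page does not say when the SmolVLM-500M fallback is used.

```typescript
// Hypothetical model identifiers; the page does not name the exact repos.
type ModelChoice = "moondream2" | "smolvlm-500m";

// Minimal shape of the part of `navigator.gpu` this sketch relies on.
interface GpuLike {
  requestAdapter(): Promise<unknown>;
}

// Returns true when a WebGPU adapter can be obtained.
async function hasWebGpu(gpu: GpuLike | undefined): Promise<boolean> {
  if (!gpu) return false; // browser does not expose WebGPU at all
  try {
    return (await gpu.requestAdapter()) != null; // null means no usable adapter
  } catch {
    return false;
  }
}

// Tries each loader in order and returns the first one that succeeds.
// Assumption: a failed load of the primary model triggers the fallback.
async function loadWithFallback<T>(
  loaders: Array<[ModelChoice, () => Promise<T>]>,
): Promise<{ name: ModelChoice; model: T }> {
  let lastError: unknown = new Error("no loaders given");
  for (const [name, load] of loaders) {
    try {
      return { name, model: await load() };
    } catch (err) {
      lastError = err; // remember why this one failed, then try the next
    }
  }
  throw lastError;
}
```

In a browser, the check could be called as `hasWebGpu((globalThis as any).navigator?.gpu)` before any download starts.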

One-time setup

The model is downloaded from Hugging Face's CDN once.
Cached in your browser's persistent storage (Origin Private File System).
Subsequent visits load in ~3 seconds with no network.
Frames are processed on-device. Only the model's text answer leaves your browser, and only when you press "Find compatible skills".
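The download-once, load-from-cache behaviour listed above could look roughly like the sketch below. `ByteStore`, `getModelBytes`, and `opfsStore` are hypothetical names, not the page's actual code. Holding the whole file in one `Uint8Array` is a simplification; a real implementation for a model this large would more likely stream to disk.

```typescript
// Minimal byte store, so the caching logic does not depend on browser globals.
interface ByteStore {
  read(name: string): Promise<Uint8Array | null>;
  write(name: string, bytes: Uint8Array): Promise<void>;
}

// Returns cached bytes if present; otherwise downloads once and stores them.
async function getModelBytes(
  name: string,
  store: ByteStore,
  download: () => Promise<Uint8Array>,
): Promise<{ bytes: Uint8Array; fromCache: boolean }> {
  const cached = await store.read(name);
  if (cached) return { bytes: cached, fromCache: true };
  const bytes = await download();
  await store.write(name, bytes);
  return { bytes, fromCache: false };
}

// Store backed by the Origin Private File System (browser only).
// `any` is used so the sketch compiles without DOM typings.
function opfsStore(): ByteStore {
  const root = (): Promise<any> =>
    (globalThis as any).navigator.storage.getDirectory();
  return {
    async read(name) {
      try {
        const handle = await (await root()).getFileHandle(name);
        const file = await handle.getFile();
        return new Uint8Array(await file.arrayBuffer());
      } catch {
        return null; // not cached yet, or OPFS is unavailable
      }
    },
    async write(name, bytes) {
      const handle = await (await root()).getFileHandle(name, { create: true });
      // createWritable() is not available in every browser on the main thread.
      const writable = await handle.createWritable();
      await writable.write(bytes);
      await writable.close();
    },
  };
}
```

Keeping the store behind a small interface means the cache logic can be exercised with an in-memory map outside the browser.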

Checking your browser…

Model

Bandwidth tip: the download is large. On cellular it'll eat your data plan; prefer Wi-Fi.
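One way a page could decide when to show a tip like this is the Network Information API (`navigator.connection`), which is only available in some browsers. The sketch below is an assumption about how such a check might be written, and `shouldWarnBeforeDownload` is a hypothetical helper, not part of the page.

```typescript
// Subset of the NetworkInformation fields this sketch reads.
interface ConnectionLike {
  type?: string; // e.g. "wifi", "cellular", "ethernet"
  saveData?: boolean; // true when the user has asked for reduced data usage
}

// Returns true when the large download deserves an explicit warning.
function shouldWarnBeforeDownload(conn: ConnectionLike | undefined): boolean {
  if (!conn) return false; // API unavailable: nothing can be inferred
  return conn.type === "cellular" || conn.saveData === true;
}
```

In a browser this could be called as `shouldWarnBeforeDownload((globalThis as any).navigator?.connection)`; where the API is missing, showing the tip unconditionally is the safer default.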

Loading model…

Answer