Find the screenshot the way you'd describe it.

Cairn captures a screenshot together with a voice tag. Later, type how you'd describe it — "the pricing chart Ivan sent me" — and Cairn finds it. Voice runs through Whisper on-device. Nothing leaves your Mac.

Download on the Mac App Store
macOS 14+ · Apple Silicon · everything stays on your Mac
The problem

You took the screenshot. Future-you can't find it.

Twenty a week. Dashboards, Linear tickets, Figma frames, Slack threads, slides from a deck. They land in ~/Desktop with the most useless filenames ever invented.

Screenshot 2026-05-18 at 14.22.07.png · 812 KB
Screenshot 2026-05-18 at 14.31.49.png · 1.1 MB
Screenshot 2026-05-15 at 09.14.02.png · 644 KB
Screenshot 2026-05-13 at 11.42.13.png · 901 KB
Screenshot 2026-05-12 at 17.05.58.png · 733 KB
1,279 more files

Now find the one with the chart Ivan sent on Tuesday.

How it works

Capture. Tag. Find.

One shortcut. A quick voice note. A search bar that understands the way you actually talk about things.

01 · CAPTURE

Hit the shortcut.

Cairn captures the screen, runs OCR on what's visible, and starts listening — all in one move. No app to open, no window to focus.

+ + + R
02 · TAG

Say what it is.

Say it out loud. "Pricing chart from Ivan." Cairn transcribes it on-device with Whisper and stores it next to the image. Don't feel like talking? Type the tag instead, from the same shortcut.

03 · FIND

Type how you'd describe it.

You don't have to remember your exact words. Cairn matches on meaning, not on string equality — across your voice tag, the on-screen text, and AI-generated context. Said "pricing chart from ivan"? Searching "that revenue chart Ivan flagged" still finds it. Misspellings are fine. Phrasing is fine. Yesterday is fine. Last quarter is fine.
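One way to picture "matching on meaning" across several fields: embed the query and each field, take the cosine similarity per field, and blend the results. The Python sketch below is only an illustration under stated assumptions. The three-number vectors stand in for real embeddings, and the field names, weights, and the `score_capture` and `rank` helpers are hypothetical, not Cairn's actual internals.

```python
import math

# Hypothetical per-field weights (assumption): the voice tag counts most.
FIELD_WEIGHTS = {"voice_tag": 0.6, "ocr_text": 0.25, "ai_context": 0.15}


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def score_capture(query_vec, field_vecs, weights=FIELD_WEIGHTS):
    """Weighted sum of per-field similarities; fields a capture lacks add nothing."""
    return sum(
        weight * cosine(query_vec, field_vecs[field])
        for field, weight in weights.items()
        if field in field_vecs
    )


def rank(query_vec, captures):
    """Return capture titles ordered from strongest to weakest match."""
    scored = [(score_capture(query_vec, c["fields"]), c["title"]) for c in captures]
    return [title for _, title in sorted(scored, reverse=True)]


# Toy vectors, made up for illustration only.
captures = [
    {"title": "pricing chart from ivan",
     "fields": {"voice_tag": [0.9, 0.1, 0.0], "ocr_text": [0.7, 0.3, 0.1]}},
    {"title": "sentry issue",
     "fields": {"voice_tag": [0.0, 0.2, 0.9], "ocr_text": [0.1, 0.1, 0.8]}},
]
query = [0.8, 0.2, 0.1]  # stands in for "that revenue chart Ivan flagged"
print(rank(query, captures))
```

Because the comparison happens between vectors rather than strings, a query never has to share a single word with the tag to land near it.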

See it

Browse today. Find weeks later.

A clean library you can scan like a notebook — and a search that matches on meaning, not on string equality.

The Cairn viewer. A sidebar groups today's captures with thumbnails and voice-tag titles like 'Cats for the shelter project', 'The sentry issue, John sent me', 'The hooks for cat shelter project's landing'. The right pane shows the selected capture — a grid of cat photographs from a research session.
Browse like a notebook. Voice tags become the titles. Today, yesterday, last week — the way you'd actually scan back through a week of work.
The Cairn viewer with the search 'ideas for shelter project' typed in. The sidebar lists five matching captures ranked by semantic similarity, each labelled Vivid, Clear, or Hazy with a score from 0.62 down to 0.13. The selected capture, 'The prototype of the shelter project,' shows a screenshot of a cat-shelter mobile app being designed.
Find by meaning, not by string. Type how you'd describe it. Cairn ranks matches Vivid · Clear · Hazy so you see the strongest hit first and the “maybe this?” ones below.
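A rough Python sketch of how similarity scores could be turned into the Vivid · Clear · Hazy labels. The cut-off values and the `band` and `label_results` helpers are assumptions made for illustration; the page only shows scores running from 0.62 down to 0.13 and does not say where the boundaries sit.

```python
# Hypothetical cut-offs (assumption): the real boundaries are not published.
VIVID_MIN = 0.50
CLEAR_MIN = 0.30


def band(score: float) -> str:
    """Map a similarity score to a confidence label."""
    if score >= VIVID_MIN:
        return "Vivid"
    if score >= CLEAR_MIN:
        return "Clear"
    return "Hazy"


def label_results(results):
    """Sort (title, score) pairs best-first and attach a label to each."""
    ranked = sorted(results, key=lambda pair: pair[1], reverse=True)
    return [(title, score, band(score)) for title, score in ranked]


# Placeholder titles; 0.62 and 0.13 are the end points shown in the viewer.
for title, score, label in label_results(
    [("capture B", 0.13), ("capture A", 0.62), ("capture C", 0.41)]
):
    print(f"{label:<5} {score:.2f}  {title}")
```

Sorting first and labeling second keeps the strongest hit at the top regardless of where the cut-offs land.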
The objection

Talking to your screen still feels weird.

So don't. The same shortcut takes voice or text — whatever the moment is for. Open-plan office, headphones in, AirPods in a meeting, kid asleep in the next room. Cairn doesn't care which mode you used; the tag is the tag.

On-device

Cairn is the deliberate, local-only alternative to passive recorders.

No accounts. No cloud transcription. No cloud embeddings. No telemetry. The app runs inside the macOS sandbox and only writes to its own container.

AI outputs (transcripts and image tags) are model-generated and can be inaccurate. See the Privacy Policy for details.

  • CAPTURE: Native macOS screen capture, via ScreenCaptureKit.
  • VOICE → TEXT: OpenAI's Whisper model, running on-device. Vocabulary-biased toward your existing tags so jargon transcribes accurately.
  • ON-SCREEN TEXT: Apple Vision OCR, also local.
  • SEMANTIC INDEX: all-MiniLM-L6-v2 embeddings, 384-dim, stored in sqlite-vec. Voice-tag weight bumped at query time.
  • STORAGE: SQLite + your PNGs, inside the macOS App Sandbox container. Move your Mac, your library moves with it.
  • NETWORK: Zero outbound after first launch. One-time model download from Hugging Face on first run; nothing else leaves your Mac.
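The vocabulary biasing mentioned in the list can be pictured with a common Whisper technique: feed the decoder a short text prompt made of words you already use. How Cairn actually does this is not documented here, so the Python sketch below is an assumption from top to bottom. The stopword list, the character budget, and the `build_vocab_prompt` helper are all hypothetical.

```python
import re
from collections import Counter

# Small, hypothetical stopword list; a real one would be longer.
STOPWORDS = {"the", "a", "an", "of", "for", "from", "to", "me", "and", "in", "on"}


def build_vocab_prompt(tags, max_chars=200):
    """Join the most frequent non-stopword terms from past tags into a short hint.

    Words are kept in their original casing, ordered by frequency, and the
    result never exceeds max_chars characters.
    """
    counts = Counter()
    for tag in tags:
        for word in re.findall(r"[A-Za-z][A-Za-z0-9'-]*", tag):
            if word.lower() not in STOPWORDS:
                counts[word] += 1

    chosen, length = [], 0
    for word, _ in counts.most_common():
        extra = len(word) + (2 if chosen else 0)  # 2 accounts for ", "
        if length + extra > max_chars:
            break
        chosen.append(word)
        length += extra
    return ", ".join(chosen)


tags = [
    "Pricing chart from Ivan",
    "The sentry issue, John sent me",
    "Cats for the shelter project",
]
print(build_vocab_prompt(tags))
```

With the open-source `openai-whisper` Python package, a string like this can be passed as `initial_prompt` to `transcribe`, which nudges the model toward those spellings. Keeping the hint short matters because the prompt shares the decoder's limited context with the transcript itself.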
~ · zsh
# after first launch (model download done):
$ nettop -p $(pgrep Cairn)
(no connections)
$ lsof -a -i -P -p $(pgrep Cairn)
(no entries)
$ du -sh ~/Library/Containers/software.cairn.app
412M    ~/Library/Containers/software.cairn.app
# your screenshots, your transcripts,
# your embeddings. all in there. nowhere else.
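To make "SQLite + your PNGs" concrete, here is a minimal Python sketch of a local library that keeps each capture's tag, OCR text, and 384-dim embedding in one SQLite file. It uses the standard-library `sqlite3` module with a brute-force scan in place of sqlite-vec, and the table layout, column names, and helper functions are assumptions for illustration, not Cairn's real schema.

```python
import sqlite3
import struct

DIM = 384  # embedding size of all-MiniLM-L6-v2, as listed above


def pack(vec):
    """Serialize a list of floats to little-endian float32 bytes."""
    return struct.pack(f"<{len(vec)}f", *vec)


def unpack(blob):
    """Inverse of pack()."""
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))


def open_library(path=":memory:"):
    """Open (or create) the hypothetical captures table."""
    db = sqlite3.connect(path)
    db.execute(
        """CREATE TABLE IF NOT EXISTS captures (
               id        INTEGER PRIMARY KEY,
               png_path  TEXT NOT NULL,
               voice_tag TEXT NOT NULL,
               ocr_text  TEXT,
               embedding BLOB NOT NULL
           )"""
    )
    return db


def add_capture(db, png_path, voice_tag, ocr_text, embedding):
    """Insert one capture and return its row id."""
    if len(embedding) != DIM:
        raise ValueError(f"expected {DIM} dimensions, got {len(embedding)}")
    cur = db.execute(
        "INSERT INTO captures (png_path, voice_tag, ocr_text, embedding) "
        "VALUES (?, ?, ?, ?)",
        (png_path, voice_tag, ocr_text, pack(embedding)),
    )
    db.commit()
    return cur.lastrowid


def search(db, query_vec, limit=5):
    """Brute-force dot-product search over every row.

    Assumes all vectors are unit-normalized, so the dot product equals
    cosine similarity. Returns (score, voice_tag) pairs, best first.
    """
    rows = db.execute("SELECT voice_tag, embedding FROM captures").fetchall()
    scored = [
        (sum(q * e for q, e in zip(query_vec, unpack(blob))), tag)
        for tag, blob in rows
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:limit]
```

A linear scan like this is fine for a few thousand captures; a vector extension such as sqlite-vec moves the same comparison into the database, which is the reason to reach for one as a library grows.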
FAQ

The things people ask before buying.

What languages does the voice transcription support?
English only at launch. Cairn ships with the English-only Whisper model (whisper-small.en) because it’s noticeably more accurate on English than the multilingual variants at the same size — and most of the builders we’re shipping for work primarily in English. Multilingual support is on the roadmap. If you need it now, the type-instead-of-talk fallback works in any language.
Which Macs does it run on?
Apple Silicon Macs (M1 or later) running macOS 14 Sonoma or newer. The on-device models lean on the Neural Engine, so Intel Macs aren’t supported.
What about my existing screenshots on disk?
Cairn doesn’t touch them. It only indexes captures you take through its shortcut — that’s the moment the voice tag happens, and the voice tag is the whole point. Bulk-importing old screenshots without that context would just give you the same unfindable pile you already have. (A deliberate import flow may land later; it’s not in the MVP.)
Does Cairn record passively or in the background?
No. Every capture is intentional, triggered by you pressing the shortcut. No scheduled captures, no always-on screen scraping, no “record everything” mode. The microphone is only active during the few seconds you’re tagging a capture, and the macOS microphone indicator in the menu bar confirms it.
Does it work offline?
Yes, once the first launch has finished. On first run Cairn downloads its two models (Whisper for voice, MiniLM for semantic search) from Hugging Face; after that it makes no network calls at all. Capture, transcribe, index, and search all run locally: on a plane, in a tunnel, anywhere.
What if I don’t want to talk?
Type instead. The same shortcut opens a tag field that takes voice or text — whichever fits the moment. Open-plan office, headphones in, kid asleep next door: just type. The downstream search treats both kinds of tags identically.
Refunds?
Apple handles refunds for everything sold on the Mac App Store — request one at reportaproblem.apple.com within the standard refund window. If something’s broken or you can’t get there through Apple, email [email protected] and we’ll sort it out.
The deal

One-time purchase. No subscription. No login.

$14.99 once
  • Buy it once. No recurring charge, no auto-renew, no seat counts.
  • Lifetime updates. Including future major versions.
  • Works offline forever. No cloud means no kill-switch.
Download on the Mac App Store

Requires macOS 14 Sonoma or later. Apple Silicon.