v0.9.3 — Better media search, scoped retries, paid-tier control

This release is about making search actually find what you mean and giving you control over the pipeline. If you have audio or video in your library — recipes, lectures, podcasts — search now hits the subject of each file, not just the words spoken in it. And if you’ve been on the receiving end of a bulk reprocess every time you clicked Retry, that’s gone.

Beta 4 — 2026-04-20

A pure UX polish pass. The app should feel dramatically more responsive — buttons that looked clickable but didn’t react now actually do, and hovering anywhere on a clickable element gives you visible feedback and a pointer cursor.

Clickable elements are actually clickable

Plain buttons across the app previously only registered clicks when the cursor was directly over the text or icon — the transparent padding and Spacer regions in pill-style buttons were dead zones. Now the entire visible frame of every button is hit-testable. Applies to 44 buttons across the app including chat history, font-size pills, sidebar links, Library actions, and Settings controls.

Hover feedback everywhere

Every clickable element in the app now shows a subtle highlight and the pointing-hand cursor on mouseover. No more hunting for what’s interactive. Applies to buttons, tabs, pickers, rows, and pills. The Settings and Privacy tabs have been replaced with a custom segmented picker that supports per-segment hover; the other pickers (billing interval, source depth, role, cost-calculator mix, onboarding privacy tier) were converted too.

Multi-select media type filters

Both Search and Ask AI now let you combine multiple media type filters. Picking “PDF” and “Slides” together shows results from both — previously selecting a second type cleared the first. Tapping the “All” chip resets the filter.

Under the hood: fileTypeFilter: FileType? replaced with Set<FileType> across the entire retrieval stack (SearchEngine, RAGEngine, QdrantClient). Qdrant filter builder emits a should OR clause for multi-type selections.
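
As a rough illustration of the should clause, here is a minimal Swift sketch. The FileType cases, the file_type payload key, and the makeFileTypeFilter name are assumptions made for this example, not the app's actual code. It also emits a should clause for a single selected type, which the real builder may handle differently.

```swift
import Foundation

// Hypothetical stand-in for the app's FileType; the real cases are not listed in these notes.
enum FileType: String, CaseIterable {
    case pdf, slides, audio, video
}

/// Builds a Qdrant-style filter dictionary for the selected media types.
/// Returns nil for an empty set, which this sketch treats as "All" (no filter).
/// `payloadKey` is an assumed payload field name.
func makeFileTypeFilter(for types: Set<FileType>,
                        payloadKey: String = "file_type") -> [String: Any]? {
    guard !types.isEmpty else { return nil }

    // Sort the raw values so the emitted JSON is deterministic.
    let conditions: [[String: Any]] = types
        .map(\.rawValue)
        .sorted()
        .map { (value: String) -> [String: Any] in
            ["key": payloadKey, "match": ["value": value]]
        }

    // "should" matches a point when at least one condition holds (logical OR).
    return ["should": conditions]
}
```

Selecting PDF and Slides together yields two conditions under should, so a chunk of either type matches.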

Beta 3 — 2026-04-20

Multi-provider support polish

OpenAI, Anthropic, Ollama, and Custom (OpenAI-compatible) providers all work properly now.

API key “Configured” badge updates immediately when you paste a key — previously it stayed stuck on “Not configured” until the next UI event, and the Test Connection button remained disabled.
New-generation OpenAI models supported: gpt-5-*, o1, o3, o4. These use a different request shape (max_completion_tokens instead of max_tokens, no non-default temperature); we now route correctly. o1-preview / o1-mini’s lack of support for the system role is also handled.
Default OpenAI model changed from gpt-4o (30K TPM — instantly rate-limits on a single RAG query) to gpt-5-mini (500K TPM — comfortable for real workloads).
Custom endpoint normalization: you can enter the base URL with or without the /v1 suffix (for example https://api.together.ai or http://localhost:11434/v1); the suffix is handled consistently.
Local-LAN Ollama — http://192.168.x.x:11434, .local, and .lan hostnames are now accepted as custom endpoints. (HTTPS still required for public hosts.)
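
To make the request-shape difference concrete, here is a small Swift sketch of the routing idea. The function names and the prefix-matching rule are assumptions for illustration; the app's actual model detection may differ, and the o1-preview / o1-mini system-role handling is not covered here.

```swift
import Foundation

/// Returns true for model families that, per the note above, expect
/// max_completion_tokens and reject a non-default temperature.
/// The prefix rule is an assumption made for this sketch.
func usesNewRequestShape(_ model: String) -> Bool {
    let families = ["gpt-5", "o1", "o3", "o4"]
    return families.contains { family in
        model == family || model.hasPrefix(family + "-")
    }
}

/// Builds a chat-completions request body as a plain dictionary.
func chatRequestBody(model: String,
                     messages: [[String: String]],
                     maxTokens: Int,
                     temperature: Double) -> [String: Any] {
    var body: [String: Any] = ["model": model, "messages": messages]
    if usesNewRequestShape(model) {
        // Newer models: different token-limit key, temperature left at its default.
        body["max_completion_tokens"] = maxTokens
    } else {
        body["max_tokens"] = maxTokens
        body["temperature"] = temperature
    }
    return body
}
```

With this rule, gpt-5-mini and o4-mini take the new shape, while gpt-4o and gpt-4o-mini keep max_tokens and temperature.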

Actionable error messages

LLM and embedding errors now name the correct provider, include the provider’s actual response message, and link to the right dashboard. A real example of what you’ll now see when OpenAI rate-limits a RAG query on gpt-4o:

“OpenAI rate limit hit. Check your per-model TPM/RPM at platform.openai.com/account/limits — the default gpt-4o caps at 30K TPM, which a single RAG query can exceed. Try gpt-5-mini or gpt-4o-mini for higher throughput. Details: Rate limit reached for gpt-4o in organization org-xxx on tokens per min (TPM): Limit 30000…”

Previously the same error said “upgrade your Gemini API key at aistudio.google.com” — regardless of which provider actually failed.

Cited Sources + scroll behavior

The right-pane Cited Sources panel no longer briefly shows the previous conversation’s sources when you switch conversations — the state clears synchronously and repopulates once the new messages load. The latest answer also scrolls into view on conversation switch and after a new answer completes.

Qdrant installer hardening

First-launch download is dramatically more resilient:

Retries transient network failures automatically (3 attempts, exponential backoff).
Verifies a SHA-256 checksum before extracting — truncated or corrupted downloads are caught before tar-extract, with a clean retry signal.
Falls back to our CDN mirror at download.goldenretriever.ai if GitHub Releases is unreachable (corporate proxies, GitHub outages).
Shows a recovery panel in Settings → Vector Database with a Retry button and copyable manual-install instructions if every automated path fails.
Install failures now produce specific error messages (captive portal, TLS inspection, DNS, timeout, host unreachable) instead of a one-size-fits-all “check your internet connection”.
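
The retry behavior in the first item can be sketched roughly as follows. Only the attempt count (3) and the use of exponential backoff come from these notes; the one-second base delay, the withRetry name, and the injectable wait closure are assumptions for illustration.

```swift
import Foundation

/// Runs `operation` up to `maxAttempts` times, doubling the wait after each failure.
/// `wait` is injectable so the delays can be observed without actually sleeping.
func withRetry<T>(
    maxAttempts: Int = 3,
    baseDelay: TimeInterval = 1.0,   // assumed value, not stated in the notes
    wait: (TimeInterval) -> Void = { Thread.sleep(forTimeInterval: $0) },
    operation: (Int) throws -> T
) throws -> T {
    precondition(maxAttempts >= 1, "need at least one attempt")
    var attempt = 1
    while true {
        do {
            return try operation(attempt)
        } catch {
            // Out of attempts: surface the last error to the caller.
            if attempt >= maxAttempts { throw error }
            // Delay grows as baseDelay * 2^(attempt - 1).
            wait(baseDelay * pow(2.0, Double(attempt - 1)))
            attempt += 1
        }
    }
}
```

In this sketch every thrown error is retried; a real installer would presumably retry only transient network failures and fail fast on the rest.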

Fixes

In-app Documentation button (Sidebar) and Help → Golden Retriever Help menu item now open the docs site instead of a GitHub 404.
Broken Qdrant privacy link in the in-app Legal Agreement fixed. Auth0 / Okta section removed entirely — it wasn’t a live integration and all the linked agreement pages were 404ing.
Embedding errors no longer leak raw API endpoint URLs or JSON responses to the UI. Invalid API key now reads “Your API key was rejected by the embedding service. Check it in Settings → AI Model.” with the provider’s actual message appended.

What’s new

Topic-aware media search

Audio and video chunks now include a topic list — the subjects discussed or shown — alongside the literal transcription and visual description. A carbonara recipe video is findable by searching “carbonara” even when the word never appears in the speech or on screen. The model is told to be conservative (“if you are uncertain about a specific name, omit it rather than guess”) so this doesn’t manufacture false topics.

You’ll see the new [Topics] section in the chunk preview on each file’s detail pane.

Per-file Retry stays per-file

Clicking Retry or Reprocess on a single file’s detail pane now scopes the pipeline to that file’s chunks only. Previously each per-file Retry triggered a full-library sweep — so if you had 49 files and clicked Retry on one, all 49 got revisited. Now it only touches what you asked it to.
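
A minimal Swift sketch of the scoping idea, assuming a hypothetical Chunk type and a chunksToProcess helper (neither is the app's real API): with no target the whole library is swept, and with a target only that file's unfinished chunks are selected.

```swift
import Foundation

// Hypothetical chunk model, for illustration only.
struct Chunk {
    enum Status { case pending, done, failed }
    let fileID: String
    var status: Status
}

/// Selects the chunks a pipeline run should revisit.
/// - targetFileID == nil: every unfinished chunk in the library.
/// - targetFileID set:    only unfinished chunks belonging to that file.
func chunksToProcess(in all: [Chunk], targetFileID: String? = nil) -> [Chunk] {
    all.filter { chunk in
        // Finished chunks are never revisited in this sketch.
        guard chunk.status != .done else { return false }
        if let target = targetFileID {
            return chunk.fileID == target
        }
        return true
    }
}
```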

There’s also a new Reprocess from scratch action under the ··· menu in the file detail summary. It clears every chunk for that file (not just failed ones) and re-runs the pipeline — useful when a prompt or model change makes the existing description stale.

Switch to Paid tier in Settings

Settings → Cloud → Plan now lets you toggle between Free and Paid tier without going back through the wizard. When you switch to Paid, the orange “Free tier — slowed” banner disappears, the inter-batch throttle drops to zero, and concurrency goes up. The throttle banner that appears on a real 429 from Google still works — this just stops false-positive slowdown when you’ve already enabled billing.

Library layout

Side panes default to a tighter width so the file table gets the room it deserves, and the right pane stops auto-expanding when you click a file. The file table columns are click-to-sort (Name / Type / Chunks / Status), and there’s a new Type filter menu in the header next to “Failures only” so you can isolate just the videos, just the PDFs, etc.

Column widths are drag-resizable. Pane widths do not persist across launches yet; that is planned for the next release.

RAG no longer cites filenames as evidence

Ask AI used to occasionally cite a file as containing information just because the filename was related to the question. The grounding prompt is now explicit: filenames are not evidence, and every cited fact must literally appear in the source body. The 2_channels.csv-pretending-to-have-carbonara-ingredients hallucination reported by beta testers is fixed.

Under the hood

New [Topics] section in MediaDescriptionService audio + video prompts; structured response parser handles all sections cleanly
Qdrant BM25 sparse text now picks the longer of mediaDescription / textPreview so [Topics] survives the upsert
FTS5 search index is self-healing: drifted entries are dropped before insert (previously INSERT OR IGNORE, which left stale rows alive after reprocessing), and a sweep runs on every app launch
runFullPipeline(targetFileID:) parameter threads through extract / describe / embed stages and the pendingChunks query
RAGService system prompt rewritten with explicit anti-hallucination rules around filename inference and citation grounding
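
For the BM25 item above, the "longer of the two" rule might look like this in Swift. The sparseText name, the optional parameters, and the tie-break toward mediaDescription are assumptions for illustration.

```swift
import Foundation

/// Picks the text to index for sparse (BM25) search: whichever of the two
/// fields is longer, so a description carrying a [Topics] section is not
/// replaced by a shorter preview. Ties go to mediaDescription (an assumption).
func sparseText(mediaDescription: String?, textPreview: String?) -> String {
    let description = mediaDescription ?? ""
    let preview = textPreview ?? ""
    return description.count >= preview.count ? description : preview
}
```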

Fixes

Folders sidebar empty after rebuild (#148) — watched folders now persist in SQLite, recover from existing chunks if the table is empty
Failed Chunks pane now groups by file with click-through to file detail (#147)
Audio race hang when Apple’s speech service dropped a callback (continuation double-resume crash)
Per-file Retry missing stuck-in-limbo pending chunks