Supported File Types

Golden Retriever indexes your local files and makes them searchable using AI. Here are all the supported formats.

| Type | Formats | How it’s indexed |
| --- | --- | --- |
| PDF | .pdf | Pages sent as images to Gemini (up to 6 pages per segment) |
| Documents | .docx, .doc, .txt, .md, .rtf | Text extracted and embedded |
| Presentations | .pptx, .ppt | Text extracted and embedded |
| Spreadsheets | .xlsx, .xls, .csv | Text extracted and embedded |
| Images | .jpg, .png, .gif, .webp, .heic | Sent directly to Gemini as images |
| Video | .mp4, .mov, .avi, .mkv | Split into segments, each sent as video to Gemini |
| Audio | .mp3, .wav, .m4a, .aac, .flac | Split into segments, each sent as audio to Gemini |

All files are processed by Google’s Gemini Embedding 2 model. Text-based files (documents, presentations, spreadsheets) have their text extracted first, and that text is then embedded. Media files (images, video, audio) are embedded natively: Gemini processes them directly, without conversion to text. PDFs are sent as page images rather than as extracted text.
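As a rough illustration of the routing described above, the sketch below maps a file extension to its type and indexing mode. It is not Golden Retriever's actual code: the `SUPPORTED` table, the `classify` function, and the mode labels (`page_images`, `text`, `native_image`, and so on) are hypothetical names, and case-insensitive matching of extensions is an assumption. Only the extension lists come from the table of supported formats.

```python
from pathlib import Path

# Extension groups copied from the supported-formats table.
# The mode labels are hypothetical names used only for this illustration.
SUPPORTED = {
    "PDF": ({".pdf"}, "page_images"),
    "Documents": ({".docx", ".doc", ".txt", ".md", ".rtf"}, "text"),
    "Presentations": ({".pptx", ".ppt"}, "text"),
    "Spreadsheets": ({".xlsx", ".xls", ".csv"}, "text"),
    "Images": ({".jpg", ".png", ".gif", ".webp", ".heic"}, "native_image"),
    "Video": ({".mp4", ".mov", ".avi", ".mkv"}, "native_video_segments"),
    "Audio": ({".mp3", ".wav", ".m4a", ".aac", ".flac"}, "native_audio_segments"),
}


def classify(path):
    """Return (file_type, mode) for a supported file, or None otherwise."""
    # Assumption: extensions are matched case-insensitively.
    ext = Path(path).suffix.lower()
    for file_type, (extensions, mode) in SUPPORTED.items():
        if ext in extensions:
            return file_type, mode
    return None
```

For example, `classify("notes.md")` returns `("Documents", "text")`, while a file with an unlisted extension such as `archive.zip` returns `None`.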

  • Video/audio segments: Video and audio files are split into segments for processing. Very long files may take several minutes to index.
  • PDF pages: PDFs are processed in groups of up to 6 pages at a time.
  • File size: Files larger than 100 MB are supported but may take longer to process.
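To make the 6-page grouping concrete, here is a minimal sketch of how a PDF's pages could be divided into segments. The function name `page_groups` and its parameters are hypothetical, and it assumes pages are grouped consecutively starting from page 1; the source only states that groups contain up to 6 pages.

```python
def page_groups(page_count, group_size=6):
    """Split pages 1..page_count into consecutive (first, last) ranges.

    Assumption: pages are grouped in order, and only the final group
    may hold fewer than group_size pages.
    """
    if page_count < 0:
        raise ValueError("page_count must not be negative")
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    return [
        (first, min(first + group_size - 1, page_count))
        for first in range(1, page_count + 1, group_size)
    ]
```

Under these assumptions, a 14-page PDF yields three segments: pages 1 to 6, 7 to 12, and 13 to 14.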