Working with scans & images (OCR)

Workflows · 4 min read · Updated Jun 13, 2026

A huge share of real-world documents — scanned contracts, photographed receipts, faxed filings, archived patient charts — are images with no extractable text. OCR (optical character recognition) reads the text off those pages so the rest of Tholos AI can work with them. It runs entirely on your machine, like everything else.

The key thing to know: OCR isn’t a workflow you launch. There’s no “OCR mode” to switch on. It’s a transparent pre-processing step — you drop in a scan, Tholos AI notices the page has no text, and OCR runs automatically before the workflow you actually asked for.

When it kicks in

OCR runs automatically whenever you bring an image-based file into a file-based workflow:

A scanned or image-based PDF, or an image file (PNG, JPG, and similar), dropped into Document Q&A, Summarization, PII Redaction, Entity Extraction, or Contract Review.
Scanned files added to the Knowledge Base — each is OCR’d before it’s indexed, so the archive becomes searchable.

Pages that already contain selectable text skip OCR — it only runs where it’s actually needed.
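
The per-page rule above can be pictured with a short Python sketch. It is illustrative only and is not Tholos AI's actual implementation: the `Page` type, the `needs_ocr` and `pages_to_ocr` functions, and the `min_chars` threshold are hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class Page:
    number: int
    text_layer: str  # selectable text already embedded in the page, "" if none


def needs_ocr(page: Page, min_chars: int = 1) -> bool:
    """Return True when the page has no usable selectable text (assumed rule)."""
    # A text layer that is only whitespace counts as empty.
    return len(page.text_layer.strip()) < min_chars


def pages_to_ocr(pages: list[Page]) -> list[int]:
    """List the page numbers that would be sent through OCR."""
    return [p.number for p in pages if needs_ocr(p)]


pages = [
    Page(1, "Master Services Agreement"),  # digital page: OCR skipped
    Page(2, ""),                           # scanned page: OCR runs
    Page(3, "   \n"),                      # whitespace only: OCR runs
]
print(pages_to_ocr(pages))  # [2, 3]
```

In a mixed PDF like this one, only pages 2 and 3 would go through OCR, and page 1 would be used as-is.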

What you’ll see

When OCR is needed, the file card shows an “Extracting text via OCR” indicator with page-by-page progress. Once it finishes, the workflow proceeds normally and OCR disappears from view — you simply get your answer, summary, redaction preview, or table, built from the recognized text.

What you need

OCR uses a dedicated on-device model. If you haven’t installed one, you’ll be prompted to do so, or you can add it at any time from the Models view. The model covers English, Chinese (Simplified and Traditional), Japanese, Korean, and the major European languages. If you work with a specific language, install its OCR language data from the Models view so that missing language data doesn’t limit accuracy. Language is auto-detected by default.

Getting the best results

Resolution matters most. Scans at 300 DPI or higher produce dramatically better results than low-resolution copies.
For phone-camera photos, shoot flat, well-lit, and in focus. Glare, fax artefacts, and unusual fonts all degrade accuracy.
For a known mixed-language scan, make sure the relevant OCR language data is installed so both languages come through.
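
To check whether an image meets the 300 DPI guideline, divide its pixel width by the physical width of the original page in inches. The Python sketch below shows the arithmetic. The function names are hypothetical, and Tholos AI is not assumed to expose anything like them.

```python
def effective_dpi(pixel_width: int, physical_width_inches: float) -> float:
    """Pixels per inch across the page width."""
    if physical_width_inches <= 0:
        raise ValueError("physical width must be positive")
    return pixel_width / physical_width_inches


def meets_guideline(pixel_width: int, physical_width_inches: float,
                    target_dpi: float = 300.0) -> bool:
    """Compare the effective DPI with the 300 DPI guideline from this article."""
    return effective_dpi(pixel_width, physical_width_inches) >= target_dpi


# A US Letter page is 8.5 inches wide.
print(effective_dpi(2550, 8.5))    # 300.0
print(meets_guideline(1275, 8.5))  # False (only 150 DPI)
```

A Letter-size scan therefore needs to be at least 2550 pixels wide to reach 300 DPI. A 1275-pixel-wide copy of the same page comes out at 150 DPI and is worth rescanning if possible.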

Checking the result

Spot-check a few pages: open the source scan beside the workflow’s output and confirm key fields — names, dates, amounts — match.
Be more thorough with low-quality scans (photos, faxes, unusual fonts), where recognition is most likely to slip.
For multilingual scans, confirm both languages are present — the model leans toward the dominant one by default.

Because OCR is built into every file-based workflow, the most reliable way to “use” it is simply to drop your scan into whichever workflow you need — ask questions about it, summarize it, redact it, or extract a table — and let the text extraction happen on its own.
