Why can't I select the text in my scanned PDF?

Because a scanned PDF is an image — a photograph of the page — not text. There's no text layer to select, search or copy. OCR (optical character recognition) reads the shapes of the letters in the image and reconstructs them as real, selectable text.

Is online OCR safe for confidential scans?

Only if the scan isn't uploaded. Most online OCR services send your document to their servers. PDFAgent runs OCR in your browser with WebAssembly, so the scanned pages never leave your device — only the recognition engine is downloaded to you.

How accurate is OCR on a scanned PDF?

On a clean 300-dpi scan of printed text, modern OCR reads 95–99% of characters correctly. Accuracy drops with low resolution, skewed pages, handwriting or unusual fonts. Choosing the correct document language improves results noticeably.
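To make the 95–99% figure concrete, here is a minimal sketch of how character accuracy can be measured against a hand-corrected reference. The function names are illustrative, and edit distance is one common way to count errors, not necessarily how any particular OCR engine reports its accuracy.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: fewest single-character inserts, deletes or substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete a character from a
                            curr[j - 1] + 1,      # insert a character from b
                            prev[j - 1] + cost))  # substitute, or keep a match
        prev = curr
    return prev[-1]


def character_accuracy(truth: str, ocr_text: str) -> float:
    """Share of reference characters the OCR got right (1.0 means perfect)."""
    if not truth:
        return 1.0 if not ocr_text else 0.0
    errors = edit_distance(truth, ocr_text)
    return max(0.0, 1.0 - errors / len(truth))


def expected_errors(char_count: int, accuracy: float) -> int:
    """Roughly how many characters need fixing at a given accuracy."""
    return round(char_count * (1.0 - accuracy))
```

On a 2,000-character page, 95% accuracy leaves about 100 characters to fix and 99% leaves about 20, which is why scan quality matters so much on long documents.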

How to Convert a Scanned PDF to Editable Text (OCR, Free)

A scanned PDF is just a photo of a page — you can't select or search the text. Here's how OCR turns it into real, editable text for free, right in your browser.

You scan a document, open the PDF, try to copy a sentence — and nothing selects. That’s because a scanned PDF isn’t text at all: it’s a picture of a page. To turn it into words you can edit, search and copy, you need OCR — and you can do it free, in your browser, without uploading the scan.

What OCR actually does

OCR (Optical Character Recognition) looks at the image of your page, recognizes the shape of each letter and number, and rebuilds them as real text. It’s the difference between a photo of a receipt and a receipt you can copy-paste numbers from.

This is why so many “PDF to Word” or “PDF to Excel” conversions come back empty: the source was a scan, and without OCR there was no text to extract.
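One way to tell in advance whether a PDF is a scan is to check how much text an ordinary extraction returns. The sketch below is an illustration, not part of the OCR PDF tool: `has_text_layer`, `extract_page_texts` and the 20-character threshold are hypothetical choices, and the extraction helper assumes the third-party pypdf package is installed.

```python
def has_text_layer(page_texts: list[str], min_chars_per_page: int = 20) -> bool:
    """True if the pages average at least min_chars_per_page non-whitespace characters.

    The threshold is an arbitrary illustrative value, not a standard.
    """
    if not page_texts:
        return False
    # Count only visible characters; a scan typically yields empty or whitespace-only text.
    total = sum(len("".join(text.split())) for text in page_texts)
    return total / len(page_texts) >= min_chars_per_page


def extract_page_texts(pdf_path: str) -> list[str]:
    """Read whatever text layer each page has (assumes pypdf is installed)."""
    from pypdf import PdfReader  # imported lazily so the check above works without pypdf

    reader = PdfReader(pdf_path)
    return [page.extract_text() or "" for page in reader.pages]
```

If `has_text_layer(extract_page_texts("scan.pdf"))` comes back false, the file is most likely a scan and needs OCR before any conversion will find text.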

How to OCR a scanned PDF

Open the OCR PDF tool.
Add your scanned PDF.
Choose the document’s language (English, Spanish, French, German and 8 more). Of the settings you control in the tool, this one has the biggest effect on accuracy.
Click Run OCR and download the recognized text.

The first run downloads the recognition engine (about 4 MB, then cached), and the work happens on your own CPU. A few pages take seconds; longer documents take correspondingly longer.
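For readers who would rather script the same steps, here is a rough local equivalent in Python. It is a sketch under stated assumptions, not how the browser tool is built: it assumes Tesseract, pytesseract and Pillow are installed, that the page has already been exported as an image, and the helper names are hypothetical. The language codes are the ones Tesseract uses for the four languages named above.

```python
# Tesseract language codes for the four languages named in the steps above.
TESSERACT_LANG_CODES = {
    "english": "eng",
    "spanish": "spa",
    "french": "fra",
    "german": "deu",
}


def tesseract_lang(language: str) -> str:
    """Map a language name to its Tesseract code, failing loudly on unknown names."""
    key = language.strip().lower()
    if key not in TESSERACT_LANG_CODES:
        raise ValueError(f"No Tesseract code configured for {language!r}")
    return TESSERACT_LANG_CODES[key]


def ocr_page_image(image_path: str, language: str = "English") -> str:
    """Recognize one page image locally (assumes Tesseract, pytesseract and Pillow)."""
    from PIL import Image  # lazy imports: the mapping above works without them
    import pytesseract

    with Image.open(image_path) as img:
        return pytesseract.image_to_string(img, lang=tesseract_lang(language))
```

Calling `ocr_page_image("page-1.png", "Spanish")` would run recognition with the Spanish model, mirroring the language choice in the steps above.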

Get the best accuracy

OCR quality depends mostly on the input. To get clean results:

Scan at 300 dpi. Below ~200 dpi, letters blur together and accuracy drops.
Keep pages straight. Skewed scans confuse the engine — straighten before OCR if you can.
Pick the right language. A Spanish document read as English will mangle accents and common words.
Printed text beats handwriting. OCR is excellent on printed fonts; handwriting is hit-or-miss.
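If you only have the scanned image and not the scanner settings, the resolution can be estimated from the pixel dimensions and the paper size. The sketch below assumes a full-page scan of A4 or US Letter paper; the function names and the "good / usable / too low" labels are illustrative, with the 300 and 200 dpi thresholds taken from the tips above.

```python
# Paper sizes in inches as (short side, long side).
PAGE_SIZES_IN = {"a4": (8.27, 11.69), "letter": (8.5, 11.0)}


def estimate_dpi(width_px: int, height_px: int, page: str = "a4") -> int:
    """Estimate scan resolution, assuming the image covers the whole page."""
    short_in, long_in = PAGE_SIZES_IN[page]
    # Compare short side to short side so landscape scans work too.
    short_px, long_px = sorted((width_px, height_px))
    return round(min(short_px / short_in, long_px / long_in))


def scan_quality(dpi: int) -> str:
    """Label a resolution using the 300 dpi target and the ~200 dpi floor."""
    if dpi >= 300:
        return "good"
    if dpi >= 200:
        return "usable"
    return "too low"
```

For example, an A4 page scanned at 2480 × 3508 pixels works out to about 300 dpi, while 1240 × 1754 pixels is about 150 dpi and worth rescanning.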

After OCR: get it into Word or Excel

Once you have the text, the next step depends on what you need:

Just the words? The OCR tool gives you a clean .txt you can paste anywhere.
A document to edit? Run the result through PDF to Word for an editable .docx.
A table of numbers (like a scanned invoice or statement)? Use PDF to Excel to pull it into a spreadsheet.
Already a text-based PDF (not a scan)? You don’t need OCR at all — PDF to Text extracts it directly.

The privacy difference

Scanned documents are often the most sensitive things people digitize: IDs, contracts, bank letters, medical records. Most online OCR sites upload those scans to their servers to process them — exactly the wrong thing to do with a passport scan.

The OCR PDF tool is different by design: recognition runs in your browser via WebAssembly. The pages are never uploaded; only the open-source Tesseract engine is downloaded to your machine. Your scan, your device, your text.