PDF to Text

Pull a PDF's text layer into a plain .txt file, in batches if you want. Text-based PDFs only, no OCR on scans.

Everything happens on your device — your files are never uploaded.

What this does

This reads the text layer that's embedded in a PDF and writes it out as a plain .txt file. It walks each page, pulls the character data that the PDF already stores, and joins it in reading order. The PDF itself is untouched; you get a separate text file alongside it.

Only real text comes through. Fonts, colors, images, and exact positioning are dropped, so a styled document becomes unstyled text. Complex multi-column layouts and tables may not keep their original structure, since the order depends on how the text is stored in the file.
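
To illustrate why multi-column pages can come out interleaved, here is a minimal sketch of one common way to join positioned text into reading order. It is not taken from this tool or from LiteParse; the `TextRun` shape, the `joinInReadingOrder` name, and the `lineTolerance` default are all assumptions made for the example.

```typescript
// Hypothetical shape of one positioned piece of text on a page.
// Real extractors expose something similar, but these field names are assumptions.
interface TextRun {
  text: string;
  x: number; // left edge, in page units
  y: number; // distance from the top of the page, in page units
}

// Group runs into lines by vertical position, then read each line left to right.
function joinInReadingOrder(runs: TextRun[], lineTolerance = 2): string {
  // Top to bottom first, left to right as a tie-breaker.
  const sorted = [...runs].sort((a, b) => a.y - b.y || a.x - b.x);

  const lines: { y: number; runs: TextRun[] }[] = [];
  for (const run of sorted) {
    const last = lines.length > 0 ? lines[lines.length - 1] : undefined;
    if (last !== undefined && Math.abs(last.y - run.y) <= lineTolerance) {
      last.runs.push(run); // close enough vertically: same line
    } else {
      lines.push({ y: run.y, runs: [run] }); // start a new line
    }
  }

  return lines
    .map((line) =>
      [...line.runs]
        .sort((a, b) => a.x - b.x)
        .map((r) => r.text)
        .join(" "),
    )
    .join("\n");
}

// A two-column page: the left column reads "Alpha, Beta", the right "Gamma, Delta".
const twoColumns: TextRun[] = [
  { text: "Alpha", x: 0, y: 0 },
  { text: "Beta", x: 0, y: 10 },
  { text: "Gamma", x: 100, y: 0 },
  { text: "Delta", x: 100, y: 10 },
];

console.log(joinInReadingOrder(twoColumns));
// Alpha Gamma
// Beta Delta
```

A purely positional pass like this reads straight across both columns, which is the kind of structure loss described above for columns and tables.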

How it works

1. Drop your PDF files onto the page, or pick them from your device.
2. Each PDF's text is pulled out in your browser. Nothing is uploaded.
3. Copy the text, or download each result (or all of them as a zip).

Built with open source

LiteParse — Fast WebAssembly PDF text extraction. · Apache-2.0

Frequently asked

Does my PDF get uploaded anywhere?

No. Extraction runs entirely in your browser via WebAssembly. Your files never leave your device, and there's no sign-up.

Can it read scanned PDFs?

It reads only the real text layer of a PDF. Scanned or photographed pages are just images with no text layer, so they come back nearly empty. Recovering their text would need OCR, which this tool doesn't do.
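
If you post-process the output yourself, a rough heuristic can flag results that probably came from a scan. The sketch below is an illustration, not part of the tool: the `looksScanned` name and the threshold of 20 characters per page are assumptions you would tune for your own documents.

```typescript
// Flags a document as probably scanned when its pages carry almost no extracted text.
// pageTexts holds one extracted string per page; minCharsPerPage is an assumed cutoff.
function looksScanned(pageTexts: string[], minCharsPerPage = 20): boolean {
  if (pageTexts.length === 0) return false; // nothing to judge

  // Count non-whitespace characters only, so blank lines don't look like content.
  const totalChars = pageTexts.reduce(
    (sum, text) => sum + text.replace(/\s+/g, "").length,
    0,
  );
  return totalChars / pageTexts.length < minCharsPerPage;
}

console.log(looksScanned(["", " ", "\n"])); // true
console.log(looksScanned(["This page has a real text layer with plenty of characters."])); // false
```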

What about layout and tables?

You get the readable text in reading order. Tricky multi-column layouts and tables may not keep their exact structure.

Can I convert several PDFs at once?

Yes. Drop multiple files and each one is extracted into its own .txt. A single file shows a preview you can copy; a batch is packaged as a zip you download in one click.
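
As a sketch of the one-file-in, one-file-out mapping, the function below derives a .txt name for each PDF in a batch. How this tool actually names files, and what it does when two PDFs share a name, isn't stated here; the `txtNamesFor` helper and its " (1)" suffix scheme are assumptions for illustration only.

```typescript
// Maps each PDF file name to a .txt name, adding a counter when names collide.
// The collision scheme is an assumption, not this tool's documented behavior.
function txtNamesFor(pdfNames: string[]): string[] {
  const used = new Map<string, number>();
  return pdfNames.map((name) => {
    const base = name.replace(/\.pdf$/i, ""); // strip a trailing .pdf, any case
    const key = base.toLowerCase(); // treat names as case-insensitive
    const count = used.get(key) ?? 0;
    used.set(key, count + 1);
    return count === 0 ? `${base}.txt` : `${base} (${count}).txt`;
  });
}

console.log(txtNamesFor(["report.pdf", "Report.PDF", "notes.pdf"]));
// [ 'report.txt', 'Report (1).txt', 'notes.txt' ]
```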

Is there a file size or count limit?

There's no fixed limit. Everything runs on your device, so the practical ceiling is your browser's memory. Very large or very many PDFs at once use more RAM and take longer.

What happens to images, fonts, and formatting?

They're not included. The output is plain text with no fonts, colors, or images, and no styling or page layout is preserved.

Does it keep the PDF's metadata?

No. The result is just the page text. Document properties like title, author, and creation date aren't carried into the .txt file.

Which browsers does it work in?

Any modern browser that supports WebAssembly, on desktop or mobile. Nothing to install; the page loads the engine when you add a file.
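
To check whether a given environment has WebAssembly at all, a small feature test like the one below can help. It is a generic sketch, not the check this page performs; the `supportsWebAssembly` name is made up for the example, and it relies only on the standard `WebAssembly.validate` function.

```typescript
// Returns true when the runtime exposes WebAssembly and accepts a minimal module.
function supportsWebAssembly(): boolean {
  // Read through globalThis so the check also works where the global is missing.
  const wasm = (globalThis as any).WebAssembly;
  if (typeof wasm !== "object" || wasm === null || typeof wasm.validate !== "function") {
    return false;
  }
  // The smallest valid module: the "\0asm" magic number followed by version 1.
  const emptyModule = new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]);
  return wasm.validate(emptyModule) === true;
}

console.log(supportsWebAssembly() ? "WebAssembly is available" : "WebAssembly is not available");
```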
