
Extract images from a PDF — page rasterize vs embedded image extraction

Tool author & maintainer · Published Apr 26, 2026 · 12 min read

‘Extract images from PDF’ means two very different things. Mode A — rasterize each page — flattens the entire page (text + photos + diagrams) into one image per page; this is what most users want. Mode B — extract embedded images — pulls out the original photo bytes the PDF author inserted, untouched. Picking the wrong mode wastes time. This guide tells you which mode you want and the in-browser workflow for each.

What does 'extract images from PDF' actually mean?

Two completely different operations share the name. Page rasterize: every page of the PDF becomes one flat image — text, photos, diagrams all baked together. This is what you want for slide decks, scanned documents, or to feed pages into an OCR pipeline. Embedded image extract: PDF.js walks the PDF object graph and pulls out every image stream the author inserted, exactly as they were embedded. This is what you want when a designer placed a hero photo in a brochure and you want the original photo, not a screenshot of the page.
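As a rough illustration of the embedded-image walk, the sketch below scans the operator list that PDF.js returns from `page.getOperatorList()` and collects the object names used by image-painting operators. The helper name `imageObjectNames` is hypothetical, the operator code is passed in rather than hard-coded (in PDF.js it would be `pdfjsLib.OPS.paintImageXObject`), and inline images are not covered. How the tool then retrieves and saves each image is not described here, so that step is left out.

```typescript
// Minimal structural type for what page.getOperatorList() resolves to in PDF.js.
interface OperatorList {
  fnArray: number[];    // one operator code per drawing operation
  argsArray: unknown[]; // arguments for the operation at the same index
}

// Hypothetical helper: returns each image object name once, in first-use order.
// Assumption: for image-painting operators the first argument is the object name.
function imageObjectNames(opList: OperatorList, paintImageOp: number): string[] {
  const names: string[] = [];
  opList.fnArray.forEach((fn, i) => {
    if (fn !== paintImageOp) return;
    const args = opList.argsArray[i];
    if (Array.isArray(args) && typeof args[0] === "string") {
      const name: string = args[0];
      // The same image can be painted several times; keep it once.
      if (names.indexOf(name) === -1) names.push(name);
    }
  });
  return names;
}

// Usage inside a browser page that has loaded PDF.js (not executed here):
//   const opList = await page.getOperatorList();
//   const names = imageObjectNames(opList, pdfjsLib.OPS.paintImageXObject);
```

Deduplicating by name matters because a logo repeated on every page would otherwise be counted once per use.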

When should I rasterize each page?

Five common cases. (1) Repurpose a slide-deck PDF as Instagram carousel images. (2) Convert a scanned document for OCR ingestion. (3) Extract lecture notes for review on a phone. (4) Turn a contract into preview thumbnails. (5) Archive a one-page poster as a high-resolution JPG. In all five cases the entire page (text + visuals together) is what you want; embedded extraction would give you only the photos, not the layout.

When should I extract embedded images?

Three common cases. (1) The PDF author placed a high-resolution hero photo in a brochure and you want that photo at its original resolution, not a downscaled raster of the whole page. (2) Recover original-quality images from a vendor's product PDF for re-use. (3) Audit which images a PDF contains for a copyright check. The output here is N images per page, not 1; a 30-page brochure might yield 80 individual images.

What DPI should I pick?

DPI (dots per inch) only matters in rasterize mode. 72 DPI matches the default on-screen rendering in most viewers and gives the smallest files. 150 DPI is the sweet spot for screen viewing on retina displays. 300 DPI is print quality. Above 300 DPI the file size keeps growing, but the difference is invisible unless you examine a small print under a magnifying glass. For OCR ingestion, 200–300 DPI is standard; below 150 DPI, OCR accuracy drops sharply.
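To make the DPI numbers concrete, here is a minimal sketch of the arithmetic. PDF page sizes are expressed in points, with 72 points per inch by default, so a DPI setting maps to a render scale of DPI / 72 (the value one would pass as `scale` to PDF.js's `page.getViewport()`). The helper name `rasterSizeForDpi` and the rounding choice are assumptions for illustration.

```typescript
// PDF user space defaults to 72 units (points) per inch, so a render
// scale of 1.0 corresponds to 72 DPI.
const PDF_POINTS_PER_INCH = 72;

interface RasterSize {
  scale: number;     // render scale relative to 72 DPI
  width: number;     // output width in pixels (rounded)
  height: number;    // output height in pixels (rounded)
  rgbaBytes: number; // uncompressed bitmap size at 4 bytes per pixel
}

// Hypothetical helper: page size in points plus a target DPI in,
// render scale and pixel dimensions out.
function rasterSizeForDpi(widthPt: number, heightPt: number, dpi: number): RasterSize {
  if (!(dpi > 0)) throw new RangeError("dpi must be positive");
  const scale = dpi / PDF_POINTS_PER_INCH;
  const width = Math.round(widthPt * scale);
  const height = Math.round(heightPt * scale);
  return { scale, width, height, rgbaBytes: width * height * 4 };
}

// A4 is roughly 595 x 842 points.
const a4At150 = rasterSizeForDpi(595, 842, 150); // 1240 x 1754 px
const a4At300 = rasterSizeForDpi(595, 842, 300); // 2479 x 3508 px
```

Doubling the DPI quadruples the pixel count, which is why both render time and output size climb quickly between 150 and 300 DPI.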

PNG vs JPG output?

Use PNG when the page is text-heavy or contains diagrams with fine lines: it preserves edge clarity at the cost of file size. Use JPG at quality 90 when the page is photo-heavy: it cuts file size by 5–10× compared with PNG without a visible difference at typical viewing sizes. For mixed pages (slides with both text and photos), JPG quality 92 is usually the right balance.
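The rule of thumb above can be written down as a small lookup. This is only a sketch: the `PageKind` categories and the `outputFormatFor` helper are hypothetical, and the quality values are expressed on the 0 to 1 scale that the browser's `canvas.toBlob(callback, type, quality)` expects, so "quality 90" becomes 0.9.

```typescript
type PageKind = "text" | "photo" | "mixed";

interface OutputFormat {
  mime: "image/png" | "image/jpeg";
  extension: "png" | "jpg";
  quality?: number; // 0..1 for JPEG; omitted for PNG, which is lossless
}

// Hypothetical mapping of the guidance above onto encoder settings.
function outputFormatFor(kind: PageKind): OutputFormat {
  switch (kind) {
    case "text":
      return { mime: "image/png", extension: "png" };
    case "photo":
      return { mime: "image/jpeg", extension: "jpg", quality: 0.9 };
    case "mixed":
      return { mime: "image/jpeg", extension: "jpg", quality: 0.92 };
  }
}

// Usage in a browser (not executed here):
//   const fmt = outputFormatFor("mixed");
//   canvas.toBlob((blob) => { /* save blob */ }, fmt.mime, fmt.quality);
```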

What about encrypted or scan-only PDFs?

Password-protected PDFs must be decrypted with the password before rasterization; the tool prompts for it inline. Scan-only PDFs (every page is just an image) work fine: extraction returns the original scan, while rasterization re-renders it at the chosen DPI. Form-fillable PDFs work too; rasterization captures both the underlying form and any filled values.
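One way a page could decide when to show the password prompt is to inspect the error that PDF.js raises when opening an encrypted file. The sketch below assumes the rejection carries `name === "PasswordException"` and a numeric `code` matching PDF.js's `PasswordResponses` values (1 for a missing password, 2 for an incorrect one). The helper `classifyOpenError` is hypothetical and is not taken from the tool itself.

```typescript
type OpenFailure = "need-password" | "wrong-password" | "other";

// Assumed to mirror PDF.js PasswordResponses.
const NEED_PASSWORD = 1;
const INCORRECT_PASSWORD = 2;

// Hypothetical helper: turns a rejected open attempt into a UI decision.
function classifyOpenError(err: unknown): OpenFailure {
  if (typeof err !== "object" || err === null) return "other";
  const e = err as { name?: unknown; code?: unknown };
  if (e.name !== "PasswordException") return "other";
  if (e.code === NEED_PASSWORD) return "need-password";
  if (e.code === INCORRECT_PASSWORD) return "wrong-password";
  return "other";
}

// Usage in a browser (not executed here):
//   try {
//     const pdf = await pdfjsLib.getDocument({ data, password }).promise;
//   } catch (err) {
//     if (classifyOpenError(err) !== "other") { /* show the inline prompt */ }
//   }
```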

Steps

About 1 min
  1. Drop the PDF

    Drag a single PDF (up to 200 MB) onto the tool. Encrypted PDFs prompt for the password.

  2. Pick mode and settings

    Mode: 'rasterize each page' (default) or 'extract embedded images'. DPI 150 default; format JPG quality 90 default.

  3. Process

    Each page renders sequentially via PDF.js. The progress bar fills page by page; cancel and restart at any time.

  4. Download as ZIP

    All output images are bundled into a single ZIP with sequential filenames matching the page numbers.
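The sequential filenames in step 4 could be generated along these lines. The `page-` prefix and the zero-padding rule are assumptions made for illustration; the exact naming scheme the tool uses is not specified here. Padding to the width of the last page number keeps files sorted correctly in any file manager.

```typescript
// Hypothetical naming helper: page 5 of 100 becomes "page-005.jpg".
function pageFileName(pageNumber: number, totalPages: number, extension: string): string {
  // Pad to the width of the largest page number, with a minimum of two digits.
  const digits = Math.max(2, String(totalPages).length);
  let label = String(pageNumber);
  while (label.length < digits) label = "0" + label;
  return "page-" + label + "." + extension;
}

const firstName = pageFileName(1, 30, "jpg");   // "page-01.jpg"
const paddedName = pageFileName(5, 100, "png"); // "page-005.png"
```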

30-page PDF rasterized via PDF.js in the browser

Settings                   Time (M2)   Output ZIP size
72 DPI, JPG quality 90     5 s         4 MB
150 DPI, JPG quality 90    8 s         11 MB
150 DPI, PNG               12 s        32 MB
300 DPI, JPG quality 90    21 s        38 MB
Measured on 14" MacBook Pro M2, Chrome 139, sample 30-page mixed text/photo brochure PDF, PDF.js 4.6 (2026-04-26).
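For a rough sense of how these measurements scale to longer documents, the sketch below extrapolates linearly by page count. Linear scaling is an assumption: real timings depend on page content and hardware, and the `extrapolate` helper is hypothetical. The input figures are the 150 DPI JPG measurement reported above.

```typescript
interface Measurement {
  pages: number;
  seconds: number;
  zipMegabytes: number;
}

// Hypothetical helper: scales time and size in proportion to page count.
function extrapolate(measured: Measurement, targetPages: number): Measurement {
  if (!(measured.pages > 0)) throw new RangeError("measured.pages must be positive");
  const factor = targetPages / measured.pages;
  return {
    pages: targetPages,
    seconds: measured.seconds * factor,
    zipMegabytes: measured.zipMegabytes * factor,
  };
}

// 150 DPI, JPG quality 90: 30 pages took 8 s and produced an 11 MB ZIP.
const at150Jpg: Measurement = { pages: 30, seconds: 8, zipMegabytes: 11 };
const estimate = extrapolate(at150Jpg, 100);
// estimate.seconds is about 26.7; estimate.zipMegabytes is about 36.7
```

Treat the result as an order-of-magnitude guide only; a text-only PDF will compress far better than the photo-heavy sample used for the measurement.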

Frequently asked questions

  • Will the tool work on a 100-page PDF?

    Yes — PDF.js streams pages so memory stays bounded regardless of total length. A 100-page PDF at 150 DPI takes roughly 25–35 seconds on a modern laptop and produces a ~30 MB ZIP of JPGs.

  • Can I extract just one specific page?

    Yes — the tool offers a page range selector. Enter '5' for just page 5, or '5-10' for pages 5 through 10. The default is 'all pages'.

  • Does increasing DPI improve OCR accuracy?

    Up to a point. 200–300 DPI is the OCR sweet spot for most documents. Below 150 DPI, OCR accuracy drops noticeably because individual letterforms become ambiguous; above 300 DPI you get diminishing returns and longer OCR processing time.

  • Can I get original photos out of a PDF?

    Yes — switch to 'extract embedded images'. The tool walks every Image XObject in the PDF and saves them at their original resolution, exactly as the PDF author embedded them.

  • Are encrypted PDFs supported?

    Yes, if you have the password. The tool prompts for it inline before processing. Without the password, the PDF cannot be decrypted, so it cannot be parsed.

  • Is anything uploaded?

    No. PDF.js parses the PDF entirely in the browser; rendering and ZIP packaging also run client-side. The PDF never leaves your device.
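The page range syntax mentioned in the FAQ ('5' for a single page, '5-10' for a span, all pages by default) can be parsed with a few lines. This is a sketch under stated assumptions: the helper `parsePageRange` is hypothetical, an empty input is treated as 'all pages', and out-of-range input is rejected rather than clamped. How the tool itself handles these edge cases is not specified.

```typescript
// Hypothetical parser for inputs like "5" or "5-10"; returns 1-based page numbers.
function parsePageRange(input: string, totalPages: number): number[] {
  const text = input.trim();
  let first = 1;
  let last = totalPages; // empty input means every page
  if (text !== "") {
    const match = /^(\d+)(?:\s*-\s*(\d+))?$/.exec(text);
    if (match === null) throw new Error("Unrecognized page range: " + input);
    first = Number(match[1]);
    last = match[2] === undefined ? first : Number(match[2]);
  }
  if (first < 1 || last > totalPages || first > last) {
    throw new RangeError("Page range " + first + "-" + last + " is outside 1-" + totalPages);
  }
  const pages: number[] = [];
  for (let p = first; p <= last; p++) pages.push(p);
  return pages;
}

const single = parsePageRange("5", 30);  // [5]
const span = parsePageRange("5-10", 30); // [5, 6, 7, 8, 9, 10]
```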

