
You know when you have a PDF that is a scan of a document, and it's a really huge file because it just stores a picture of each scanned page?

And there are OCR tools that can help you make a proper document which just stores the text?

Well, I need the reverse of that! Let's say I have a perfect PDF document generated with pdflatex and I need to turn it into such a "huge" PDF, which looks exactly the same when printed on paper (at a certain dpi value), but is just a picture of the original.

My initial idea is to turn the PDF into a series of JPGs and then back into a PDF, but perhaps there is some canonical way to do that?


In case you wonder why I would want to do such a thing: I'm currently stuck with a network printer that is not maintained by me and that randomly drops characters in printed files! So until someone figures out what's wrong there, I want this as a workaround.

Celada
  • Basically you want to convert LaTeX or PDF to JPG/PNG files, and print those. (Just a simplification of what you wrote.) – Apache Apr 26 '15 at 14:14
  • But I want to still store a document with multiple pages in a single file, so that I can use the features of my pdf viewer, e.g. print 2 document pages on one paper page and so on. I imagine this to be cumbersome with loose PNG files. – Dimitri Schachmann Apr 26 '15 at 14:23
  • GhostScript can render PDFs into PNG. – Palec Apr 26 '15 at 18:01

3 Answers


You could test whether image-based PDFs are affected as well. First convert the PDF to a (multipage) TIFF, e.g. with Ghostscript:

gs -sDEVICE=tiffg4 -o sample.tif sample.pdf

Then convert the TIFF to PDF, e.g.:

tiff2pdf -z -f -F -pA4 -o sample-img.pdf sample.tif

This results in a PDF file where the pages are images instead of text.
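
Since the question asks for a certain dpi value, the two steps above could be wrapped up roughly as follows. This is only a sketch: `rasterize_pdf` is a made-up helper name, the 300 dpi default is an assumption, and `-r` is Ghostscript's resolution option. Note that the tiffg4 device produces 1-bit black-and-white pages, so a color document would need a different TIFF device.

```shell
# Sketch only: rasterize a PDF through a multipage TIFF at a chosen resolution.
# rasterize_pdf is a hypothetical helper; 300 dpi is an assumed default.
rasterize_pdf() {
    in=$1                   # source PDF
    out=$2                  # image-only PDF to create
    dpi=${3:-300}           # render resolution in dpi
    tmp="${out%.pdf}.tif"   # intermediate multipage TIFF, left in place

    # Render every page to a 1-bit (black-and-white) G4-compressed TIFF.
    gs -sDEVICE=tiffg4 -r"$dpi" -o "$tmp" "$in" &&
    # Wrap the TIFF pages into a PDF, with the same options as above.
    tiff2pdf -z -f -F -pA4 -o "$out" "$tmp"
}
```

It would be called as, for example, `rasterize_pdf sample.pdf sample-img.pdf 600`.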

Alternatively, if your system supports printing TIFF files, try printing the TIFF directly.

There is also the option of using pdf2ps to convert the PDF to PostScript, which, if it works, would likely be preferable.

Runium

I did it the way Dimitri described in the comments, using pdf2ps and ps2pdf.

First I converted my PDF to PostScript (.ps) with the command

pdf2ps my_file.pdf my_file.ps

and then converted it back to PDF with

ps2pdf my_file.ps my_file.pdf

This way I got a rasterized version of the original PDF, where the content is actually an image. Hope this helps.
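
If you want to check whether a converted file is really image-only, one rough test is to see whether it still lists any fonts. The sketch below assumes `pdffonts` from poppler-utils is installed and that its output starts with two header lines; `has_embedded_fonts` is a made-up helper name. A file that still lists fonts probably still contains text that the printer has to render.

```shell
# Sketch only: succeed if pdffonts lists at least one font for the given PDF.
# has_embedded_fonts is a hypothetical helper name.
has_embedded_fonts() {
    # Skip the two header lines, then look for any remaining non-empty line.
    pdffonts "$1" | tail -n +3 | grep -q .
}

# Example use (not executed here):
#   if has_embedded_fonts my_file.pdf; then
#       echo "still contains fonts"
#   else
#       echo "no fonts listed"
#   fi
```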

  • Great, just a reminder that `pdf2ps` and `ps2pdf` are part of Ghostscript, so you probably need to install Ghostscript if your system doesn't come with it. – xjlin0 Apr 06 '21 at 14:51

The accepted answer should cover most use cases. However, I found myself wanting to rasterize to a specific resolution. An answer to a similar question introduced me to the tool pdftoppm, which yielded the best-quality results.

A simple usage example would be

pdftoppm input.pdf output -tiff

which results in files named output-X.tif, where X corresponds to the page number of the PDF file.
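
To pick the resolution and end up with a single PDF again, something along these lines might work. It is a sketch under assumptions: `-r` is pdftoppm's resolution option in dpi, `tiffcp` and `tiff2pdf` come from libtiff, `rasterize_with_pdftoppm` is a made-up helper name, and the 300 dpi default is arbitrary.

```shell
# Sketch only: rasterize with pdftoppm at a given dpi, then rebuild one PDF.
# rasterize_with_pdftoppm is a hypothetical helper; 300 dpi is an assumed default.
rasterize_with_pdftoppm() {
    in=$1                       # source PDF
    out=$2                      # image-only PDF to create
    dpi=${3:-300}               # render resolution in dpi
    prefix="${out%.pdf}-page"   # per-page files: <prefix>-<page>.tif
    merged="${out%.pdf}.tif"    # multipage TIFF holding all pages

    # One TIFF per page at the requested resolution.
    pdftoppm -r "$dpi" -tiff "$in" "$prefix" &&
    # Concatenate the per-page TIFFs (glob order) into one multipage TIFF.
    tiffcp "$prefix"-*.tif "$merged" &&
    # Wrap the multipage TIFF into a PDF.
    tiff2pdf -o "$out" "$merged"
}
```

It would be called as, for example, `rasterize_with_pdftoppm input.pdf output.pdf 600`. The per-page and merged TIFF files are left behind, so you may want to delete them afterwards.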

AdminBee