Poppler has the excellent tool pdftotext for converting a PDF file to a text file:

pdftotext input.pdf output.txt

Is there a way to convert this text file back to PDF?

By conversion, I mean obtaining a PDF file whose page content is similar to that of the original PDF file.

If possible, it should have the same page numbering as the original, but this is not mandatory. A PDF without page numbering would also be fine.

An exact visual match is not important.

Some potential use-case scenarios:

  1. You accidentally deleted your PDF file but still have the text file produced by pdftotext.
  2. You would like to edit the text file in a text editor and produce an updated version of your PDF file.
  3. You want to produce a PDF file of smaller size.

2 Answers

There are a lot of options. In principle, any program that can read plain text and print can send its output to a virtual printer that produces a PDF.

But if I were doing it programmatically, I'd probably use pandoc:

pandoc filename.txt -o output.pdf

By default, pandoc uses pdflatex to create the PDF. If you don't want to install something as heavy as a TeX distribution, there are other backends you can use, such as weasyprint or wkhtmltopdf:

pandoc --pdf-engine weasyprint filename.txt -o output.pdf
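
If you are not sure which backend is installed on a given machine, a small helper can pick the first one it finds. This is only a sketch: the function name pick_pdf_engine and the order of preference are my own assumptions, not something pandoc provides.

```shell
# Hypothetical helper: print the name of the first PDF backend found on PATH.
# The preference order (pdflatex, weasyprint, wkhtmltopdf) is an assumption.
pick_pdf_engine() {
    for engine in pdflatex weasyprint wkhtmltopdf; do
        if command -v "$engine" >/dev/null 2>&1; then
            printf '%s\n' "$engine"
            return 0
        fi
    done
    # None of the candidate backends is installed.
    return 1
}
```

It could then be used as pandoc --pdf-engine "$(pick_pdf_engine)" filename.txt -o output.pdf, which fails early if no backend is available.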

But of course the result will never preserve the formatting, fonts, etc. of the original, as already pointed out.

frabjous

Similar to the program a2ps, I use a Bash function a2pdf:

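a2pdf () 
{ 
    # Convert the given file to PDF with LibreOffice Writer in headless mode;
    # the resulting PDF is written to the current directory.
    lowriter --headless --convert-to pdf "$1"
}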

Keep in mind that pdftotext discards PDF properties such as fonts, formatting and links.
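
One thing pdftotext does keep by default is the page boundaries: it writes a form feed character at the end of each page unless -nopgbrk is given. As a rough sketch, the two functions below count the pages in the text file and print a single page, which may help when trying to keep the original page numbering. The names count_pages and print_page are hypothetical and not part of Poppler.

```shell
# Hypothetical helper: count pages in pdftotext output by counting the
# form feed characters that mark the end of each page.
count_pages() {
    tr -cd '\f' < "$1" | wc -c | tr -d ' '
}

# Hypothetical helper: print page number $2 of text file $1,
# treating the form feed character as the record (page) separator.
print_page() {
    awk -v n="$2" 'BEGIN { RS = "\f" } NR == n { printf "%s", $0 }' "$1"
}
```

For example, count_pages output.txt prints the number of pages, and print_page output.txt 2 prints the text of the second page.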

Erich