I need to convert 1K PDF files to DOC on a Debian server. I can convert a PDF to Word using the LibreOffice command line:

libreoffice --headless --invisible --convert-to doc Sample-doc-file-100kb.pdf

Or using soffice:

soffice --nocrashreport --nologo --nolockcheck --nofirststartwizard --invisible --headless --convert-to doc Sample-doc-file-100kb.pdf

The main problem with the above two commands is that the resulting doc file doesn't include the images from the pages; it contains only the formatted text. Is there a better way to convert PDF to DOC that also includes the images present in the PDF? I am not interested in web services like zamzam; I need to do this from the command line on the server. Thank you.
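
For the batch part of the task (about a thousand files), the single-file command can be wrapped in a loop. The following is a minimal sketch, assuming `soffice` is on the PATH and all PDFs sit in one directory. The function names `build_command` and `convert_all` are hypothetical, as are any directory names passed to them. It does not address the missing images; it only repeats the command above for every file.

```python
import subprocess
from pathlib import Path


def build_command(pdf_path, out_dir):
    """Return the soffice argument list for converting one PDF to .doc."""
    return [
        "soffice",
        "--headless",
        "--convert-to", "doc",
        "--outdir", str(out_dir),
        str(pdf_path),
    ]


def convert_all(in_dir, out_dir, dry_run=False):
    """Convert every *.pdf in in_dir; return the commands that were built."""
    out_dir = Path(out_dir)
    commands = []
    for pdf_path in sorted(Path(in_dir).glob("*.pdf")):
        cmd = build_command(pdf_path, out_dir)
        commands.append(cmd)
        if not dry_run:
            out_dir.mkdir(parents=True, exist_ok=True)
            # check=False: one failing file should not abort the whole batch
            subprocess.run(cmd, check=False)
    return commands
```

Calling `convert_all("pdfs", "docs", dry_run=True)` first returns the commands without running anything, which makes it easy to inspect them before starting a long batch.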

user2972081

3 Answers

You could try AbiWord, e.g.:

abiword --to=doc example.pdf
igiannak

I managed to do it by using this:

libreoffice --infilter="writer_pdf_import" --headless \
--convert-to doc:"writer_pdf_Export" Brief.pdf

It gives me the same output as @igiannak's answer.

Paulo Tomé
  • After doing this `file` detects the resultant `Brief.doc` as a pdf, and libreoffice opens it in Draw. So I don't think this really works. – naught101 Dec 12 '22 at 23:35
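
The check described in this comment can be reproduced without `file` by looking at the first bytes of the output. This is a minimal sketch; the helper name `sniff_format` is hypothetical. A PDF starts with `%PDF-`, while a legacy binary `.doc` is an OLE2 compound file that starts with the bytes `D0 CF 11 E0 A1 B1 1A E1`.

```python
PDF_MAGIC = b"%PDF-"
# Signature of the OLE2 compound file container used by legacy .doc files
OLE2_MAGIC = bytes.fromhex("d0cf11e0a1b11ae1")


def sniff_format(path):
    """Guess the container format from the first bytes of the file."""
    with open(path, "rb") as handle:
        head = handle.read(8)
    if head.startswith(PDF_MAGIC):
        return "pdf"
    if head.startswith(OLE2_MAGIC):
        return "ole2"
    return "unknown"
```

If `sniff_format("Brief.doc")` returns `"pdf"`, the file is a PDF with a `.doc` extension rather than a Word document.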

Is there a direct command-line tool for PDF to DOCX conversion that includes the images present in the PDF? I tried the libreoffice and soffice commands, but they produced only simple formatted text. Is there a library like the pywin32 COM client available on Linux/Ubuntu for PDF to Word conversion?

import os
import sys

import comtypes.client

# Word's WdSaveFormat constant for PDF output
wdFormatPDF = 17

def covx_to_pdf(infile, outfile):
    """Convert a Word .docx to PDF"""
    # Drive the installed Word application over COM (Windows only)
    word = comtypes.client.CreateObject('Word.Application')
    doc = word.Documents.Open(infile)
    doc.SaveAs(outfile, FileFormat=wdFormatPDF)
    doc.Close()
    word.Quit()

This code works on a Windows machine, but the comtypes package does not support Linux/Debian platforms. Is there a suggestion for an equivalent implementation on Linux/Debian for PDF to Word conversion?
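
One possible Linux counterpart is to call LibreOffice from Python instead of driving Word over COM. The following is a minimal sketch, assuming `soffice` is on the PATH. The function names `pdf_to_docx_command` and `pdf_to_docx` are hypothetical. It uses the `writer_pdf_import` input filter mentioned in the thread and LibreOffice's `MS Word 2007 XML` export filter for `.docx`. Whether the images survive depends on the PDF and the LibreOffice version, so treat the result as something to check rather than a guarantee.

```python
import subprocess
from pathlib import Path


def pdf_to_docx_command(infile, outdir):
    """Build the soffice call that opens a PDF in Writer and saves .docx."""
    return [
        "soffice",
        "--headless",
        # Import the PDF with the Writer filter instead of opening it in Draw
        "--infilter=writer_pdf_import",
        "--convert-to", "docx:MS Word 2007 XML",
        "--outdir", str(outdir),
        str(infile),
    ]


def pdf_to_docx(infile, outdir="."):
    """Run the conversion and return the expected output path."""
    subprocess.run(pdf_to_docx_command(infile, outdir), check=True)
    return Path(outdir) / (Path(infile).stem + ".docx")
```

Building the argument list separately from running it keeps quoting out of the picture, since no shell is involved, and lets the command be printed or logged before a large batch.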

  • If you have a new question, please ask it by clicking the [Ask Question](https://unix.stackexchange.com/questions/ask) button. Include a link to this question if it helps provide context. - [From Review](/review/late-answers/447651) – AdminBee Jun 23 '23 at 10:03
  • @chakka krishnamurty Would you provide a piece of code then to share with us? – Nepumuk Jun 23 '23 at 19:09