Extract Part of a Single PDF Page from Bash

Question

To extract a part of a PDF page on a GNU/Linux machine I use the following command:

gs -sDEVICE=pdfwrite -o out.pdf -g2300x2300 input.pdf

The -g...x... option lets me choose coordinates on the input PDF. So, here is my question:

How do I shift the coordinates so that any rectangle on the input PDF might be chosen?

and extending that question:

Is there any graphical interface that allows choosing the coordinates I want? (so far it's trial and error.)

I do not want to extract whole pages from the input PDF.

The output format should again be PDF. I am not looking for extraction of text or images.

A similar question has been asked on askubuntu.com, but the answers only deal with extracting whole pages or page ranges. I know I can do that with pdftk.

A yet more specific question similar to this one was asked here before, but remained unanswered.

On a Mac this whole affair is absolutely simple: the program Preview has a function for exactly that. How do I snapshot a part of a single PDF page to output format PDF?

You can do this in a UI by importing the PDF into Inkscape. – Richard Dec 29 '21 at 16:45

score 0 · Answer 1 · answered Jan 27 '19 at 11:46

You could give pdfjam a try, which accepts parameters like --trim '1cm 2cm 1cm 2cm' --clip true (and more parameters the LaTeX package 'pdfpages' has) "to trim those amounts from left, bottom, right and top, respectively, of input pages", like the '--help' output for the program states.
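
If the rectangle you want is known in page coordinates, the four trim amounts can be computed instead of guessed. The following is only a sketch: `trim_for_rect` is a hypothetical helper name, the page size of 595 x 842 points (A4) and the rectangle are made-up example values, and it assumes whole-number coordinates in PostScript points (written `bp` in TeX dimensions) with the origin at the bottom-left corner of the page.

```shell
# Hypothetical helper (not part of pdfjam): convert a rectangle into the
# "left bottom right top" amounts that --trim expects.
# Assumptions: integer values in PostScript points ("bp"), origin at the
# bottom-left corner of the page.
# Usage: trim_for_rect PAGE_W PAGE_H X Y W H
trim_for_rect() {
    page_w=$1 page_h=$2 x=$3 y=$4 w=$5 h=$6
    left=$x                      # distance from the left page edge
    bottom=$y                    # distance from the bottom page edge
    right=$((page_w - x - w))    # what remains to the right of the rectangle
    top=$((page_h - y - h))      # what remains above the rectangle
    printf '%sbp %sbp %sbp %sbp\n' "$left" "$bottom" "$right" "$top"
}

# Example values: a 300 x 400 pt rectangle at (100, 200) on an A4 page.
trim_for_rect 595 842 100 200 300 400   # prints: 100bp 200bp 195bp 242bp

# The result could then be passed on, for example:
#   pdfjam --trim "$(trim_for_rect 595 842 100 200 300 400)" --clip true \
#          --outfile out.pdf input.pdf
```

This only does the arithmetic. Finding the rectangle in the first place, and choosing the paper size of the output, are not covered by it.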

Jaleks


and then trial and error until I get the margins right? – fborchers Jan 27 '19 at 15:19

unfortunately, yes. By the way: here is a ghostscript answer: https://askubuntu.com/a/592206 – Jaleks Jan 27 '19 at 20:03
@jaleks, I tried the solution you linked to, but it didn't crop the image when I tried it on a PDF containing SVG graphics. One useful thing I did find out reading `info gv` is that there is a config file which, if you set the `GV.saveposFilename` variable, appends the current mouse coordinates to a file on each `z` keypress. – bu5hman Jan 29 '19 at 18:25
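
For reference, a commonly used Ghostscript pattern for this kind of crop fixes the output media to the size of the rectangle and shifts the page content with `PageOffset`. The sketch below only prints such a command so it can be inspected before running. `gs_crop_cmd` is a hypothetical helper name, the numbers are example values, and it assumes coordinates in points measured from the bottom-left corner and file names without spaces. Whether the result is actually cropped may depend on the input file, as the comment above reports.

```shell
# Hypothetical helper: print a Ghostscript command that keeps the rectangle
# with lower-left corner (X, Y) and size W x H, all in points.
# Assumption: file names contain no spaces or shell metacharacters.
# Usage: gs_crop_cmd X Y W H INPUT OUTPUT
gs_crop_cmd() {
    x=$1 y=$2 w=$3 h=$4 src=$5 dst=$6
    printf 'gs -o %s -sDEVICE=pdfwrite' "$dst"
    # Fix the output page to the size of the wanted rectangle.
    printf ' -dDEVICEWIDTHPOINTS=%s -dDEVICEHEIGHTPOINTS=%s -dFIXEDMEDIA' "$w" "$h"
    # Shift the content left and down so the rectangle lands on that page.
    printf ' -c "<</PageOffset [-%s -%s]>> setpagedevice" -f %s\n' "$x" "$y" "$src"
}

# Example values: a 300 x 400 pt rectangle at (100, 200).
gs_crop_cmd 100 200 300 400 input.pdf out.pdf
```

Printing the command rather than running it keeps the trial and error cheap: adjust the four numbers, check the line, then paste it into the shell.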

score -1 · Answer 2 · answered Jan 27 '19 at 13:08

I use document snippets a lot but I haven't seen a GUI way to extract pdf snippets directly.

That said, a precise snippet selection can be made in Okular or with Spectacle and the result saved as PNG, which I know you said you didn't want, but you can get back to PDFs if you run this in the directory you saved the snippets to:

for i in *.png; do convert "$i" "${i%.*}.pdf"; done

The question you referenced is not so much about just 'grabbing a piece of a document' as about reverse engineering curves without having the base point/plot data. That is a different animal from the question you phrased.

bu5hman


`Okular` seems a good choice to display the size of the rectangle. However, converting to `png` pixelizes the image. I do want the vector graphics. – fborchers Jan 27 '19 at 15:17
That's true, though I usually dodge the resolution issue by enlarging the image on the screen, and since most of my work is processed scans I am never going to get VG quality out. So you are saying 'Preview' effectively takes a viewport from a VG rendering? Interesting. – bu5hman Jan 27 '19 at 17:02
The long way round: you can crop in [Inkscape](https://inkscapetutorials.wordpress.com/2014/04/22/inkscape-faq-how-do-i-crop-in-inkscape/) and print the cropped image to PDF. Once you open the new PDF you can manipulate the viewported SVG or copy it into another PDF. – bu5hman Jan 27 '19 at 18:32
