How can I merge pdf files so that each file begins on an odd page number?

Question

I need to merge a few dozen pdfs, and I want all of the input pdfs to start on an odd page in the output pdf.

Example: A.pdf has 3 pages, B.pdf has 4 pages. I don't want my output to have 7 pages. What I want is an 8-page pdf in which pages 1-3 are from A.pdf, page 4 is empty, and pages 5-8 are from B.pdf. How can I do this?

I know about pdftk, but I didn't find such an option in the man page.

score 7 · Accepted Answer · edited Dec 10 '14 at 11:43

The pyPdf library makes this sort of thing easy if you're willing to write a bit of Python. Save the code below in a script called pdf-cat-even (or whatever you like), make it executable (chmod +x pdf-cat-even), and run it as a filter (./pdf-cat-even a.pdf b.pdf >concatenated.pdf). You need pyPdf ≥1.13 for the addBlankPage method.

#!/usr/bin/env python
import sys
from pyPdf import PdfFileWriter, PdfFileReader
output = PdfFileWriter()
output_page_number = 0
alignment = 2           # pad each file to an even page count, so the next file starts on an odd page
for filename in sys.argv[1:]:
    # This code is executed for every file in turn
    # (PDF is a binary format, so open the file in binary mode)
    reader = PdfFileReader(open(filename, "rb"))
    for i in range(reader.getNumPages()):
        # This code is executed for every input page in turn
        output.addPage(reader.getPage(i))
        output_page_number += 1
    # Append blank pages until the running page count is a multiple of alignment
    while output_page_number % alignment != 0:
        output.addBlankPage()
        output_page_number += 1
output.write(sys.stdout)
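
To check the padding arithmetic without touching any PDF, here is a minimal sketch that only works on page counts. The function name `layout` is made up for illustration. It mirrors the loop above, including the fact that the last file also gets padded.

```python
def layout(page_counts, alignment=2):
    """Return ([(start_page, blanks_after), ...], total_pages) for the inputs."""
    plan = []
    total = 0
    for count in page_counts:
        start = total + 1                 # 1-based page where this file begins
        total += count
        blanks = -total % alignment       # pages needed to reach a multiple of alignment
        total += blanks
        plan.append((start, blanks))
    return plan, total


# The example from the question: A.pdf has 3 pages, B.pdf has 4 pages.
print(layout([3, 4]))  # ([(1, 1), (5, 0)], 8)
```

So A.pdf starts on page 1 and is followed by one blank page, B.pdf starts on page 5, and the result has 8 pages.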

edited Dec 10 '14 at 11:43

muru

answered Feb 28 '13 at 20:14

Gilles 'SO- stop being evil'

Thanks, this worked for me! As I prefer to read the names of the pdfs from a file, I've modified your code slightly and posted it as a [separate answer](http://unix.stackexchange.com/a/66543/32950). – Jan Warchoł Mar 01 '13 at 12:28
@JanekWarchol If your file names don't contain shell special characters such as whitespace: `./pdf-cat-even $(cat list-of-file-names.txt) >concatenated.pdf` – Gilles 'SO- stop being evil' Mar 01 '13 at 12:53
Unfortunately they do contain whitespace. But thanks nevertheless - I didn't realize it could be done this way. – Jan Warchoł Mar 02 '13 at 15:23
@JanekWarchol Then you can use `concatenated.pdf` – Gilles 'SO- stop being evil' Mar 04 '13 at 01:33

score 3 · Answer 2 · answered Feb 28 '13 at 16:14

The first step is to produce a pdf file with a single empty page. You can do this easily with a lot of programs (LibreOffice/OpenOffice, Inkscape, (La)TeX, Scribus, etc.)

Then just include this empty page where needed:

pdftk A.pdf empty_page.pdf B.pdf output result.pdf

If you want to do this automatically with a script, you can use e.g. pdftk file.pdf dump_data | grep NumberOfPages | egrep -o '[0-9]*' to extract the page count.
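
A rough sketch of such a script in Python, assuming pdftk is on the PATH and that empty_page.pdf already exists. The function names are made up for illustration. Because the arguments are passed to pdftk as a list rather than through a shell, file names containing spaces need no quoting.

```python
import re
import subprocess


def parse_page_count(dump_data):
    """Extract the page count from the text printed by `pdftk file.pdf dump_data`."""
    match = re.search(r"^NumberOfPages:\s*(\d+)", dump_data, re.MULTILINE)
    if match is None:
        raise ValueError("no NumberOfPages line found")
    return int(match.group(1))


def page_count(path):
    """Ask pdftk for the number of pages in one file."""
    result = subprocess.run(["pdftk", path, "dump_data"],
                            capture_output=True, text=True, check=True)
    return parse_page_count(result.stdout)


def build_pdftk_args(files_with_counts, empty_page="empty_page.pdf",
                     output="result.pdf"):
    """Build the pdftk command, inserting the empty page before any file
    that would otherwise start on an even page."""
    args = ["pdftk"]
    total = 0
    for path, count in files_with_counts:
        if total % 2 == 1:          # odd page count so far: next page is even
            args.append(empty_page)
            total += 1
        args.append(path)
        total += count
    return args + ["output", output]


def merge(paths):
    """Count the pages of every input, then run the assembled pdftk command."""
    counts = [(path, page_count(path)) for path in paths]
    subprocess.run(build_pdftk_args(counts), check=True)
```

Calling `merge(["A.pdf", "B.pdf"])` with the files from the question should run the same command as shown above.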

answered Feb 28 '13 at 16:14

jofel

This feels like a bit of a hack. Though if it works, it works I suppose. – Sam Whited Feb 28 '13 at 16:18
This approach almost worked for me: I wrote a script that produced a list of pdfs with emptyPage.pdf added where necessary, but I couldn't get pdftk to correctly parse this list if the filenames contained spaces. I've tried changing the IFS value and using quotation marks, but to no avail - maybe it's pdftk's fault. Anyway, [the answer using pypdf](http://unix.stackexchange.com/a/66455/32950) worked for me. – Jan Warchoł Mar 01 '13 at 12:18
@JanekWarchol Which version of pdftk did you use? At least pdftk 1.44 and newer seem to support whitespace in filenames. – jofel Mar 09 '13 at 01:11
@jofel `pdftk --version` returns pdftk 1.44. I remember that my more-bash-savvy friends spent at least 15 minutes trying different things to get this to work and gave up. – Jan Warchoł Mar 09 '13 at 08:44

score 1 · Answer 3 · edited Mar 01 '13 at 14:31

You could also use LaTeX to do this (though I'm aware it's probably not what you want). Something like the following should work:

\documentclass{book}

\usepackage{pdfpages}

\begin{document}

\includepdf[pages=-]{A}
\cleardoublepage % Make sure we clear to an odd page
\includepdf[pages=-]{B} % This inserts all pages. Or you can specify specific pages, a range, or `{}` for a blank page

\end{document}

Note that \cleardoublepage only inserts a blank page with classes that are made for two-sided printing (e.g. book).

More options and info on pdfpages can be found on CTAN.
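
If you don't want to list every file by hand, the LaTeX source can be generated. A minimal sketch in Python follows, using the same template as above. The function name `make_tex` is made up, and file names with spaces or TeX special characters are not handled here.

```python
def make_tex(pdf_paths):
    """Return LaTeX source that includes each PDF in full, clearing to an
    odd page before every file except the first."""
    lines = [
        r"\documentclass{book}",
        r"\usepackage{pdfpages}",
        r"\begin{document}",
    ]
    for index, path in enumerate(pdf_paths):
        if index > 0:
            lines.append(r"\cleardoublepage")
        lines.append(r"\includepdf[pages=-]{" + path + "}")
    lines.append(r"\end{document}")
    return "\n".join(lines) + "\n"


print(make_tex(["A.pdf", "B.pdf"]))
```

Write the output to a .tex file and compile it with pdflatex as usual.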

edited Mar 01 '13 at 14:31

answered Feb 28 '13 at 16:22

Sam Whited

To include all pages automatically, you can use `\includepdf[pages=-]{...}`. – jofel Feb 28 '13 at 16:41
@jofel Thanks, fixed the question. I think it defaults to all pages too, I just put it in there to show that it was possible to select certain pages. – Sam Whited Feb 28 '13 at 17:51
@jofel Also, `\cleardoublepage` only inserts a blank page if you're using a class made for two sided printing. I was using article which doesn't work; I fixed it and updated the question to reflect that. – Sam Whited Feb 28 '13 at 17:56
`\includepdf` includes only the first page by default (not all pages). `\documentclass[twoside]{article}` works also. – jofel Mar 01 '13 at 00:57
From what I see I'd have to explicitly write all files that have to be included, so that's not good enough for me. But thanks anyway. – Jan Warchoł Mar 01 '13 at 12:19
Ah, I see, I was under the impression that you were doing that anyways (listing them all in command line args). While you could automate this with LaTeX easily enough, the python example is a better way of doing it anyhow, so I'll leave this as is. – Sam Whited Mar 01 '13 at 19:00

score 1 · Answer 4 · edited Apr 13 '17 at 12:36

Gilles' answer worked for me, but since I have to merge many files, it's more convenient if I can read their names from a text file. I've slightly modified Gilles' code to do just that; maybe it will help someone else:

#!/usr/bin/env python

# requires the pyPdf library, version 1.13 or above -
# its homepage is http://pybrary.net/pyPdf/
# running: ./this-script-name file-with-pdf-list > output.pdf

import sys
from pyPdf import PdfFileWriter, PdfFileReader
output = PdfFileWriter()
output_page_number = 0

# every new file should start on (n*alignment + 1)th page
# (with value 2 this means starting always on an odd page)
alignment = 2

# the list file holds one PDF file name per line; empty lines are skipped
listoffiles = [line for line in open(sys.argv[1]).read().splitlines() if line.strip()]
for filename in listoffiles:
    # This code is executed for every file in turn
    # (PDF is a binary format, so open the file in binary mode)
    reader = PdfFileReader(open(filename, "rb"))
    for i in range(reader.getNumPages()):
        # This code is executed for every input page in turn
        output.addPage(reader.getPage(i))
        output_page_number += 1
    while output_page_number % alignment != 0:
        output.addBlankPage()
        output_page_number += 1
output.write(sys.stdout)

score 0 · Answer 5 · answered Dec 07 '19 at 21:36

Here's the same code for PyPDF2 and Python 3. It takes the output file name first, followed by the input pdfs:

#!/usr/bin/env python3

# requires the PyPDF2 library, version 1.26 or above -
# (the PdfFileWriter/PdfFileReader names were removed in PyPDF2 3.0)
# its homepage is https://pythonhosted.org/PyPDF2/index.html
# running: ./this-script-name output.pdf a.pdf b.pdf ...

import sys
from PyPDF2 import PdfFileWriter, PdfFileReader
output = PdfFileWriter()
output_page_number = 0

# every new file should start on (n*alignment + 1)th page
# (with value 2 this means starting always on an odd page)
alignment = 2

for filename in sys.argv[2:]:
    # This code is executed for every file in turn
    reader = PdfFileReader(open(filename, "rb"))
    output.appendPagesFromReader(reader)
    output_page_number += reader.getNumPages()

    while output_page_number % alignment != 0:
        output.addBlankPage()
        output_page_number += 1

# write the result and close the output file properly
with open(sys.argv[1], "wb") as outfile:
    output.write(outfile)
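
PyPDF2 has since been continued under the name pypdf, with renamed classes and methods. Below is a rough equivalent, assuming a recent pypdf release is installed. The function names `blanks_needed` and `merge_aligned` are made up for illustration and I haven't tested this against every pypdf version.

```python
def blanks_needed(page_count, alignment=2):
    """Number of blank pages that bring page_count up to a multiple of alignment."""
    return -page_count % alignment


def merge_aligned(output_path, input_paths, alignment=2):
    """Concatenate the inputs so that every file starts on page n*alignment + 1."""
    # third-party dependency, imported here so blanks_needed works without it
    from pypdf import PdfWriter

    writer = PdfWriter()
    for path in input_paths:
        writer.append(path)                      # copy all pages of this file
        for _ in range(blanks_needed(len(writer.pages), alignment)):
            writer.add_blank_page()              # same size as the last page
    writer.write(output_path)
```

For the example from the question, `merge_aligned("output.pdf", ["A.pdf", "B.pdf"])` should give the 8-page result.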
