You can use PyPDF2's PdfFileMerger class.
You can simply concatenate files by using the append method.
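from PyPDF2 import PdfFileMerger

pdfs = ['file1.pdf', 'file2.pdf', 'file3.pdf', 'file4.pdf']
merger = PdfFileMerger()
for pdf in pdfs:
    merger.append(pdf)  # add every page of this file to the end of the output
merger.write("result.pdf")  # write the combined document
merger.close()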
You can pass file handles instead of file paths if you want.
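As a minimal sketch of the file-handle variant, the helper below opens each input in binary mode and passes the handle to append. The name merge_from_handles is made up for illustration, and keeping every handle open until the output has been written is a cautious assumption rather than a documented requirement.

```python
from contextlib import ExitStack


def merge_from_handles(paths, output_path, merger):
    """Append each file in `paths` to `merger` via an open binary handle.

    `merger` is expected to be a PyPDF2 PdfFileMerger, or any object with
    the same append/write methods. The helper name is hypothetical.
    """
    with ExitStack() as stack:
        for path in paths:
            # Keep each handle open until the output has been written
            # (a conservative assumption, not a documented requirement).
            handle = stack.enter_context(open(path, "rb"))
            merger.append(handle)
        merger.write(output_path)
    # All input handles are closed here; the caller still closes the merger.
```

A call might look like merge_from_handles(pdfs, "result.pdf", merger), followed by merger.close().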
If you want more fine-grained control of merging, there is a merge method on PdfFileMerger (named PdfMerger in newer PyPDF2 releases), which allows you to specify an insertion point in the output file, meaning you can insert the pages anywhere in the file. The append method can be thought of as a merge where the insertion point is the end of the file.
merger.merge(2, pdf)
Here we insert the whole PDF into the output, but at page 2.
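To make the relationship between merge and append concrete, here is a toy model that uses plain Python lists in place of PDF pages. It is not PyPDF2 code: merge_at and append_pages are hypothetical names, and treating the insertion point as a zero-based list index is an assumption about how the position is interpreted.

```python
def merge_at(output_pages, position, new_pages):
    """Return a new list with `new_pages` inserted at index `position`."""
    merged = list(output_pages)
    merged[position:position] = new_pages  # slice assignment inserts the pages
    return merged


def append_pages(output_pages, new_pages):
    """Appending is the same as merging at the end of the output."""
    return merge_at(output_pages, len(output_pages), new_pages)


output = ["A1", "A2", "A3"]
print(merge_at(output, 2, ["B1", "B2"]))   # ['A1', 'A2', 'B1', 'B2', 'A3']
print(append_pages(output, ["B1", "B2"]))  # ['A1', 'A2', 'A3', 'B1', 'B2']
```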
If you wish to control which pages are appended from a particular file, you can use the pages keyword argument of append and merge, passing a tuple in the form (start, stop[, step]) (like the regular range function).
merger.append(pdf, pages=(0, 3))  # first 3 pages
merger.append(pdf, pages=(0, 6, 2))  # pages 1, 3 and 5
If you specify an invalid range, you will get an error.
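Since the pages tuple follows the same convention as range, a quick way to check which pages a tuple selects is to expand it with range itself. This is only a sketch of the indexing convention; selected_pages is a hypothetical helper and does not call PyPDF2.

```python
def selected_pages(pages):
    """Return the zero-based page indices selected by (start, stop[, step])."""
    return list(range(*pages))


print(selected_pages((0, 3)))     # [0, 1, 2]  -> pages 1, 2 and 3
print(selected_pages((0, 6, 2)))  # [0, 2, 4]  -> pages 1, 3 and 5
```

The stop value is exclusive, so (0, 3) covers the first three pages and not four.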
Note also that, to avoid files being left open, PdfFileMerger's close method should be called once the merged file has been written. This ensures all files (input and output) are closed in a timely manner. It's a shame that PdfFileMerger isn't implemented as a context manager, so that we could use the with keyword, avoid the explicit close call and get some easy exception safety.
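One possible workaround is the standard library's contextlib.closing, which wraps any object that has a close method so it can be used in a with statement. The sketch below assumes a merger object with append, write and close methods; merge_pdfs and its merger_factory parameter are hypothetical names introduced for illustration.

```python
from contextlib import closing


def merge_pdfs(paths, output_path, merger_factory):
    """Concatenate `paths` into `output_path`, always closing the merger.

    `merger_factory` is any callable returning an object with append, write
    and close methods, for example PyPDF2's PdfFileMerger class.
    """
    with closing(merger_factory()) as merger:
        for path in paths:
            merger.append(path)
        merger.write(output_path)
    # closing() has called merger.close() here, even if an exception was raised.


# Possible usage (requires PyPDF2 and existing input files):
# from PyPDF2 import PdfFileMerger
# merge_pdfs(['file1.pdf', 'file2.pdf'], 'result.pdf', PdfFileMerger)
```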
You might also want to look at the pdfcat script provided as part of PyPDF2. You can potentially avoid the need to write code altogether.
The PyPDF2 GitHub repository also includes some example code demonstrating merging.
Another library perhaps worth a look is PyMuPDF. Merging is equally simple.
From the command line:
python -m fitz join -o result.pdf file1.pdf file2.pdf file3.pdf
And from code:
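import fitz  # PyMuPDF

result = fitz.open()  # new, empty PDF
for pdf in ['file1.pdf', 'file2.pdf', 'file3.pdf']:
    with fitz.open(pdf) as mfile:
        result.insert_pdf(mfile)  # append all pages of this file
result.save("result.pdf")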
There are plenty of options, detailed in the project's wiki.