PyPDF2 is a Python library for working with PDF files. It allows you to perform various operations on PDF documents, such as reading, merging, splitting, and modifying PDFs. Here are some common tasks you can perform using PyPDF2:
- Reading PDFs: You can use PyPDF2 to extract text and metadata from PDF files.
- Merging PDFs: PyPDF2 allows you to combine multiple PDF files into a single PDF document.
- Splitting PDFs: You can split a PDF into multiple smaller PDFs or extract specific pages.
- Rotating and cropping pages: PyPDF2 provides functionality for rotating pages and cropping page content.
- Adding watermarks and annotations: You can overlay a page from another PDF as a watermark or stamp, and create annotations in PDFs using PyPDF2.
- Encrypting and decrypting PDFs: PyPDF2 supports PDF encryption and decryption for security purposes.
Here’s a simple example of how to use PyPDF2 to merge two PDF files:

import PyPDF2

# Note: PdfReader, PdfWriter, and add_page replace the older
# PdfFileReader, PdfFileWriter, and addPage names, which were removed in PyPDF2 3.0.

# Create a PdfWriter object to hold the merged PDF
pdf_writer = PyPDF2.PdfWriter()

# Open the PDF files you want to merge; the with block closes them when done
with open('file1.pdf', 'rb') as pdf1, open('file2.pdf', 'rb') as pdf2:
    # Create PdfReader objects for each PDF
    pdf_reader1 = PyPDF2.PdfReader(pdf1)
    pdf_reader2 = PyPDF2.PdfReader(pdf2)

    # Add the pages from the first PDF
    for page in pdf_reader1.pages:
        pdf_writer.add_page(page)

    # Add the pages from the second PDF
    for page in pdf_reader2.pages:
        pdf_writer.add_page(page)

    # Write the merged content to a new PDF file
    # (the input files must still be open at this point)
    with open('merged.pdf', 'wb') as merged_pdf:
        pdf_writer.write(merged_pdf)

Please note that PyPDF2 does not support every PDF feature, so it’s important to check the documentation and test your specific use case to ensure compatibility. PyPDF2 has also been succeeded by pypdf, where development of the project now continues, so it’s a good idea to look at it and other alternatives if you have complex PDF manipulation needs.